[DRAFT] generators debugging #875

piotrm0 · 2024-02-08T02:17:12Z

Debugging generator serialization issues responsible for at least two reported Issues.
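
For context, a minimal sketch of the general problem class, not the code in this PR: a Python generator (for example a streamed response) cannot be passed to `json.dumps` directly, so one common workaround is to wrap it, pass the chunks through, and serialize the collected result once the stream is exhausted. The names `stream_tokens`, `record_stream`, and `serialization_error` are hypothetical and exist only for this illustration.

```python
import json
from typing import Callable, Iterator, List


def stream_tokens() -> Iterator[str]:
    # Hypothetical stand-in for a streaming response.
    yield from ["Hello", ", ", "world"]


def record_stream(
    gen: Iterator[str], on_done: Callable[[List[str]], None]
) -> Iterator[str]:
    # Pass chunks through unchanged while keeping a copy; once the
    # wrapped generator is exhausted, hand the collected chunks to on_done.
    chunks: List[str] = []
    for chunk in gen:
        chunks.append(chunk)
        yield chunk
    on_done(chunks)


def serialization_error(obj: object) -> str:
    # Return the exception class name raised by json.dumps, or "" if none.
    try:
        json.dumps(obj)
    except TypeError as err:
        return type(err).__name__
    return ""


# Serializing the generator object itself fails.
direct_error = serialization_error(stream_tokens())

# Consuming the wrapped stream yields the same text and fills `recorded`.
recorded: List[str] = []
streamed = "".join(record_stream(stream_tokens(), recorded.extend))

# The collected chunks are plain strings and serialize normally.
serialized = json.dumps({"response": "".join(recorded)})
```

Note that the callback only fires if the consumer reads the stream to the end; a partially consumed stream would leave nothing recorded in this sketch.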

* add langchain multi-retrieval agents + chroma vector mgr example * basic one feedback function working e2e * fix deps version * add response length custom feedback func * update notebook with markdown + more feedback functions + deferred mode add markdown and colab widget remove ckp

* split off work on threading issues * work on dummy example * prototyping various thread robustness solutions * work * working on threading and feedback results * more ignores * nevermind that last gitignore addition * remove unneeded * added feedback result retrieval into langchain quickstart * don't use submit inside feedback functions * what the last thing said

* updating JSONPath, renamed to Lens * added storage of paths as strings instead of json structures * python parsing variations * more parsing variants for python 3.11 * makefile * version bounds format --------- Co-authored-by: Josh Reini <[email protected]>

* Save working code for feedback direction * Use supplied_name as key if any for direction lookup * Fix direction lookup for cell style * Use OpenAI moderation category score as is (lower score is better) * Update OpenAI moderation API test cases

* added SummEval test generator groundedness smoke test for huggingface NLI model and open ai remove extra files clean up clean up * addressed pr comments * removed sqlite * rename f_groundtruth to f_mae * link docs * fixed local path * added gpt-3.5-turbo vs gpt-4 * numeric differenc to mae * to mae, groundedness notebook * clear noisy cell output * allow messages and prompt args * answer relevance smoke tests updated * remove cot versions * re-run context relevance with mae * update function definitions docs with link, fix broken link * fix symlink * fix function definitions links * remove cot definitions * small nits to md --------- Co-authored-by: Josh Reini <[email protected]>

* Improve cot reasoning Add prompting to influence llm to tie reasons back to the evaluation being performed * Update prompts.py Add criteria and tie it back language to cot template * update docstring to include reasons template * undo * Update prompts.py revert template back to supporting evidence * reason filtering only to supporting evidence * revert docstring * small change * update extract score/reasons * remove unneeded line. extract_score/reason not used for groundedness * add higher is better, fix title for moderation notebook

* update azure example, also show provider extension * update trulens version * remove force dashboard * add score-only example * custom feedback docs * refactor for user-facing generate_score and generate_score_and_reasons methods * update azure example with more user-friendly methods * update bedrock with more user-friendly methods * user friendly methods for provider extension * types in docstring

* first * CI update * global import static test * disable py312 test * add nltk optional fix * remove unrelated * add format and python bound * nit * adjust matrix cell name for ordering * spec * disable tests when optional package not installed * adjust ci script * syntax issues * typo * test * remove more * newline * name first * comment * nit * matrix no sequences * remove duplicate matrix cell * displayname again * unexpected symbol * hmm * fix test import * optional tests marking * imports tests * few more ellipses * remove non-existant import * expected error * fixing optionals * message on subtest * add subtests requirement * fix discovered import bugs * more fixes * adjust ipython requirement * downgrade bound even more for python 3.9 * more fixes * ipython fix * remove ipykernel installation in unit tests * change pr job name * name * more informative tests name * rename optional var and renable format condition * cond try fix * nit * again --------- Co-authored-by: Josh Reini <[email protected]>

#839) Bumps [vite](https://github.com/vitejs/vite/tree/HEAD/packages/vite) from 4.4.2 to 4.5.2. - [Release notes](https://github.com/vitejs/vite/releases) - [Changelog](https://github.com/vitejs/vite/blob/v4.5.2/packages/vite/CHANGELOG.md) - [Commits](https://github.com/vitejs/vite/commits/v4.5.2/packages/vite) --- updated-dependencies: - dependency-name: vite dependency-type: direct:development ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Josh Reini <[email protected]>

* make more tests pass * small changes * more tests working * skip moderation * test for multiple models * more passing * unused e removed * fix typing issues * more cot tests * incorrect prompt * more cot reasons tests * stereotypes more extreme * improve stereotyping prompt * typo * unittest only gpt-3.5-turbo * add missing import * mark calibration as optional test * fix typo * move oai import to top[] * oai imports for all testss

* debug why static tests did not fail * check more base modules * update module hierarchy doc and delete another deprecated module * trubot is optional * don't try to run a script in static tests * Don't try to import migrations env * typo * nit * cleanup and assertion failure messages * pinecone optional message * update pinecone usage * note

review-notebook-app · 2024-02-08T02:17:18Z

Check out this pull request on ReviewNB.

See visual diffs & provide feedback on Jupyter Notebooks.

Powered by ReviewNB

joshreini1 and others added 30 commits October 6, 2023 16:39

bump version quickstarts

5c88a5e

version bump 0.15.1 (#491)

5cb5384

* version bump combine nb to docs * hotfix

first (#493)

86ab251

LLMProvider use bugfixes (#495)

7c29149

* first try * starting feedback imp tests * working on feedback tests * nits * more tests * more adjustments * disable unit tests for now * added in-domain tests variants to run for now

Update version on key notebooks (#498)

1b21016

Co-authored-by: Shayak Sen <[email protected]>

[MLNN-1020] App runner UI updates (#503)

0325309

Fix groundedness aggregation flakiness + incorrect 0 resolution. (#501)

5a10f26

* if groundedness output is not list, set as list so agg functions properly * fix 0 resolving to null then -1 bug * fix groundedness measure error, move warning from init to only renamed method

[MLNN-1053] Add groundedness to Pinecone notebook (#506)

f473da1

* Add groundedness to Pinecone notebook * Fix definition per suggestion --------- Co-authored-by: Josh Reini <[email protected]>

[MLNN-1053] Update dependencies in pinecone notebook (#507)

f21e13c

* Update dependencies for Pinecone example * Format notebook (isort & yapf)

Automated File Generation from Docs Notebook Changes (#508)

93ea345

Co-authored-by: piotrm0 <[email protected]>

handle no pii (#504)

861fc54

* handle no pii * to do on error handling * logger.debug no pii found

Update Instrumentation Overview page: fix link, add trucustom (#505)

86901c9

* fix link, add trucustom * finish update * Update basic_instrumentation.ipynb remove duplicated text

bugfixes (#510)

0259183

* fix * fixes * component fixes * make backwards compatible * remove print message * fix another old __call__ usage meaning "get" * remove dist form gitignore * include dist * typing fix

dashboard appui quickstart (#511)

ffaf28e

* creating quickstart notebook for appui * fixing some runner bugs and adding info to quickstart * add screenshots to repo * clear output * unneeded try

fix (#513)

4600159

Release branch 0.16.0 (#514)

cacdbff

* version bump to 0.16.0 * combine nb to docs

example appui in quickstart

affab95

Revert "Release branch 0.16.0 (#514)"

ef8297a

This reverts commit cacdbff.

Revert "Revert "Release branch 0.16.0 (#514)""

0f3e1e0

This reverts commit ef8297a.

Revert "example appui in quickstart"

6bde633

This reverts commit affab95.

dynamic leaderboard (#523)

3b70a57

dedup (#517)

3d3ad17

* dedup * delete dups * another try * make one assets and images folder instead of two --------- Co-authored-by: Josh Reini <[email protected]>

Automated File Generation from Docs Notebook Changes (#526)

8c875d3

Co-authored-by: joshreini1 <[email protected]>

Update use_cases_production.md (#516)

550cd9c

* Update use_cases_production.md update wrong descriptions for azure/aws * switch to new deferred example

piotrm0 and others added 26 commits January 31, 2024 21:29

few more ellipses (#843)

c18afbc

TruLens-Eval v0.22.0 release (#851)

7eea89d

* bump * bump

Automated File Generation from Docs Notebook Changes (#853)

5e55846

Co-authored-by: joshreini1 <[email protected]>

Merge remote-tracking branch 'origin/main' into piotrm/deferred_mem

d4a513c

adding unit tests

6a31f4c

cleanup

121f04a

forgot typevar

7ae6c1f

add fake parameterizable Queue type

a6219e0

more unit tests

ef92824

more Tru class testing

8a186e7

fix bad merge

1b4d784

nits

4fb15b7

Randomly run evals based on record_id hash (#850)

29d93a8

* prototype * update md * fix run_feedback * update * update * update

fix typo (#857)

9f917ed

change llama test example

910fd4c

Merge remote-tracking branch 'origin/main' into piotrm/deferred_mem

b4b9df9

Fix typo and adjust some debug printouts. (#866)

99d747c

nits and move chain tests to optional requirements

1e2ed16

nits

e55cafa

Merge remote-tracking branch 'origin/main' into piotrm/deferred_mem

73e9ce8

debugging generators (streaming=true) in llama_index

c87c99f

piotrm0 marked this pull request as draft February 8, 2024 02:17

remove print

b4634da

sfc-gh-dhuang force-pushed the main branch from 7e0fbf3 to 26848c6 June 29, 2024 04:53
