[DRAFT] generators debugging #875
Draft
piotrm0 wants to merge 454 commits into main from piotrm/generators_debugging
* version bump combine nb to docs * hotfix
* first try * starting feedback imp tests * working on feedback tests * nits * more tests * more adjustments * disable unit tests for now * added in-domain tests variants to run for now
Co-authored-by: Shayak Sen <[email protected]>
* if groundedness output is not list, set as list so agg functions properly * fix 0 resolving to null then -1 bug * fix groundedness measure error, move warning from init to only renamed method
* Add groundedness to Pinecone notebook * Fix definition per suggestion --------- Co-authored-by: Josh Reini <[email protected]>
* add langchain multi-retrieval agents + chroma vector mgr example * basic one feedback function working e2e * fix deps version * add response length custom feedback func * update notebook with markdown + more feedback functions + deferred mode add markdown and colab widget remove ckp
* Update dependencies for Pinecone example * Format notebook (isort & yapf)
* split off work on threading issues * work on dummy example * prototyping various thread robustness solutions * work * working on threading and feedback results * more ignores * nevermind that last gitignore addition * remove unneeded * added feedback result retrieval into langchain quickstart * don't use submit inside feedback functions * what the last thing said
Co-authored-by: piotrm0 <[email protected]>
* updating JSONPath, renamed to Lens * added storage of paths as strings instead of json structures * python parsing variations * more parsing variants for python 3.11 * makefile * version bounds format --------- Co-authored-by: Josh Reini <[email protected]>
* handle no pii * to do on error handling * logger.debug no pii found
* fix link, add trucustom * finish update * Update basic_instrumentation.ipynb remove duplicated text
* fix * fixes * component fixes * make backwards compatible * remove print message * fix another old __call__ usage meaning "get" * remove dist form gitignore * include dist * typing fix
* creating quickstart notebook for appui * fixing some runner bugs and adding info to quickstart * add screenshots to repo * clear output * unneeded try
* version bump to 0.16.0 * combine nb to docs
This reverts commit cacdbff.
This reverts commit ef8297a.
This reverts commit affab95.
* dedup * delete dups * another try * make one assets and images folder instead of two --------- Co-authored-by: Josh Reini <[email protected]>
Co-authored-by: joshreini1 <[email protected]>
* Update use_cases_production.md update wrong descriptions for azure/aws * switch to new deferred example
* Save working code for feedback direction * Use supplied_name as key if any for direction lookup * Fix direction lookup for cell style * Use OpenAI moderation category score as is (lower score is better) * Update OpenAI moderation API test cases
* added SummEval test generator groundedness smoke test for huggingface NLI model and open ai remove extra files clean up clean up * addressed pr comments * removed sqlite * rename f_groundtruth to f_mae * link docs * fixed local path * added gpt-3.5-turbo vs gpt-4 * numeric differenc to mae * to mae, groundedness notebook * clear noisy cell output * allow messages and prompt args * answer relevance smoke tests updated * remove cot versions * re-run context relevance with mae * update function definitions docs with link, fix broken link * fix symlink * fix function definitions links * remove cot definitions * small nits to md --------- Co-authored-by: Josh Reini <[email protected]>
* Improve cot reasoning Add prompting to influence llm to tie reasons back to the evaluation being performed * Update prompts.py Add criteria and tie it back language to cot template * update docstring to include reasons template * undo * Update prompts.py revert template back to supporting evidence * reason filtering only to supporting evidence * revert docstring * small change * update extract score/reasons * remove unneeded line. extract_score/reason not used for groundedness * add higher is better, fix title for moderation notebook
* update azure example, also show provider extension * update trulens version * remove force dashboard * add score-only example * custom feedback docs * refactor for user-facing generate_score and generate_score_and_reasons methods * update azure example with more user-friendly methods * update bedrock with more user-friendly methods * user friendly methods for provider extension * types in docstring
* first * CI update * global import static test * disable py312 test * add nltk optional fix * remove unrelated * add format and python bound * nit * adjust matrix cell name for ordering * spec * disable tests when optional package not installed * adjust ci script * syntax issues * typo * test * remove more * newline * name first * comment * nit * matrix no sequences * remove duplicate matrix cell * displayname again * unexpected symbol * hmm * fix test import * optional tests marking * imports tests * few more ellipses * remove non-existant import * expected error * fixing optionals * message on subtest * add subtests requirement * fix discovered import bugs * more fixes * adjust ipython requirement * downgrade bound even more for python 3.9 * more fixes * ipython fix * remove ipykernel installation in unit tests * change pr job name * name * more informative tests name * rename optional var and renable format condition * cond try fix * nit * again --------- Co-authored-by: Josh Reini <[email protected]>
#839) Bumps [vite](https://github.com/vitejs/vite/tree/HEAD/packages/vite) from 4.4.2 to 4.5.2. - [Release notes](https://github.com/vitejs/vite/releases) - [Changelog](https://github.com/vitejs/vite/blob/v4.5.2/packages/vite/CHANGELOG.md) - [Commits](https://github.com/vitejs/vite/commits/v4.5.2/packages/vite) --- updated-dependencies: - dependency-name: vite dependency-type: direct:development ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Josh Reini <[email protected]>
* make more tests pass * small changes * more tests working * skip moderation * test for multiple models * more passing * unused e removed * fix typing issues * more cot tests * incorrect prompt * more cot reasons tests * stereotypes more extreme * improve stereotyping prompt * typo * unittest only gpt-3.5-turbo * add missing import * mark calibration as optional test * fix typo * move oai import to top[] * oai imports for all testss
* bump * bump
Co-authored-by: joshreini1 <[email protected]>
* debug why static tests did not fail * check more base modules * update module hierarchy doc and delete another deprecated module * trubot is optional * don't try to run a script in static tests * Don't try to import migrations env * typo * nit * cleanup and assertion failure messages * pinecone optional message * update pinecone usage * note
* prototype * update md * fix run_feedback * update * update * update
Debugging generator serialization issues, which are responsible for at least two reported issues.
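For readers unfamiliar with this class of problem, here is a minimal sketch of how serializing a generator can go wrong in plain Python. It uses only the standard library. The names `stream_tokens` and `try_serialize` are hypothetical and are not taken from this PR or from the project's code. The sketch only illustrates general Python behavior, not the specific bugs being debugged here.

```python
import json


def stream_tokens():
    """Hypothetical stand-in for an app that streams its output."""
    yield "Hello"
    yield " world"


def try_serialize(value):
    """Return the JSON text for value, or an error string if json.dumps fails."""
    try:
        return json.dumps(value)
    except TypeError as err:
        return f"error: {err}"


# A generator object is not JSON serializable as-is.
direct = try_serialize(stream_tokens())

# Draining the generator into a list makes the contents serializable,
# but the original generator is then exhausted for any later consumer.
gen = stream_tokens()
drained = list(gen)
serialized = try_serialize(drained)
leftover = list(gen)

print(direct)
print(serialized)
print(leftover)
```

The second half shows the trade-off a recorder faces: consuming a generator to record its contents leaves nothing for the caller that was supposed to receive the stream.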