Refined prompt construction for feedback #1058
Draft
piotrm0 wants to merge 529 commits into main from piotrm/llm_feedback_options
* first * working version, maybe * note * fixes * note * nit
Co-authored-by: Piotr Mardziel <[email protected]>
* remove extra reset cell * fix langchain prompt import, quickstart * async fix imports, update install pins * fix langchain prompttemplate imports * ada embeddings in quickstart, pinned install versions * pin package versions * clear output
Co-authored-by: joshreini1 <[email protected]>
Co-authored-by: Josh Reini <[email protected]>
* fix quickstart imports * fix langchain trulens imports
Co-authored-by: joshreini1 <[email protected]>
Co-authored-by: Josh Reini <[email protected]>
* move model comparison to use cases (expected location) * multimodal example with trullama * remove extra commit
Co-authored-by: Shayak Sen <[email protected]> Co-authored-by: Josh Reini <[email protected]>
Co-authored-by: Josh Reini <[email protected]>
Co-authored-by: Josh Reini <[email protected]>
* version bump quickstarts * version bump py quickstarts * version bump all_tools * format quickstarts * version bump init * update package one-liner
* MLNN-1128 * feedback
* added wrapper for dynamically generated functions in boto3 * docs and remove debug prints * typo * remove unused * testing out streaming counting
not_toxic -> toxic, fix docstring. code itself is correct.
Co-authored-by: Shayak Sen <[email protected]>
* add generator for answer relevance using SummEval save wip work test (#627) wip * remove sqlit * groundedness eval across 100 examples * rm * Update groundedness_smoke_tests.ipynb fix typo * typo --------- Co-authored-by: Josh Reini <[email protected]>
* add langchain prompt template * add langchain template to _langchain_evaluate * pass through criteria, use standard cot reasons template
* change assertion from dict to object * get model, usage as attr not from dict
Co-authored-by: Daniel <[email protected]>
Co-authored-by: Josh Reini <[email protected]>
* fix: italicise TruLens-Eval ref * fix: italicise TruLens-Eval ref in root scripts. * docs: add contribution instructions for proper names with mod to inverted commas. * Update standards.md Markdown lint prefers _ to * for emphasis. --------- Co-authored-by: Josh Reini <[email protected]> Co-authored-by: Piotr Mardziel <[email protected]>
* working on glossary * finish glossary draft * nits * Add some info regarding makefiles. --------- Co-authored-by: Josh Reini <[email protected]>
* more pipelines docs * adjust trigger for release tests * one more time * one more time * again * one more * one more try * nit * add a docs pipeline --------- Co-authored-by: Josh Reini <[email protected]>
Co-authored-by: piotrm0 <[email protected]> Co-authored-by: Josh Reini <[email protected]>
* Add if_missing. * add new enum to docs feedbacks page * make re_0_10 rating a bit more robust * adjust rating extraction test * check for integers only and remove unneeded imports
* fix image * feedback_function index updates * implementation and provider docs * feedback implementations llm-based * classification implementations * feedback base provider docstrings * formatting of numbered lists * more example admonitions * tru custom app docs * instrumentation api docs * virtual app api ref * add missing title
* fix some proper names * nits * too many, giving up * remove _ from mkdocs * llama indexes --------- Co-authored-by: Josh Reini <[email protected]>
* Spell fix * Added user feedback button to the sidebar * Updated share feedback text
* pin packaging * remove packaging, remove base langchain * remove langchain requirement * update comment * move nltk to required * nltk required, download punkt on init * add packaging requirement * move punkt download * bump langchain version * pin packaging 23.2 * logger debug for optional packages
* Fix import and favicon * Update requirements.txt --------- Co-authored-by: Josh Reini <[email protected]>
* removed pkg_resources * add reqs * remove duplicate * preserve note from duplicate * format * fix for py3.8 * format * nit * remove distutils as well and add notes * notes * nits * fix static_resource for py38 again
…val utils, and docs update (#991) * implement recommendation metrics for benchmark framework ece fix Revert "ece fix" This reverts commit c58ee7e. run actual evals add context relevance inference api to hugs ffs fmt larger dataset + smarter backoff + recall nb update (wip) fix how we handle ties in precision and recall saving results for GPT-3.5, GPT-4, Claude-1, and Claude-2 remove secrets * finished evals with truera context relevance model * add Verb 2S top 1 prompt * update ECE method pushed to server * save csv results for tmp scaling * save * implement meeting bank generator * example notebook for comprehensiveness benchmark WIP * gi# This is a combination of 2 commits. gainsight benchmarking done remove secrets * prepping comprehensiveness benchmark notebook * remove unused test script * moving results csvs * updates models * intermediate results code change * good stopping point * cleanup * symlink docs * huge doc updates * fix doc symlink * fix score range in docstring * add docstring for truera's context relevance model * update comprehensiveness notebook * update comprehensiveness notebook * fix * file renames * new symlinks * update mkdcos --------- Co-authored-by: Josh Reini <[email protected]> Co-authored-by: Josh Reini <[email protected]>
* atlas quickstart * header updates
* first * assistants api (rag) quickstart * fix indent
dosubot (bot) added the size:L label (This PR changes 100-499 lines, ignoring generated files.) on Apr 8, 2024
Items to add to release announcement:
Other details that are good to know but need not be announced:
Work in progress. This PR designs feedback prompts from several common parts and allows prompts of different sizes depending on the allowable space:
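The list of parts that followed is not reproduced here. As a rough illustration of the idea only, and not the PR's actual implementation, the sketch below assembles a prompt from reusable parts and falls back to shorter variants when space is tight. `PromptPart`, `build_prompt`, the part wording, and the use of a character count as a stand-in for a token budget are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PromptPart:
    """One reusable piece of a feedback prompt (hypothetical structure)."""
    name: str
    long: str               # full wording, used when space allows
    short: str              # terse wording, used when space is tight
    required: bool = True   # optional parts may be dropped entirely


def build_prompt(parts: List[PromptPart], max_chars: int) -> str:
    """Return the largest prompt variant that fits within max_chars.

    Variants are tried from largest to smallest: all long wordings,
    all short wordings, then short wordings of required parts only.
    Character count is an assumed proxy for a model's token budget.
    """
    candidates = [
        [p.long for p in parts],
        [p.short for p in parts],
        [p.short for p in parts if p.required],
    ]
    for texts in candidates:
        prompt = "\n".join(texts)
        if len(prompt) <= max_chars:
            return prompt
    raise ValueError(f"no prompt variant fits in {max_chars} characters")


# Illustrative parts; the wording is invented for this example.
parts = [
    PromptPart(
        name="task",
        long="Rate how relevant the RESPONSE is to the PROMPT.",
        short="Rate relevance.",
    ),
    PromptPart(
        name="scale",
        long="Answer with an integer from 0 (worst) to 10 (best).",
        short="Answer 0-10.",
    ),
    PromptPart(
        name="reasons",
        long="Explain your reasoning step by step before giving the score.",
        short="Give reasons.",
        required=False,
    ),
]

print(build_prompt(parts, max_chars=1000))  # all long wordings
print(build_prompt(parts, max_chars=30))    # short wordings, required parts only
```

Trying variants from largest to smallest keeps the most detailed instructions whenever they fit, and marking a part as not required lets it be dropped before the call fails outright. A real implementation would likely measure tokens with the target model's tokenizer instead of characters.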