Corrections on required field specification for faithfulness and correctness
andrew-lastmile authored Jan 8, 2025
1 parent 6699766 commit f0ab9cc
Showing 1 changed file with 4 additions and 4 deletions.
8 changes: 4 additions & 4 deletions website/docs/autoeval/metrics.mdx
@@ -18,7 +18,7 @@ All evaluators work on some combination of the following properties:
- `output`: The response generated by the application (e.g. LLM generation)
- `ground_truth`: Factual data, either the ideal correct response, or context used to generate the output (e.g. data retrieved from a vector DB)

-Running evals is possible both from the [Model Console](https://lastmileai.dev/models) dashboard as well as the [API](/sdk).
+Running evals is possible both from the [Model Console](https://lastmileai.dev/models) dashboard as well as the [API/SDKs](/sdk).

You can run a one-off evaluation from the model playground. Open any metric in the [Model Console](https://lastmileai.dev/models),
and click `Run Model` (the play button) to compute a score on some provided data.
@@ -38,7 +38,7 @@ The task is a Natural Language Inference (NLI) task that measures if the output
Required fields:
- `input` - e.g. user query
- `output` - LLM response
-- `ground_truth` - ideal response, or context used to generate output.
+- `ground_truth` - context used to generate output

:::tip
The LastMile faithfulness evaluator is able to identify subtle mistakes that an LLM might make.
@@ -358,7 +358,7 @@ It returns a 0->1 score which answers: _"Does the LLM-generated response correct
Required fields:
- `input` - Input prompt
- `output` - LLM response
-- `ground_truth` - context used to generate output.
+- `ground_truth` - Ideal response or correct answer

#### Usage Guide

@@ -650,4 +650,4 @@ console.table(evalResult);

:::info
Don't see a metric that perfectly fits your use case? [Design your own](/autoeval/fine-tune) with the fine-tuning service, or [get in touch](https://discord.com/invite/xBhNKTetGx)!
-:::
+:::
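
The corrected field semantics above can be summarized in a small sketch. These are hypothetical example payloads for illustration only, not actual LastMile SDK calls; the field keys (`input`, `output`, `ground_truth`) follow the docs, while the example strings and the `validate` helper are made up here:

```python
# Hypothetical payloads illustrating the corrected `ground_truth` semantics:
# faithfulness expects retrieved context, correctness expects the ideal answer.
# (Illustration only — not the actual LastMile SDK request shape.)

faithfulness_example = {
    "input": "What is the capital of France?",  # user query
    "output": "The capital of France is Paris.",  # LLM response
    # For faithfulness: context used to generate the output (e.g. vector DB chunk)
    "ground_truth": "Paris is the capital and most populous city of France.",
}

correctness_example = {
    "input": "What is the capital of France?",  # input prompt
    "output": "The capital of France is Paris.",  # LLM response
    # For correctness: ideal response or correct answer
    "ground_truth": "Paris",
}

REQUIRED_FIELDS = {"input", "output", "ground_truth"}

def validate(payload: dict) -> bool:
    """Check that all required evaluator fields are present and non-empty."""
    return REQUIRED_FIELDS <= payload.keys() and all(
        payload[k] for k in REQUIRED_FIELDS
    )

print(validate(faithfulness_example))  # True
print(validate(correctness_example))  # True
```

Both metrics take the same three required fields; only the meaning of `ground_truth` differs between them, which is exactly what this commit corrects.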
