Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(llm-observability): llm metrics and feedback #27785

Merged
merged 29 commits into from
Jan 24, 2025

Conversation

skoob13
Copy link
Contributor

@skoob13 skoob13 commented Jan 22, 2025

Problem

LLM Observability should provide functionality to gather quantitative and non-quantitative feedback from users. We've agreed on two events:

  1. $ai_metric allows you to create your own custom metrics.
    • $ai_metric_name - optional metric name.
    • $ai_metric_value - required metric value.
  2. $ai_feedback allows you to capture user-written feedback.
    • $ai_feedback_text - required text.

Screenshot

Screenshot 2025-01-23 at 13 10 50

Changes

  • Support metric and feedback events in the traces query runner.
  • Output metrics and feedback in the trace header.
  • Make the trace header slightly lighter and smaller.
  • Dogfood both events.
  • Add metrics and feedback to the traces scene I've decided to do that in a separate PR.

Does this work well for both Cloud and self-hosted?

Yes

How did you test this code?

Unit & manual testing.

Copy link
Contributor

github-actions bot commented Jan 22, 2025

Size Change: -24 B (0%)

Total Size: 1.16 MB

ℹ️ View Unchanged
Filename Size Change
frontend/dist/toolbar.js 1.16 MB -24 B (0%)

compressed-size-action

@skoob13 skoob13 changed the title feat(llm-observability): llm metrics feat(llm-observability): llm metrics and feedback Jan 23, 2025
@skoob13 skoob13 marked this pull request as ready for review January 23, 2025 16:10
@skoob13 skoob13 requested review from k11kirky and Twixes January 23, 2025 16:10
@posthog-bot
Copy link
Contributor

📸 UI snapshots have been updated

1 snapshot changes in total. 0 added, 1 modified, 0 deleted:

  • chromium: 0 added, 1 modified, 0 deleted (diff for shard 1)
  • webkit: 0 added, 0 modified, 0 deleted

Triggered by this commit.

👉 Review this PR's diff of snapshots.

@Twixes
Copy link
Member

Twixes commented Jan 23, 2025

The UI changes overlap a bit with my #27841, but we'll figure it out

Copy link
Member

@Twixes Twixes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good to go. Leaving a few comments about the UX of capturing and displaying metrics/feedback, but do with this as you will

Comment on lines +23 to +24
<br />
<br />
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why two new lines and not some margin?

export function MetricTag({ properties }: MetricTagProps): JSX.Element {
const { $ai_metric_name: metricName, $ai_metric_value: metricValue } = properties
const strValue = String(metricValue)
const isValueLong = strValue.length > 10
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

10 feels super short honestly!

properties: Record<string, any>
}

export function MetricTag({ properties }: MetricTagProps): JSX.Element {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For metrics, it'd be useful to say that this is a metric, as it's sort of unusually magical that this tag is pulled in from a completely different event (the magic could also be clearer with Feedback, but there at least the event has feedback in the name, while metrics have arbitrary names)

Screenshot 2025-01-24 at 12 59 50

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I agree with you. I'd also add a general overview of a trace with its metrics and feedback, which would be much easier to read. These short tags were supposed to help you if you want to find metrics or feedback quickly, for example, inside a generation.

Comment on lines +342 to 345
posthog.capture('$ai_feedback', {
$ai_feedback_text: feedback,
$ai_trace_id: traceId,
})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We must add a posthog-js method for this – users should never have to type dollar-prefixed properties, because they're the features we offer out of the box. This way we ensure the SDKs and the product's UI all work together neatly, and the user has a smooth time implementing. Let's do this as a follow-up

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense. If we keep adding functionality, the bundle size will concern me, but the plugins probably have already solved this.

@skoob13 skoob13 merged commit 54da829 into master Jan 24, 2025
99 checks passed
@skoob13 skoob13 deleted the feat/llm-observability-scores branch January 24, 2025 13:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants