Add blog site and docs grouping (#1499)
* move gtag to head

* fix src docstrings and links

* restructure docs and add blog

* hide watermark for demo

* format blog posts and redirect

* rm links

* add assets to lfs

* update paths
sfc-gh-chu authored Sep 25, 2024
1 parent cf01a93 commit df64fec
Showing 176 changed files with 387 additions and 534 deletions.
3 changes: 3 additions & 0 deletions .gitattributes
@@ -0,0 +1,3 @@
*.png filter=lfs diff=lfs merge=lfs -text
*.jpg filter=lfs diff=lfs merge=lfs -text
*.gif filter=lfs diff=lfs merge=lfs -text
4 changes: 2 additions & 2 deletions POLICIES.md
@@ -34,13 +34,13 @@ will occur at the introduction of the warning period.

- Starting 1.0, the `trulens_eval` package is being deprecated in favor of
`trulens` and several associated required and optional packages. See
[trulens_eval migration](/trulens/guides/trulens_eval_migration) for details.
[trulens_eval migration](/component_guides/other/trulens_eval_migration/) for details.

- Warning period: 2024-09-01 (`trulens-eval==1.0.1`) to 2024-10-14.
Backwards compatibility during the warning period is provided by the new
content of the `trulens_eval` package which provides aliases to the features
in their new locations. See
[trulens_eval](trulens/api/trulens_eval/index.md).
[trulens_eval](/reference/trulens_eval/index.md).

- Deprecated period: 2024-10-14 to 2025-12-01. Usage of `trulens_eval` will
produce errors indicating deprecation.
8 changes: 4 additions & 4 deletions README.md
@@ -3,7 +3,7 @@
![GitHub](https://img.shields.io/github/license/truera/trulens)
![PyPI - Downloads](https://img.shields.io/pypi/dm/trulens)
[![Slack](https://img.shields.io/badge/slack-join-green?logo=slack)](https://communityinviter.com/apps/aiqualityforum/josh)
[![Docs](https://img.shields.io/badge/docs-trulens.org-blue)](https://www.trulens.org/trulens/getting_started/)
[![Docs](https://img.shields.io/badge/docs-trulens.org-blue)](https://www.trulens.org/getting_started/)
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/truera/trulens/blob/main/examples/quickstart/langchain_quickstart.ipynb)

# 🦑 Welcome to TruLens!
@@ -19,9 +19,9 @@ Fine-grained, stack-agnostic instrumentation and comprehensive evaluations help
you to identify failure modes & systematically iterate to improve your
application.

Read more about the core concepts behind TruLens including [Feedback Functions](https://www.trulens.org/trulens/getting_started/core_concepts/feedback_functions/),
[The RAG Triad](https://www.trulens.org/trulens/getting_started/core_concepts/rag_triad/),
and [Honest, Harmless and Helpful Evals](https://www.trulens.org/trulens/getting_started/core_concepts/honest_harmless_helpful_evals/).
Read more about the core concepts behind TruLens including [Feedback Functions](https://www.trulens.org/getting_started/core_concepts/feedback_functions/),
[The RAG Triad](https://www.trulens.org/getting_started/core_concepts/rag_triad/),
and [Honest, Harmless and Helpful Evals](https://www.trulens.org/getting_started/core_concepts/honest_harmless_helpful_evals/).

## TruLens in the development workflow

Binary file modified docs/assets/favicon/android-chrome-192x192.png
Binary file modified docs/assets/favicon/android-chrome-512x512.png
Binary file modified docs/assets/favicon/apple-touch-icon.png
Binary file modified docs/assets/favicon/favicon-16x16.png
Binary file modified docs/assets/favicon/favicon-32x32.png
Binary file modified docs/assets/favicon/mstile-144x144.png
Binary file modified docs/assets/favicon/mstile-150x150.png
Binary file modified docs/assets/favicon/mstile-310x150.png
Binary file modified docs/assets/favicon/mstile-310x310.png
Binary file modified docs/assets/favicon/mstile-70x70.png
Binary file modified docs/assets/images/Chain_Explore.png
Binary file modified docs/assets/images/Evaluations.png
Binary file modified docs/assets/images/Honest_Harmless_Helpful_Evals.png
Binary file modified docs/assets/images/Leaderboard.png
Binary file modified docs/assets/images/Neural_Network_Explainability.png
Binary file modified docs/assets/images/RAG_Triad.png
Binary file modified docs/assets/images/Range_of_Feedback_Functions.png
Binary file modified docs/assets/images/TruLens_Architecture.png
Binary file modified docs/assets/images/appui/apps.png
Binary file modified docs/assets/images/appui/blank_session.png
Binary file modified docs/assets/images/appui/running_session.png
Binary file modified docs/assets/images/trulens_1_release_graphic_modular.png
Binary file modified docs/assets/images/trulens_1_release_graphic_split.png
1 change: 1 addition & 0 deletions docs/blog/index.md
@@ -0,0 +1 @@
# Blog
Original file line number Diff line number Diff line change
@@ -1,13 +1,23 @@
---
categories:
- General
date: 2024-08-30
---


# Moving to TruLens v1: Reliable and Modular Logging and Evaluation

It has always been our goal to make it easy to build trustworthy LLM applications. Since we launched last May, the package has grown up before our eyes, morphing from a hacked-together addition to an existing project (`trulens-explain`) to a thriving, agnostic standard for tracking and evaluating LLM apps. Along the way, we’ve experienced growing pains and discovered inefficiencies in the way TruLens was built. We’ve also heard that the reasons people use TruLens today are diverse, and many of its use cases do not require its full footprint. Today we’re announcing an extensive re-architecture of TruLens that aims to give developers a stable, modular platform for logging and evaluation they can rely on.
It has always been our goal to make it easy to build trustworthy LLM applications. Since we launched last May, the package has grown up before our eyes, morphing from a hacked-together addition to an existing project (`trulens-explain`) to a thriving, agnostic standard for tracking and evaluating LLM apps. Along the way, we’ve experienced growing pains and discovered inefficiencies in the way TruLens was built. We’ve also heard that the reasons people use TruLens today are diverse, and many of its use cases do not require its full footprint.

Today we’re announcing an extensive re-architecture of TruLens that aims to give developers a stable, modular platform for logging and evaluation they can rely on.

<!-- more -->

## **Split off trulens-eval from trulens-explain**

We split off `trulens-eval` from `trulens-explain` and let `trulens-eval` take over the `trulens` package name. _TruLens-Eval_ is now renamed to _TruLens_ and sits at the root of the [_TruLens_ repo](https://github.com/truera/trulens), while _TruLens-Explain_ has been moved to its own [repository](https://github.com/truera/trulens_explain) and is installable as `trulens-explain`.

![TruLens 1.0 Release Graphics](../assets/images/trulens_1_release_graphic_split.png)
![TruLens 1.0 Release Graphics](../../assets/images/trulens_1_release_graphic_split.png)

## **Separate TruLens-Eval into different trulens packages**

@@ -20,7 +30,7 @@ Next, we modularized _TruLens_ into a family of different packages, described be
* Each `trulens-providers-` prefixed package describes a set of integrations with other libraries for running feedback functions. Today, we offer an extensive set of integrations that allow you to run feedback functions on top of virtually any LLM. These integrations can be installed as standalone packages and include `trulens-providers-openai`, `trulens-providers-huggingface`, `trulens-providers-litellm`, `trulens-providers-langchain`, `trulens-providers-bedrock`, and `trulens-providers-cortex`.
* `trulens-connectors-` prefixed packages provide ways to log _TruLens_ traces and evaluations to other databases. In addition to connecting to any `sqlalchemy` database with `trulens-core`, we've added `trulens-connectors-snowflake`, tailored specifically to connecting to Snowflake. We plan to add more connectors over time. A minimal usage sketch follows the diagram below.

![TruLens 1.0 Release Graphics](../assets/images/trulens_1_release_graphic_modular.png)
![TruLens 1.0 Release Graphics](../../assets/images/trulens_1_release_graphic_modular.png)
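
To make the modular layout concrete, here is a minimal sketch of wiring `trulens-core` together with a provider package. It assumes `trulens-core` and `trulens-providers-openai` are installed and an OpenAI API key is configured; any other provider package can be swapped in the same way.

```python
# A minimal sketch, not an exhaustive setup: combine trulens-core with a
# provider package to define a feedback function.
# Assumes: pip install trulens-core trulens-providers-openai, and
# OPENAI_API_KEY set in the environment.
from trulens.core import Feedback, TruSession
from trulens.providers.openai import OpenAI

session = TruSession()  # trulens-core alone gives you logging to a local SQLite database

provider = OpenAI()  # Huggingface, LiteLLM, Bedrock, etc. providers work the same way
f_answer_relevance = Feedback(
    provider.relevance, name="Answer Relevance"
).on_input_output()
```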

## **Versioning and Backwards Compatibility**

@@ -30,13 +40,13 @@ The base install of `trulens` will install `trulens-core`, `trulens-feedback` an

Starting 1.0, the `trulens_eval` package is being deprecated in favor of `trulens` and several associated required and optional packages.

Until 2024-10-14, backwards compatibility during the warning period is provided by the new content of the `trulens_eval` package, which provides aliases to the features in their new locations. See [trulens_eval](./api/trulens_eval/index.md).
Until 2024-10-14, backwards compatibility during the warning period is provided by the new content of the `trulens_eval` package, which provides aliases to the features in their new locations. See [trulens_eval](../../reference/trulens_eval/index.md).

From 2024-10-15 until 2025-12-01, usage of `trulens_eval` will produce errors indicating deprecation.

Beginning 2024-12-01, installation of the latest version of `trulens_eval` will itself be an error, with a message that `trulens_eval` is no longer maintained.

Along with this change, we’ve also included a [migration guide](./guides/trulens_eval_migration.md) for moving to TruLens v1.
Along with this change, we’ve also included a [migration guide](../../component_guides/other/trulens_eval_migration.md) for moving to TruLens v1.

Please give us feedback on GitHub by creating [issues](https://github.com/truera/trulens/issues) and starting [discussions](https://github.com/truera/trulens/discussions). You can also chime in on [slack](https://communityinviter.com/apps/aiqualityforum/josh).

@@ -272,7 +282,7 @@ To bring these changes to life, we've also added new filters to the Leaderboard

## **First-class support for Ground Truth Evaluation**

Along with the high level changes in TruLens v1, ground truth can now be persisted in SQL-compatible datastores and loaded on demand as pandas dataframe objects in memory as required. By enabling the persistence of ground truth data, you can now easily store and share ground truth data used across your team.
Along with the high level changes in TruLens v1, ground truth can now be persisted in SQL-compatible datastores and loaded on demand as pandas DataFrame objects in memory as required. By enabling the persistence of ground truth data, you can now easily store and share ground truth data used across your team.

!!! example "Using Ground Truth Data"

@@ -319,22 +329,22 @@ Along with the high level changes in TruLens v1, ground truth can now be persist
).on_input_output()
```

See this in action in the new [Ground Truth Persistence Quickstart](./getting_started/quickstarts/groundtruth_dataset_persistence.ipynb)
See this in action in the new [Ground Truth Persistence Quickstart](../../getting_started/quickstarts/groundtruth_dataset_persistence.ipynb)
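
For reference, the persistence flow in that quickstart looks roughly like the sketch below. The session-level dataset methods and the `GroundTruthAgreement` feedback shown here follow the quickstart; treat the exact argument names as assumptions and consult the notebook for current signatures.

```python
import pandas as pd
from trulens.core import Feedback, TruSession
from trulens.feedback import GroundTruthAgreement
from trulens.providers.openai import OpenAI

session = TruSession()

# Persist a small golden set to the session's SQL-compatible datastore.
golden_set = pd.DataFrame(
    {
        "query": ["What is TruLens?"],
        "expected_response": ["An open-source library for tracking and evaluating LLM apps."],
    }
)
session.add_ground_truth_to_dataset(
    dataset_name="demo_golden_set",
    ground_truth_df=golden_set,
    dataset_metadata={"domain": "docs-qa"},
)

# Load it back on demand as a pandas DataFrame and use it in a feedback function.
ground_truth_df = session.get_ground_truth("demo_golden_set")
f_groundtruth = Feedback(
    GroundTruthAgreement(ground_truth_df, provider=OpenAI()).agreement_measure,
    name="Ground Truth Agreement",
).on_input_output()
```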

## **New Component Guides and TruLens Cookbook**

On the top-level of TruLens docs, we previously had separated out Evaluation, Evaluation Benchmarks, Tracking and Guardrails. These are now combined to form the new [Component Guides](../tracking/instrumentation/).
On the top-level of TruLens docs, we previously had separated out Evaluation, Evaluation Benchmarks, Tracking and Guardrails. These are now combined to form the new [Component Guides](../../component_guides/index.md).

We also pulled in our extensive GitHub examples library directly into docs. This should make it easier for you to learn about all of the different ways to get started using TruLens. You can find these examples in the top-level navigation under ["Cookbook"](../../examples/).
We also pulled in our extensive GitHub examples library directly into docs. This should make it easier for you to learn about all of the different ways to get started using TruLens. You can find these examples in the top-level navigation under ["Cookbook"](../../cookbook/index.md).

## **Automatic Migration with Grit**

To assist you in migrating your codebase to _TruLens_ to v1.0, we've published a `grit` pattern. You can migrade your codebase [online](https://docs.grit.io/patterns/library/trulens_eval_migration#migrate-and-use-tru-session), or by using `grit` on the command line.
To assist you in migrating your codebase to _TruLens_ v1.0, we've published a `grit` pattern. You can migrate your codebase [online](https://docs.grit.io/patterns/library/trulens_eval_migration#migrate-and-use-tru-session), or by using `grit` on the command line.

Read more detailed instructions in our [migration guide](./guides/trulens_eval_migration.md)
Read more detailed instructions in our [migration guide](../../component_guides/other/trulens_eval_migration.md)

Be sure to audit its changes: we suggest ensuring you have a clean working tree beforehand.
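
For a sense of what the migration touches, the typical import changes look roughly like the sketch below. This is illustrative rather than an exhaustive mapping; the grit pattern and the migration guide cover the full set of renames.

```python
# Before (trulens_eval < 1.0), a typical setup looked like:
#   from trulens_eval import Tru, Feedback, TruChain
#   tru = Tru()

# After (trulens >= 1.0), the same setup with the new package layout:
from trulens.core import Feedback, TruSession
from trulens.apps.langchain import TruChain  # app wrappers now live under trulens.apps.*

session = TruSession()  # Tru() is replaced by TruSession()

# CLI alternative (assuming the grit CLI is installed and the pattern name
# matches the library page linked above):
#   grit apply trulens_eval_migration
```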

## **Conclusion**

Ready to get started with the v1 stable release of TruLens? Check out our [migration guide](./guides/trulens_eval_migration.md), or just jump in to the [quickstart](./getting_started/quickstarts/quickstart.ipynb)!
Ready to get started with the v1 stable release of TruLens? Check out our [migration guide](../../component_guides/other/trulens_eval_migration.md), or just jump in to the [quickstart](../../getting_started/quickstarts/quickstart.ipynb)!
96 changes: 96 additions & 0 deletions docs/blog/posts/trulens_1_1_dashboard_updates.md
@@ -0,0 +1,96 @@
---
categories:
- General
date: 2024-09-25
---

# What's new in TruLens 1.1: Dashboard Comparison View, Multi-App Support, Metadata Editing, and More!

TruLens 1.1.0 has been released! This release includes a number of improvements to the TruLens dashboard, including a new comparison view and a more intuitive user interface. We have also made several improvements to performance and usability.

<!-- more -->

## Dashboard Highlights

An overhaul of the TruLens dashboard has been released with major features and improvements. Here are some of the highlights:

### Global Enhancements

![The New TruLens Dashboard](../assets/trulens_1_1_dashboard_updates/dashboard_global_features.gif)

#### Global app selector

TruLens 1.0 introduced app versioning, allowing the performance of LLM apps to be tracked across different versions. When multiple apps are tracked, the dashboard sidebar now includes an app selector to quickly navigate to the desired application.

#### App version and Record search and filtering

All pages in the dashboard now include relevant search and filter options to identify app versions and records quickly. The search bar allows filtering records and app versions by name or by other metadata fields. This makes it easy to find specific records or applications and compare their performance over time.

#### Performance enhancements

TruLens 1.1.0 includes several performance enhancements to improve the scalability and speed of the dashboard. The dashboard now queries only the most recent records unless specified otherwise. This helps prevent out-of-memory errors and improves the overall performance of the dashboard.

Furthermore, all record and app data is now cached locally, reducing network latency on refreshes. This results in faster load times and a more responsive user experience. The cache is cleared automatically every 15 minutes or manually with the new `Refresh Data` button.

### Leaderboard

![Leaderboard enhancements](../assets/trulens_1_1_dashboard_updates/leaderboard_metadata.gif)

The leaderboard is now displayed in a tabular format, with each row representing a different application version. The grid data can be sorted and filtered.

#### App Version Pinning

App versions can now be pinned to the top of the leaderboard for quick access, making it easy to track the performance of specific versions over time. Pinned versions are highlighted for easy identification, and a toggle filters the leaderboard to show only pinned versions.

#### Metadata Editing

App metadata is a central part of this leaderboard update, making it easier to identify and track application versions. In addition to being displayed on the leaderboard, metadata fields are now editable after ingestion by double-clicking a cell, or by bulk-selecting rows and choosing the `Add/Edit Metadata` option. New fields can also be added with the `Add/Edit Metadata` button.

A selector at the top of the leaderboard allows toggling which app metadata fields are displayed to better customize the view.

#### Virtual App Creation

To bring in evaluation data from a non-TruLens app (e.g., another runtime environment or a benchmark from a third-party source), the `Add Virtual App` button has been added to the leaderboard! This creates a virtual app with user-defined metadata fields and evaluation data that can be used in the leaderboard and comparison view.

### Comparison View

This update introduces a brand-new comparison page that enables the comparison of up to 5 different app versions side by side.

#### App-level comparison

![App-level comparison](../assets/trulens_1_1_dashboard_updates/compare_app.png)

The comparison view allows performance to be compared across different app versions side by side. The aggregate feedback results for each app version are plotted for each of the shared feedback functions, making it easy to see how performance has changed.

#### Record-level comparison

![Record-level comparison](../assets/trulens_1_1_dashboard_updates/compare_record.png)

To dig deeper into the performance of individual records, the comparison view also allows overlapping records to be compared side by side. The dashboard computes a diff or variance score (depending on the number of apps compared) to identify interesting or anomalous records with the most significant performance differences. In addition to viewing the distribution of feedback scores, this page also displays the trace data of each record side by side.

### Records Page

![Records Page Flow](../assets/trulens_1_1_dashboard_updates/record_page.gif)

The records page has been updated with a more intuitive flow for viewing and comparing records. The page now includes a search bar to quickly find specific records, as well as app metadata filters.

#### Additional features

- URL serialization of key dashboard states
- Dark mode
- Improved error handling
- Fragmented rendering


#### Try it out!

We hope you enjoy the new features and improvements in TruLens 1.1.0! To get started, use [`run_dashboard`][trulens.dashboard.run.run_dashboard] with a TruSession object:


```python
from trulens.core import TruSession
from trulens.dashboard import run_dashboard

session = TruSession(...)
run_dashboard(session)
```
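
If the default port is already in use, `run_dashboard` also accepts a `port` argument (for example, `run_dashboard(session, port=8502)`); check the linked API reference for the full set of options available in your installed version.
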
Original file line number Diff line number Diff line change
Expand Up @@ -46,7 +46,7 @@ That is,
is a plain python method that accepts the prompt and context, both strings, and
produces a float (assumed to be between 0.0 and 1.0).
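
As an illustration, a hand-rolled feedback function of that shape can be as simple as the following toy sketch (a heuristic for illustration only, not one of the built-in implementations):

```python
def prompt_context_overlap(prompt: str, context: str) -> float:
    """Toy feedback function: fraction of prompt words that also appear in the context."""
    prompt_words = set(prompt.lower().split())
    if not prompt_words:
        return 0.0
    context_words = set(context.lower().split())
    return len(prompt_words & context_words) / len(prompt_words)
```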

Read more about [feedback implementations](../feedback_implementations/index.md)
Read more about [feedback implementations](./feedback_implementations/index.md)

## Feedback constructor

@@ -70,8 +70,8 @@ states that the first two arguments to
respectively.

Read more about [argument
specification](../feedback_selectors/selecting_components.md) and [selector
shortcuts](../feedback_selectors/selector_shortcuts.md).
specification](./feedback_selectors/selecting_components.md) and [selector
shortcuts](./feedback_selectors/selector_shortcuts.md).

## Aggregation specification

@@ -85,4 +85,4 @@ next section. This function is called on the `float` results of feedback
function evaluations to produce a single float. The default is
[numpy.mean][numpy.mean].

Read more about [feedback aggregation](../feedback_aggregation/index.md).
Read more about [feedback aggregation](feedback_aggregation.md).
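
Putting the constructor, selectors, and aggregation together, a full feedback declaration looks roughly like the sketch below. The selector path `Select.RecordCalls.retrieve.rets[:]` is a placeholder for wherever your instrumented app exposes retrieved context; adjust it to your app's structure.

```python
import numpy as np

from trulens.core import Feedback, Select
from trulens.providers.openai import OpenAI

provider = OpenAI()

f_context_relevance = (
    Feedback(provider.context_relevance_with_cot_reasons, name="Context Relevance")
    .on_input()                               # first argument: the user prompt
    .on(Select.RecordCalls.retrieve.rets[:])  # second argument: each retrieved context chunk
    .aggregate(np.mean)                       # collapse per-chunk scores into a single float
)
```
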
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
# Feedback Implementations

TruLens constructs feedback functions by a [**_feedback provider_**][trulens.core.feedback.Provider], and [**_feedback implementation_**](../feedback_implementations/index.md).
TruLens constructs feedback functions by combining a [**_feedback provider_**][trulens.core.feedback.Provider] and a **_feedback implementation_**.

This page documents the feedback implementations available in _TruLens_.

25 changes: 25 additions & 0 deletions docs/component_guides/evaluation/feedback_implementations/stock.md
@@ -0,0 +1,25 @@
# Stock Feedback Functions

## Classification-based

### 🤗 Huggingface

API Reference: [Huggingface][trulens.providers.huggingface.Huggingface].

### OpenAI

API Reference: [OpenAI][trulens.providers.openai.OpenAI].

## Generation-based: LLMProvider

API Reference: [LLMProvider][trulens.feedback.LLMProvider].

## Embedding-based

API Reference: [Embeddings][trulens.feedback.embeddings].

## Combinations

### Ground Truth Agreement

API Reference: [GroundTruthAgreement][trulens.feedback.groundtruth].
Original file line number Diff line number Diff line change
@@ -2,7 +2,7 @@

TruLens constructs feedback functions by combining more general models, known as
the [**_feedback provider_**][trulens.core.feedback.Provider], and
[**_feedback implementation_**](../feedback_implementations/index.md) made up of
[**_feedback implementation_**](./feedback_implementations/index.md) made up of
carefully constructed prompts and custom logic tailored to perform a particular
evaluation task.
