Add blog site and docs grouping (#1499)

* move gtag to head
* fix src docstrings and links
* restructure docs and add blog
* hide watermark for demo
* format blog posts and redirect
* rm links
* add assets to lfs
* update paths
truera · Sep 25, 2024 · df64fec
1 parent cf01a93
Showing 176 changed files with 387 additions and 534 deletions.
diff --git a/.gitattributes b/.gitattributes
@@ -0,0 +1,3 @@
+*.png filter=lfs diff=lfs merge=lfs -text
+*.jpg filter=lfs diff=lfs merge=lfs -text
+*.gif filter=lfs diff=lfs merge=lfs -text
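
The three patterns added to `.gitattributes` route image files through Git LFS. As a rough illustration of which of the paths touched by this commit they cover, the sketch below approximates gitattributes matching with Python's `fnmatchcase`, assuming that a pattern without a slash is compared against the file's basename at any depth. The helper name `tracked_by_lfs` is hypothetical and is not part of Git or TruLens.

```python
from fnmatch import fnmatchcase
from posixpath import basename

# Patterns copied from the .gitattributes hunk above.
LFS_PATTERNS = ["*.png", "*.jpg", "*.gif"]


def tracked_by_lfs(path: str) -> bool:
    """Approximate gitattributes matching (an assumption, not Git's own code):
    a pattern without a slash is compared against the basename only."""
    name = basename(path)
    return any(fnmatchcase(name, pattern) for pattern in LFS_PATTERNS)


# Paths taken from the file list of this commit.
paths = [
    "docs/assets/images/RAG_Triad.png",
    "docs/blog/assets/trulens_1_1_dashboard_updates/record_page.gif",
    "docs/blog/index.md",
]
lfs_paths = [p for p in paths if tracked_by_lfs(p)]
print(lfs_paths)
```

Git's real matcher has more rules (negation, directory-anchored patterns), so treat this only as a quick sanity check.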
diff --git a/POLICIES.md b/POLICIES.md
@@ -34,13 +34,13 @@ will occur at the introduction of the warning period.
 
 - Starting 1.0, the `trulens_eval` package is being deprecated in favor of
   `trulens` and several associated required and optional packages. See
-  [trulens_eval migration](/trulens/guides/trulens_eval_migration) for details.
+  [trulens_eval migration](/component_guides/other/trulens_eval_migration/) for details.
 
     - Warning period: 2024-09-01 (`trulens-eval==1.0.1`) to 2024-10-14.
     Backwards compatibility during the warning period is provided by the new
     content of the `trulens_eval` package which provides aliases to the features
     in their new locations. See
-    [trulens_eval](trulens/api/trulens_eval/index.md).
+    [trulens_eval](/reference/trulens_eval/index.md).
 
     - Deprecated period: 2024-10-14 to 2025-12-01. Usage of `trulens_eval` will
   	produce errors indicating deprecation.

diff --git a/README.md b/README.md
@@ -3,7 +3,7 @@
 ![GitHub](https://img.shields.io/github/license/truera/trulens)
 ![PyPI - Downloads](https://img.shields.io/pypi/dm/trulens)
 [![Slack](https://img.shields.io/badge/slack-join-green?logo=slack)](https://communityinviter.com/apps/aiqualityforum/josh)
-[![Docs](https://img.shields.io/badge/docs-trulens.org-blue)](https://www.trulens.org/trulens/getting_started/)
+[![Docs](https://img.shields.io/badge/docs-trulens.org-blue)](https://www.trulens.org/getting_started/)
 [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/truera/trulens/blob/main/examples/quickstart/langchain_quickstart.ipynb)
 
 # 🦑 Welcome to TruLens!
@@ -19,9 +19,9 @@ Fine-grained, stack-agnostic instrumentation and comprehensive evaluations help
 you to identify failure modes & systematically iterate to improve your
 application.
 
-Read more about the core concepts behind TruLens including [Feedback Functions](https://www.trulens.org/trulens/getting_started/core_concepts/feedback_functions/),
-[The RAG Triad](https://www.trulens.org/trulens/getting_started/core_concepts/rag_triad/),
-and [Honest, Harmless and Helpful Evals](https://www.trulens.org/trulens/getting_started/core_concepts/honest_harmless_helpful_evals/).
+Read more about the core concepts behind TruLens including [Feedback Functions](https://www.trulens.org/getting_started/core_concepts/feedback_functions/),
+[The RAG Triad](https://www.trulens.org/getting_started/core_concepts/rag_triad/),
+and [Honest, Harmless and Helpful Evals](https://www.trulens.org/getting_started/core_concepts/honest_harmless_helpful_evals/).
 
 ## TruLens in the development workflow
 

diff --git a/docs/assets/favicon/android-chrome-192x192.png b/docs/assets/favicon/android-chrome-192x192.png
diff --git a/docs/assets/favicon/android-chrome-512x512.png b/docs/assets/favicon/android-chrome-512x512.png
diff --git a/docs/assets/favicon/apple-touch-icon.png b/docs/assets/favicon/apple-touch-icon.png
diff --git a/docs/assets/favicon/favicon-16x16.png b/docs/assets/favicon/favicon-16x16.png
diff --git a/docs/assets/favicon/favicon-32x32.png b/docs/assets/favicon/favicon-32x32.png
diff --git a/docs/assets/favicon/mstile-144x144.png b/docs/assets/favicon/mstile-144x144.png
diff --git a/docs/assets/favicon/mstile-150x150.png b/docs/assets/favicon/mstile-150x150.png
diff --git a/docs/assets/favicon/mstile-310x150.png b/docs/assets/favicon/mstile-310x150.png
diff --git a/docs/assets/favicon/mstile-310x310.png b/docs/assets/favicon/mstile-310x310.png
diff --git a/docs/assets/favicon/mstile-70x70.png b/docs/assets/favicon/mstile-70x70.png
diff --git a/docs/assets/images/Chain_Explore.png b/docs/assets/images/Chain_Explore.png
diff --git a/docs/assets/images/Evaluations.png b/docs/assets/images/Evaluations.png
diff --git a/docs/assets/images/Honest_Harmless_Helpful_Evals.png b/docs/assets/images/Honest_Harmless_Helpful_Evals.png
diff --git a/docs/assets/images/Leaderboard.png b/docs/assets/images/Leaderboard.png
diff --git a/docs/assets/images/Neural_Network_Explainability.png b/docs/assets/images/Neural_Network_Explainability.png
diff --git a/docs/assets/images/RAG_Triad.png b/docs/assets/images/RAG_Triad.png
diff --git a/docs/assets/images/Range_of_Feedback_Functions.png b/docs/assets/images/Range_of_Feedback_Functions.png
diff --git a/docs/assets/images/TruLens_Architecture.png b/docs/assets/images/TruLens_Architecture.png
diff --git a/docs/assets/images/appui/apps.png b/docs/assets/images/appui/apps.png
diff --git a/docs/assets/images/appui/blank_session.png b/docs/assets/images/appui/blank_session.png
diff --git a/docs/assets/images/appui/running_session.png b/docs/assets/images/appui/running_session.png
diff --git a/docs/assets/images/trulens_1_release_graphic_modular.png b/docs/assets/images/trulens_1_release_graphic_modular.png
diff --git a/docs/assets/images/trulens_1_release_graphic_split.png b/docs/assets/images/trulens_1_release_graphic_split.png
diff --git a/docs/blog/assets/trulens_1_1_dashboard_updates/compare_app.png b/docs/blog/assets/trulens_1_1_dashboard_updates/compare_app.png
diff --git a/docs/blog/assets/trulens_1_1_dashboard_updates/compare_record.png b/docs/blog/assets/trulens_1_1_dashboard_updates/compare_record.png
diff --git a/docs/blog/assets/trulens_1_1_dashboard_updates/dashboard_global_features.gif b/docs/blog/assets/trulens_1_1_dashboard_updates/dashboard_global_features.gif
diff --git a/docs/blog/assets/trulens_1_1_dashboard_updates/leaderboard_metadata.gif b/docs/blog/assets/trulens_1_1_dashboard_updates/leaderboard_metadata.gif
diff --git a/docs/blog/assets/trulens_1_1_dashboard_updates/record_page.gif b/docs/blog/assets/trulens_1_1_dashboard_updates/record_page.gif
diff --git a/docs/blog/index.md b/docs/blog/index.md
@@ -0,0 +1 @@
+# Blog
diff --git a/docs/trulens/release_blog_1dot.md → docs/blog/posts/release_blog_1dot.md b/docs/trulens/release_blog_1dot.md → docs/blog/posts/release_blog_1dot.md
@@ -1,13 +1,23 @@
+---
+categories:
+  - General
+date: 2024-08-30
+---
+
+
 # Moving to TruLens v1: Reliable and Modular Logging and Evaluation
 
-It has always been our goal to make it easy to build trustworthy LLM applications. Since we launched last May, the package has grown up before our eyes, morphing from a hacked-together addition to an existing project (`trulens-explain`) to a thriving, agnostic standard for tracking and evaluating LLM apps. Along the way, we’ve experienced growing pains and discovered inefficiencies in the way TruLens was built. We’ve also heard that the reasons people use TruLens today are diverse, and many of its use cases do not require its full footprint. Today we’re announcing an extensive re-architecture of TruLens that aims to give developers a stable, modular platform for logging and evaluation they can rely on.
+It has always been our goal to make it easy to build trustworthy LLM applications. Since we launched last May, the package has grown up before our eyes, morphing from a hacked-together addition to an existing project (`trulens-explain`) to a thriving, agnostic standard for tracking and evaluating LLM apps. Along the way, we’ve experienced growing pains and discovered inefficiencies in the way TruLens was built. We’ve also heard that the reasons people use TruLens today are diverse, and many of its use cases do not require its full footprint.
+
+Today we’re announcing an extensive re-architecture of TruLens that aims to give developers a stable, modular platform for logging and evaluation they can rely on.
 
+<!-- more -->
 
 ## **Split off trulens-eval from trulens-explain**
 
 Split off `trulens-eval` from `trulens-explain`, and let `trulens-eval` take over the `trulens` package name. _TruLens-Eval_ is now renamed to _TruLens_ and sits at the root of the [_TruLens_ repo](https://github.com/truera/trulens), while _TruLens-Explain_ has been moved to its own [repository](https://github.com/truera/trulens_explain), and is installable at `trulens-explain`.
 
-![TruLens 1.0 Release Graphics](../assets/images/trulens_1_release_graphic_split.png)
+![TruLens 1.0 Release Graphics](../../assets/images/trulens_1_release_graphic_split.png)
 
 ## **Separate TruLens-Eval into different trulens packages**
 
@@ -20,7 +30,7 @@ Next, we modularized _TruLens_ into a family of different packages, described be
 * `trulens-providers-` prefixed package describes a set of integrations with other libraries for running feedback functions. Today, we offer an extensive set of integrations that allow you to run feedback functions on top of virtually any LLM. These integrations can be installed as standalone packages, and include: `trulens-providers-openai`, `trulens-providers-huggingface`, `trulens-providers-litellm`, `trulens-providers-langchain`, `trulens-providers-bedrock`, `trulens-providers-cortex`.
 * `trulens-connectors-` packages provide ways to log _TruLens_ traces and evaluations to other databases. In addition to connecting to any `sqlalchemy` database with `trulens-core`, we've added `trulens-connectors-snowflake`, tailored specifically to connecting to Snowflake. We plan to add more connectors over time.
 
-![TruLens 1.0 Release Graphics](../assets/images/trulens_1_release_graphic_modular.png)
+![TruLens 1.0 Release Graphics](../../assets/images/trulens_1_release_graphic_modular.png)
 
 ## **Versioning and Backwards Compatibility**
 
@@ -30,13 +40,13 @@ The base install of `trulens` will install `trulens-core`, `trulens-feedback` an
 
 Starting 1.0, the `trulens_eval` package is being deprecated in favor of `trulens` and several associated required and optional packages.
 
-Until 2024-10-14, backwards compatibility during the warning period is provided by the new content of the `trulens_eval` package which provides aliases to the in their new locations. See [trulens_eval](./api/trulens_eval/index.md).
+Until 2024-10-14, backwards compatibility during the warning period is provided by the new content of the `trulens_eval` package, which provides aliases to the features in their new locations. See [trulens_eval](../../reference/trulens_eval/index.md).
 
 From 2024-10-15 until 2025-12-01, usage of `trulens_eval` will produce errors indicating deprecation.
 
 Beginning 2024-12-01, installation of the latest version of `trulens_eval` will itself be an error, with a message that `trulens_eval` is no longer maintained.
 
-Along with this change, we’ve also included a [migration guide](./guides/trulens_eval_migration.md) for moving to TruLens v1.
+Along with this change, we’ve also included a [migration guide](../../component_guides/other/trulens_eval_migration.md) for moving to TruLens v1.
 
 Please give us feedback on GitHub by creating [issues](https://github.com/truera/trulens/issues) and starting [discussions](https://github.com/truera/trulens/discussions). You can also chime in on [slack](https://communityinviter.com/apps/aiqualityforum/josh).
 
@@ -272,7 +282,7 @@ To bring these changes to life, we've also added new filters to the Leaderboard
 
 ## **First-class support for Ground Truth Evaluation**
 
-Along with the high level changes in TruLens v1, ground truth can now be persisted in SQL-compatible datastores and loaded on demand as pandas dataframe objects in memory as required. By enabling the persistence of ground truth data, you can now easily store and share ground truth data used across your team.
+Along with the high level changes in TruLens v1, ground truth can now be persisted in SQL-compatible datastores and loaded on demand as pandas DataFrame objects in memory as required. By enabling the persistence of ground truth data, you can now easily store and share ground truth data used across your team.
 
 !!! example "Using Ground Truth Data"
 
@@ -319,22 +329,22 @@ Along with the high level changes in TruLens v1, ground truth can now be persist
         ).on_input_output()
         ```
 
-See this in action in the new [Ground Truth Persistence Quickstart](./getting_started/quickstarts/groundtruth_dataset_persistence.ipynb)
+See this in action in the new [Ground Truth Persistence Quickstart](../../getting_started/quickstarts/groundtruth_dataset_persistence.ipynb)
 
 ## **New Component Guides and TruLens Cookbook**
 
-On the top-level of TruLens docs, we previously had separated out Evaluation, Evaluation Benchmarks, Tracking and Guardrails. These are now combined to form the new [Component Guides](../tracking/instrumentation/).
+On the top-level of TruLens docs, we previously had separated out Evaluation, Evaluation Benchmarks, Tracking and Guardrails. These are now combined to form the new [Component Guides](../../component_guides/index.md).
 
-We also pulled in our extensive GitHub examples library directly into docs. This should make it easier for you to learn about all of the different ways to get started using TruLens. You can find these examples in the top-level navigation under ["Cookbook"](../../examples/).
+We also pulled in our extensive GitHub examples library directly into docs. This should make it easier for you to learn about all of the different ways to get started using TruLens. You can find these examples in the top-level navigation under ["Cookbook"](../../cookbook/index.md).
 
 ## **Automatic Migration with Grit**
 
-To assist you in migrating your codebase to _TruLens_ to v1.0, we've published a `grit` pattern. You can migrade your codebase [online](https://docs.grit.io/patterns/library/trulens_eval_migration#migrate-and-use-tru-session), or by using `grit` on the command line.
+To assist you in migrating your codebase to _TruLens_ v1.0, we've published a `grit` pattern. You can migrate your codebase [online](https://docs.grit.io/patterns/library/trulens_eval_migration#migrate-and-use-tru-session), or by using `grit` on the command line.
 
-Read more detailed instructions in our [migration guide](./guides/trulens_eval_migration.md)
+Read more detailed instructions in our [migration guide](../../component_guides/other/trulens_eval_migration.md)
 
 Be sure to audit its changes: we suggest ensuring you have a clean working tree beforehand.
 
 ## **Conclusion**
 
-Ready to get started with the v1 stable release of TruLens? Check out our [migration guide](./guides/trulens_eval_migration.md), or just jump in to the [quickstart](./getting_started/quickstarts/quickstart.ipynb)!
+Ready to get started with the v1 stable release of TruLens? Check out our [migration guide](../../component_guides/other/trulens_eval_migration.md), or just jump in to the [quickstart](../../getting_started/quickstarts/quickstart.ipynb)!
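
Most hunks in this file change `../` to `../../` because the post moved one directory deeper, from `docs/trulens/` to `docs/blog/posts/`. The sketch below shows that arithmetic with the standard-library `posixpath` module. It assumes links are resolved relative to the directory of the source Markdown file, and `resolve` is a hypothetical helper, not an MkDocs function.

```python
import posixpath


def resolve(page: str, link: str) -> str:
    """Resolve a relative Markdown link against the directory of its page."""
    return posixpath.normpath(posixpath.join(posixpath.dirname(page), link))


old_page = "docs/trulens/release_blog_1dot.md"
new_page = "docs/blog/posts/release_blog_1dot.md"
image = "assets/images/trulens_1_release_graphic_split.png"

old_target = resolve(old_page, "../" + image)     # link before the move
new_target = resolve(new_page, "../../" + image)  # link after the move
stale_target = resolve(new_page, "../" + image)   # old link left unchanged

print(old_target)
print(new_target)
print(stale_target)
```

The first two results name the same file under `docs/assets/images/`, while the unchanged link would point into `docs/blog/assets/images/` instead.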
diff --git a/docs/blog/posts/trulens_1_1_dashboard_updates.md b/docs/blog/posts/trulens_1_1_dashboard_updates.md
@@ -0,0 +1,96 @@
+---
+categories:
+  - General
+date: 2024-09-25
+---
+
+# What's new in TruLens 1.1: Dashboard Comparison View, Multi-App Support, Metadata Editing, and More!
+
+TruLens 1.1.0 has been released! This release includes a number of improvements to the TruLens dashboard, including a new comparison view and a more intuitive user interface. We have also made several improvements to performance and usability.
+
+<!-- more -->
+
+## Dashboard Highlights
+
+An overhaul of the TruLens dashboard has been released with major features and improvements. Here are some of the highlights:
+
+### Global Enhancements
+
+![The New TruLens Dashboard](../assets/trulens_1_1_dashboard_updates/dashboard_global_features.gif)
+
+#### Global app selector
+
+TruLens 1.0 introduced app versioning, allowing the performance of LLM apps to be tracked across different versions. On multi-app tables, the dashboard sidebar now includes an app selector to quickly navigate to the desired application.
+
+#### App version and Record search and filtering
+
+All pages in the dashboard now include relevant search and filter options to identify app versions and records quickly. The search bar allows filtering records and app versions by name or by other metadata fields. This makes it easy to find specific records or applications and compare their performance over time.
+
+#### Performance enhancements
+
+TruLens 1.1.0 includes several performance enhancements to improve the scalability and speed of the dashboard. The dashboard now queries only the most recent records unless specified otherwise. This helps prevent out-of-memory errors and improves the overall performance of the dashboard.
+
+Furthermore, all record and app data is now cached locally, reducing network latency on refreshes. This results in faster load times and a more responsive user experience. The cache is cleared automatically every 15 minutes or manually with the new `Refresh Data` button.
+
+### Leaderboard
+
+![Leaderboard enhancements](../assets/trulens_1_1_dashboard_updates/leaderboard_metadata.gif)
+
+The leaderboard is now displayed in a tabular format, with each row representing a different application version. The grid data can be sorted and filtered.
+
+#### App Version Pinning
+
+App versions can now be pinned to the top of the leaderboard for easy access. This makes it easy to track the performance of specific versions over time. Pinned versions are highlighted for easy identification, and a toggle filters the leaderboard down to pinned versions only.
+
+#### Metadata Editing
+
+To better identify and track application versions, app metadata visibility is a central part of this leaderboard update. In addition to being displayed on the leaderboard, metadata fields are now editable after ingestion by double-clicking the cell, or bulk selecting and choosing the `Add/Edit Metadata` option. In addition, new fields can be added with the `Add/Edit Metadata` button.
+
+A selector at the top of the leaderboard allows toggling which app metadata fields are displayed to better customize the view.
+
+#### Virtual App Creation
+
+To bring in evaluation data from a non-TruLens app (e.g., another runtime environment or a benchmark from a third-party source), the `Add Virtual App` button has been added to the leaderboard! This creates a virtual app with user-defined metadata fields and evaluation data that can be used in the leaderboard and comparison view.
+
+### Comparison View
+
+This update introduces a brand-new comparison page that enables the comparison of up to 5 different app versions side by side.
+
+#### App-level comparison
+
+![App-level comparison](../assets/trulens_1_1_dashboard_updates/compare_app.png)
+
+The comparison view allows performance comparisons across different app versions side by side. The aggregate feedback function results for each app version are plotted across each of the shared feedback functions, making it easy to see how the performance has changed.
+
+#### Record-level comparison
+
+![Record-level comparison](../assets/trulens_1_1_dashboard_updates/compare_record.png)
+
+To deep dive into the performance of individual records, the comparison view also allows comparison of overlapping records side by side. The dashboard computes a diff or variance score (depending on the number of apps compared against) to identify interesting or anomalous records which have the most significant performance differences. In addition to viewing the distribution of feedback scores, this page also displays the trace data of each record side by side.
+
+### Records Page
+
+![Records Page Flow](../assets/trulens_1_1_dashboard_updates/record_page.gif)
+
+The records page has been updated to include a more intuitive flow for viewing and comparing records. The page now includes a search bar to quickly find specific records as well as matching app metadata filters.
+
+#### Additional features
+
+- URL serialization of key dashboard states
+- Dark mode
+- Improved error handling
+- Fragmented rendering
+
+
+#### Try it out!
+
+We hope you enjoy the new features and improvements in TruLens 1.1.0! To get started, use [`run_dashboard`][trulens.dashboard.run.run_dashboard] with a TruSession object:
+
+
+```python
+from trulens.core import TruSession
+from trulens.dashboard import run_dashboard
+
+session = TruSession(...)
+run_dashboard(session)
+```
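
The record-level comparison in the post above mentions a diff or variance score, chosen by the number of apps compared. The post does not give the formula, so the following is only a minimal sketch under an assumption: absolute difference for two app versions and population variance for more. The function name `disagreement_score` and the sample scores are hypothetical and do not come from the TruLens dashboard code.

```python
from statistics import pvariance


def disagreement_score(scores: list[float]) -> float:
    """Score how much app versions disagree on one record (assumed formula).

    Two versions: absolute difference. Three or more: population variance.
    """
    if len(scores) < 2:
        raise ValueError("need scores from at least two app versions")
    if len(scores) == 2:
        return abs(scores[0] - scores[1])
    return pvariance(scores)


# Hypothetical feedback scores per record, one entry per app version.
records = {
    "record_a": [0.9, 0.9, 0.8],
    "record_b": [0.1, 0.9, 0.5],
    "record_c": [0.7, 0.7, 0.7],
}

# Records with the largest disagreement come first.
ranked = sorted(
    records, key=lambda name: disagreement_score(records[name]), reverse=True
)
print(ranked)
```

Sorting by such a score is one way to surface the records where versions behave most differently, which is the stated purpose of the view.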
diff --git a/.../evaluation/feedback_aggregation/index.md → ...guides/evaluation/feedback_aggregation.md b/.../evaluation/feedback_aggregation/index.md → ...guides/evaluation/feedback_aggregation.md
diff --git a/.../evaluation/feedback_functions/anatomy.md → ...ent_guides/evaluation/feedback_anatomy.md b/.../evaluation/feedback_functions/anatomy.md → ...ent_guides/evaluation/feedback_anatomy.md
@@ -46,7 +46,7 @@ That is,
 is a plain python method that accepts the prompt and context, both strings, and
 produces a float (assumed to be between 0.0 and 1.0).
 
-Read more about [feedback implementations](../feedback_implementations/index.md)
+Read more about [feedback implementations](./feedback_implementations/index.md)
 
 ## Feedback constructor
 
@@ -70,8 +70,8 @@ states that the first two argument to
 respectively.
 
 Read more about [argument
-specification](../feedback_selectors/selecting_components.md) and [selector
-shortcuts](../feedback_selectors/selector_shortcuts.md).
+specification](./feedback_selectors/selecting_components.md) and [selector
+shortcuts](./feedback_selectors/selector_shortcuts.md).
 
 ## Aggregation specification
 
@@ -85,4 +85,4 @@ next section. This function is called on the `float` results of feedback
 function evaluations to produce a single float. The default is
 [numpy.mean][numpy.mean].
 
-Read more about [feedback aggregation](../feedback_aggregation/index.md).
+Read more about [feedback aggregation](feedback_aggregation.md).
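
The context lines in this file describe a feedback implementation as a plain Python method that takes the prompt and context as strings and returns a float between 0.0 and 1.0, with the per-call results then aggregated into a single float. The sketch below illustrates that shape only. `keyword_overlap` is a hypothetical toy metric, not a TruLens feedback function, and the standard-library `statistics.fmean` stands in for the documented default `numpy.mean` so the example needs no third-party packages.

```python
from statistics import fmean


def keyword_overlap(prompt: str, context: str) -> float:
    """Hypothetical feedback implementation: the fraction of prompt words
    that also appear in the context, always a float in [0.0, 1.0]."""
    prompt_words = set(prompt.lower().split())
    if not prompt_words:
        return 0.0
    context_words = set(context.lower().split())
    return len(prompt_words & context_words) / len(prompt_words)


prompt = "who wrote hamlet"
contexts = [
    "shakespeare wrote hamlet around 1600",
    "the globe theatre opened in 1599",
]

# One float per (prompt, context) pair, then a single aggregated float.
scores = [keyword_overlap(prompt, c) for c in contexts]
aggregate = fmean(scores)
print(scores, aggregate)
```

Any callable that reduces a sequence of floats to one float could take the place of `fmean` here, for example `min` or `max`.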
diff --git a/...entations/custom_feedback_functions.ipynb → ...entations/custom_feedback_functions.ipynb b/...entations/custom_feedback_functions.ipynb → ...entations/custom_feedback_functions.ipynb
diff --git a/...luation/feedback_implementations/index.md → ...luation/feedback_implementations/index.md b/...luation/feedback_implementations/index.md → ...luation/feedback_implementations/index.md
@@ -1,6 +1,6 @@
 # Feedback Implementations
 
-TruLens constructs feedback functions by a [**_feedback provider_**][trulens.core.feedback.Provider], and [**_feedback implementation_**](../feedback_implementations/index.md).
+TruLens constructs feedback functions by combining a [**_feedback provider_**][trulens.core.feedback.Provider] and a **_feedback implementation_**.
 
 This page documents the feedback implementations available in _TruLens_.
 

diff --git a/docs/component_guides/evaluation/feedback_implementations/stock.md b/docs/component_guides/evaluation/feedback_implementations/stock.md
@@ -0,0 +1,25 @@
+# Stock Feedback Functions
+
+## Classification-based
+
+### 🤗 Huggingface
+
+API Reference: [Huggingface][trulens.providers.huggingface.Huggingface].
+
+### OpenAI
+
+API Reference: [OpenAI][trulens.providers.openai.OpenAI].
+
+## Generation-based: LLMProvider
+
+API Reference: [LLMProvider][trulens.feedback.LLMProvider].
+
+## Embedding-based
+
+API Reference: [Embeddings][trulens.feedback.embeddings].
+
+## Combinations
+
+### Ground Truth Agreement
+
+API Reference: [GroundTruthAgreement][trulens.feedback.groundtruth]
diff --git a/...ns/evaluation/feedback_providers/index.md → ...t_guides/evaluation/feedback_providers.md b/...ns/evaluation/feedback_providers/index.md → ...t_guides/evaluation/feedback_providers.md
@@ -2,7 +2,7 @@
 
 TruLens constructs feedback functions by combining more general models, known as
 the [**_feedback provider_**][trulens.core.feedback.Provider], and
-[**_feedback implementation_**](../feedback_implementations/index.md) made up of
+[**_feedback implementation_**](./feedback_implementations/index.md) made up of
 carefully constructed prompts and custom logic tailored to perform a particular
 evaluation task.