Merge pull request #1160 from guardrails-ai/feat/llama-index
Feat/llama index
CalebCourier authored Nov 19, 2024
2 parents 93a6a36 + e0ac7a7 commit b765e4d
Showing 11 changed files with 1,141 additions and 15 deletions.
2 changes: 2 additions & 0 deletions docs/.gitignore
@@ -1,2 +1,4 @@
/.quarto/
lib64
integrations/data
integrations/storage
5 changes: 4 additions & 1 deletion docs/examples/llamaindex-output-parsing.ipynb
@@ -14,7 +14,10 @@
"id": "9c48213d-6e6a-4c10-838a-2a7c710c3a05",
"metadata": {},
"source": [
"# Guardrails Output Parsing\n"
"# Guardrails Output Parsing (Deprecated)\n",
"\n",
"## DEPRECATION NOTE\n",
"This integration between LlamaIndex and Guardrails is only valid for llama-index ~0.9.x and guardrails-ai < 0.5.x. and thus has been deprecated. For an updated example of using Guardrails with LlamaIndex with their latest versions, see the [GuardrailsEngine](/docs/integrations/llama_index)\n"
]
},
{
286 changes: 286 additions & 0 deletions docs/integrations/llama_index.ipynb
@@ -0,0 +1,286 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# LlamaIndex\n",
"\n",
"## Overview\n",
"\n",
"This is a Quick Start guide that shows how to use Guardrails alongside LlamaIndex. As you'll see, the LlamaIndex portion comes directly from their starter examples [here](https://docs.llamaindex.ai/en/stable/getting_started/starter_example/). Our approach to intergration for LlamaIndex, similar to our LangChain integration, is the make the interaction feel as native to the tool as possible."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Installation\n",
"Install LlamaIndex and a version of Guardrails with LlamaIndex support."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Found existing installation: guardrails-ai 0.6.0\n",
"Uninstalling guardrails-ai-0.6.0:\n",
" Successfully uninstalled guardrails-ai-0.6.0\n"
]
}
],
"source": [
"! pip uninstall guardrails-ai -y"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"! pip install llama-index -q\n",
"# ! pip install \"guardrails-ai>=0.6.1\"\n",
"! pip install /Users/calebcourier/Projects/guardrails -q"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Install a couple validators from the Guardrails Hub that we'll use to guard the query outputs."
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Installing hub:\u001b[35m/\u001b[0m\u001b[35m/guardrails/\u001b[0m\u001b[95mdetect_pii...\u001b[0m\n",
"✅Successfully installed guardrails/detect_pii version \u001b[1;36m0.0\u001b[0m.\u001b[1;36m5\u001b[0m!\n",
"\n",
"\n",
"Installing hub:\u001b[35m/\u001b[0m\u001b[35m/guardrails/\u001b[0m\u001b[95mcompetitor_check...\u001b[0m\n",
"✅Successfully installed guardrails/competitor_check version \u001b[1;36m0.0\u001b[0m.\u001b[1;36m1\u001b[0m!\n",
"\n",
"\n"
]
}
],
"source": [
"! guardrails hub install hub://guardrails/detect_pii --no-install-local-models -q\n",
"! guardrails hub install hub://guardrails/competitor_check --no-install-local-models -q"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Download some sample data from the LlamaIndex docs."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" % Total % Received % Xferd Average Speed Time Time Time Current\n",
" Dload Upload Total Spent Left Speed\n",
"100 75042 100 75042 0 0 959k 0 --:--:-- --:--:-- --:--:-- 964k\n"
]
}
],
"source": [
"! mkdir -p ./data\n",
"! curl https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/paul_graham/paul_graham_essay.txt > ./data/paul_graham_essay.txt"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Index Setup\n",
"\n",
"First we'll load some data and build an index as shown in the [starter tutorial](https://docs.llamaindex.ai/en/stable/getting_started/starter_example/)"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"import os.path\n",
"from llama_index.core import (\n",
" VectorStoreIndex,\n",
" SimpleDirectoryReader,\n",
" StorageContext,\n",
" load_index_from_storage,\n",
")\n",
"\n",
"# check if storage already exists\n",
"PERSIST_DIR = \"./storage\"\n",
"if not os.path.exists(PERSIST_DIR):\n",
" # load the documents and create the index\n",
" documents = SimpleDirectoryReader(\"data\").load_data()\n",
" index = VectorStoreIndex.from_documents(documents)\n",
" # store it for later\n",
" index.storage_context.persist(persist_dir=PERSIST_DIR)\n",
"else:\n",
" # load the existing index\n",
" storage_context = StorageContext.from_defaults(persist_dir=PERSIST_DIR)\n",
" index = load_index_from_storage(storage_context)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Guard Setup\n",
"\n",
"Next we'll create our Guard and assign some validators to check the output from our queries."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [],
"source": [
"from guardrails import Guard\n",
"from guardrails.hub import CompetitorCheck, DetectPII\n",
"\n",
"guard = Guard().use(\n",
" CompetitorCheck(\n",
" competitors=[\"Fortran\", \"Ada\", \"Pascal\"],\n",
" on_fail=\"fix\"\n",
" )\n",
").use(DetectPII(pii_entities=\"pii\", on_fail=\"fix\"))"
]
},
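{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a quick sanity check, we can run the guard directly on a plain string before wiring it into LlamaIndex. The next cell is a minimal sketch using `Guard.validate`; the sample sentence is made up for illustration."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# A hypothetical string validated directly; with on_fail=\"fix\", the\n",
"# validators rewrite the offending spans instead of raising an error.\n",
"outcome = guard.validate(\n",
"    \"Fortran is a great language. Reach me at jane@example.com.\"\n",
")\n",
"print(outcome.validated_output)"
]
},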
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Querying The Index\n",
"\n",
"To demonstrate it's plug-and-play capabilities, first we'll query the index un-guarded."
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The author worked on writing short stories and programming, starting with early attempts on an IBM 1401 using Fortran in 9th grade, and later transitioning to microcomputers like the TRS-80 and Apple II to write games, rocket prediction programs, and a word processor.\n"
]
}
],
"source": [
"# Use index on it's own\n",
"query_engine = index.as_query_engine()\n",
"response = query_engine.query(\"What did the author do growing up?\")\n",
"print(response)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now we'll set up a guarded engine, and re-query the index to see how Guardrails applies the fixes we specified when assigning our validators to the Guard."
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The author worked on writing short stories and programming, starting with early attempts on an IBM 1401 using [COMPETITOR] in 9th <URL>er, the author transitioned to microcomputers, building a Heathkit kit and eventually getting a TRS-80 to write simple games and <URL>spite enjoying programming, the author initially planned to study philosophy in college but eventually switched to AI due to a lack of interest in philosophy courses.\n"
]
}
],
"source": [
"# Use index with Guardrails\n",
"from guardrails.integrations.llama_index import GuardrailsQueryEngine\n",
"\n",
"guardrails_query_engine = GuardrailsQueryEngine(engine=query_engine, guard=guard)\n",
"\n",
"response = guardrails_query_engine.query(\"What did the author do growing up?\")\n",
"print(response)\n",
" "
]
},
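{
"cell_type": "markdown",
"metadata": {},
"source": [
"Every guarded call is recorded on the guard itself. The next cell is a small sketch of inspecting the most recent run via `guard.history`; attribute names follow the guardrails history docs and may differ across versions."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Inspect the most recent guarded call: its overall validation status\n",
"# and the final output after the \"fix\" actions were applied.\n",
"last_call = guard.history.last\n",
"print(last_call.status)\n",
"print(last_call.validated_output)"
]
},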
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The GuardrailsEngine can also be used with LlamaIndex's chat engine, not just the query engine."
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The author worked on writing short stories and programming while growing <URL>ey started with early attempts on an IBM 1401 using [COMPETITOR] in 9th <URL>er, they transitioned to microcomputers, building simple games and a word processor on a TRS-80 in <DATE_TIME>.\n"
]
}
],
"source": [
"# For chat engine\n",
"from guardrails.integrations.llama_index import GuardrailsChatEngine\n",
"chat_engine = index.as_chat_engine()\n",
"guardrails_chat_engine = GuardrailsChatEngine(engine=chat_engine, guard=guard)\n",
"\n",
"response = guardrails_chat_engine.chat(\"Tell me what the author did growing up.\")\n",
"print(response)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": ".venv",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.4"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
8 changes: 8 additions & 0 deletions guardrails/integrations/llama_index/__init__.py
@@ -0,0 +1,8 @@
from guardrails.integrations.llama_index.guardrails_query_engine import (
GuardrailsQueryEngine,
)
from guardrails.integrations.llama_index.guardrails_chat_engine import (
GuardrailsChatEngine,
)

__all__ = ["GuardrailsQueryEngine", "GuardrailsChatEngine"]
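With both engines re-exported at the package root, downstream code can import them in one line (a usage sketch, assuming llama-index and a guardrails-ai build with LlamaIndex support are installed):

    from guardrails.integrations.llama_index import GuardrailsChatEngine, GuardrailsQueryEngine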