Merge branch 'main' into cache_regexfsm_init
RobinPicard authored Jan 10, 2024
2 parents aeef8cf + 1916392 commit 542d30b
Showing 24 changed files with 757 additions and 259 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -3,3 +3,4 @@ __pycache__
*_version.py
docs/build
.coverage
.idea/
2 changes: 1 addition & 1 deletion docs/cookbook/index.md
@@ -1,6 +1,6 @@
# Examples

- [Classification](classification): Classify customer requests.
- [Classification](classification.md): Classify customer requests.
- [Named Entity Extraction](extraction.md): Extract information from pizza orders.
- [Dating Profile](dating_profiles.md): Build dating profiles from descriptions using prompt templating and JSON-guided generation.
- [Chain Of Density](chain_of_density.md): Summarize documents using chain of density prompting and JSON-guided generation.
28 changes: 18 additions & 10 deletions docs/cookbook/models_playing_chess.md
@@ -1,30 +1,35 @@
# Large language models playing chess

In this example we will make a quantized version of Mistral-7B play chess against itself. On its own the model easily generates invalid moves, so we will give it a little help. At each step we will generate a regex that matches only valid moves, and use it to ensure the model generates only valid moves.
In this example we will make a Phi-2 model play chess against itself. On its own the model easily generates invalid moves, so we will give it a little help. At each step we will generate a regex that matches only valid moves, and use it to ensure the model generates only valid moves.

## The chessboard

The game will be played on a standard chessboard. We will use the `chess` [library](https://github.com/niklasf/python-chess) to track the opponents' moves and to check that the moves are valid.

```python
%pip install outlines -q
%pip install chess -q
%pip install transformers accelerate einops -q

import chess

board = chess.Board("rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1")
```

## The opponents

Mistral-7B quantized will be playing against itself:
Phi-2 will be playing against itself:

```python
from outlines import models

board_state = models.transformers("TheBloke/Mistral-7B-OpenOrca-AWQ", device="cuda")
model = models.transformers("microsoft/phi-2")

```

## A little help for the language model

To make sure Mistral-7B generates valid chess moves we will use Outlines' regex-guided generation. We define a function that takes the current state of the board and returns a regex that matches all possible legal moves:
To make sure Phi-2 generates valid chess moves we will use Outlines' regex-guided generation. We define a function that takes the current state of the board and returns a regex that matches all possible legal moves:

```python
import re

# ... (the body of `legal_moves_regex` is collapsed in the diff view: @@ -44,22 +49,22 @@)
```
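The definition of `legal_moves_regex` is collapsed above. A minimal sketch of such a function, assuming it enumerates the board's legal moves in Standard Algebraic Notation (SAN) via `python-chess` and joins them into an escaped alternation (the committed implementation may differ):

```python
import re

import chess


def legal_moves_regex(board: chess.Board) -> str:
    # List every legal move in SAN, escape regex metacharacters
    # (e.g. "+" in checks), and join them into a single alternation
    # that matches exactly one legal move.
    legal_moves = [board.san(move) for move in board.legal_moves]
    return "|".join(re.escape(move) for move in legal_moves)
```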
The prompt corresponds to the current state of the board, so we start with:

```python
prompt = "Score: 1-0 WhiteElo: 1600 BlackElo: 1600 Timecontrol: 1800+0 Moves: 1."
prompt = "Let's play Chess. Moves: "

```

We update the prompt at each step so it reflects the state of the board after the previous move.

## Let's play!

## Let's play

```python
from outlines import generate


board_state = " "
turn_number = 0
while not board.is_game_over():
    regex_pattern = legal_moves_regex(board)
    guided = generate.regex(model, regex_pattern)(board_state)
    guided = generate.regex(model, regex_pattern)(prompt + board_state)
    move = board.parse_san(guided)

    if turn_number % 2 == 0:  # It's White's turn
    # ... (rest of the loop body collapsed in the diff view: @@ -74,7 +79,10 @@)
print(board_state)
```
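The rest of the loop body is collapsed above. A sketch of what the remaining bookkeeping might look like, assuming the parsed move is simply pushed onto the board and appended to the running move list (hypothetical, not the committed code):

```python
# Hypothetical completion of the collapsed loop body; these lines
# would sit inside the `while not board.is_game_over()` loop.
board.push(move)  # apply the parsed move to the board
if turn_number % 2 == 0:  # White's move opens a numbered pair
    board_state += f"{turn_number // 2 + 1}. {guided} "
else:
    board_state += f"{guided} "
turn_number += 1
```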

It turns out Mistral-7B (quantized) is not very good at playing chess: the game systematically ends because of the threefold repetition rule.
Interestingly enough, Phi-2 hates capturing.

```pgn
e4 e5 1.Nf3 Ne7 3.b4 Nf5 5.Nc3 Ne7 7.Bb5 a6 9.Na4 b6 11.c3 Nec6 13.c4 a5 15.d4 Qg5 17.Nd2 Bb7 19.dxe5
```

*This example was originally authored by [@903124S](https://x.com/903124S) in [this gist](https://gist.github.com/903124/cfbefa24da95e2316e0d5e8ef8ed360d).*
15 changes: 15 additions & 0 deletions docs/reference/models/llamacpp.md
@@ -0,0 +1,15 @@
# Llama.cpp

!!! Installation

You need to install the `llama-cpp-python` library to be able to use these models in Outlines.

Outlines provides an integration with [Llama.cpp](https://github.com/ggerganov/llama.cpp) using the [llama-cpp-python library](https://github.com/abetlen/llama-cpp-python). Llama.cpp makes it possible to run quantized models on machines with limited compute.

Assuming [Phi-2's weights](https://huggingface.co/TheBloke/phi-2-GGUF) are in the current directory:

```python
from outlines import models, generate

model = models.llamacpp("./phi-2.Q4_K_M.gguf", device="cpu")
```
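Once loaded, the model can be used like any other Outlines model. A minimal sketch, assuming the weights file above and Outlines' standard `generate.text` API:

```python
from outlines import models, generate

model = models.llamacpp("./phi-2.Q4_K_M.gguf", device="cpu")
generator = generate.text(model)

# Sample a short completion from the quantized model.
answer = generator("Q: What is the capital of France?\nA:", max_tokens=16)
print(answer)
```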
docs/reference/models/openai.md (renamed from docs/reference/openai_text_generation.md)
@@ -1,5 +1,9 @@
# Generate text with the OpenAI API

!!! Installation

You need to install the `openai` and `tiktoken` libraries to be able to use the OpenAI API in Outlines.

Outlines supports models available via the OpenAI Chat API, e.g. ChatGPT and GPT-4. The following models can be used with Outlines:

```python
# ... (model list collapsed in the diff view: @@ -12,6 +16,7 @@)
print(type(model))
# OpenAI
```
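The list of supported model names is collapsed above. A sketch of what the call looks like, assuming a standard OpenAI model identifier:

```python
from outlines import models

# "gpt-4" is an illustrative model name; any chat model exposed by the
# OpenAI API should be usable the same way.
model = models.openai("gpt-4")
print(type(model))
# OpenAI
```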


It is possible to pass a system message to the model when initializing it:

```python
# ... (example collapsed in the diff view)
```
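The example itself is collapsed above. A sketch under the assumption that the constructor accepts a `system_prompt` keyword (hypothetical; check the current Outlines API for the exact parameter name):

```python
from outlines import models

# `system_prompt` is an assumed parameter name, not confirmed by this diff.
model = models.openai("gpt-4", system_prompt="You are a helpful assistant.")
```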
4 changes: 1 addition & 3 deletions docs/reference/vllm.md
@@ -49,9 +49,7 @@ curl http://127.0.0.1:8000/generate \

Instead of `curl`, you can also use the [requests][requests]{:target="_blank"} library from another Python program.
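
For example (a sketch, assuming the server is listening on 127.0.0.1:8000 and that the JSON body mirrors the collapsed curl example above):

```python
import requests

# The "prompt" key mirrors the curl example; constraint fields such as a
# JSON schema or regex would go in the same request body.
response = requests.post(
    "http://127.0.0.1:8000/generate",
    json={"prompt": "What is the capital of France?"},
)
print(response.json())
```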

Please consult the [vLLM documentation][vllm]{:target="_blank"} for details on additional request parameters.

You can also [read the code](https://github.com/outlines-dev/outlines/blob/main/outlines/serve/serve.py) in case you need to customize the solution to your needs.
Please consult the [vLLM documentation][vllm]{:target="_blank"} for details on additional request parameters. You can also [read the code](https://github.com/outlines-dev/outlines/blob/main/outlines/serve/serve.py) in case you need to customize the solution to your needs.

[requests]: https://requests.readthedocs.io/en/latest/
[vllm]: https://docs.vllm.ai/en/latest/index.html
10 changes: 1 addition & 9 deletions examples/cfg.py
@@ -1,5 +1,3 @@
from lark.exceptions import UnexpectedCharacters, UnexpectedToken

import outlines.generate as generate
import outlines.models as models

@@ -49,19 +47,13 @@

model = models.transformers("hf-internal-testing/tiny-random-gpt2")
batch_size = 10
max_tokens = 30
for grammar in [nlamb_grammar, calc_grammar]:
    generator = generate.cfg(model, grammar)
    sequences = generator([" "] * batch_size, max_tokens=max_tokens)
    sequences = generator([" "] * batch_size)
    for seq in sequences:
        try:
            parse = generator.fsm.parser.parse(seq)
            assert parse is not None
            print("SUCCESS", seq)
        except (UnexpectedCharacters, UnexpectedToken):
            if generator.fsm.num_tokens_generated == max_tokens:
                print("MAXTOKEN", seq)
            else:
                print("FAILURE", seq)
        except Exception:
            print("FAILURE", seq)
46 changes: 46 additions & 0 deletions examples/llamacpp_example.py
@@ -0,0 +1,46 @@
from enum import Enum

import torch
from pydantic import BaseModel, constr

import outlines


class Weapon(str, Enum):
    sword = "sword"
    axe = "axe"
    mace = "mace"
    spear = "spear"
    bow = "bow"
    crossbow = "crossbow"


class Armor(str, Enum):
    leather = "leather"
    chainmail = "chainmail"
    plate = "plate"


class Character(BaseModel):
    name: constr(max_length=10)
    age: int
    armor: Armor
    weapon: Weapon
    strength: int


if __name__ == "__main__":
    # Download model from https://huggingface.co/TheBloke/phi-2-GGUF
    model = outlines.models.llamacpp("./phi-2.Q3_K_M.gguf", device="cpu")

    # Construct guided sequence generator
    generator = outlines.generate.json(model, Character, max_tokens=512)

    # Draw a sample
    rng = torch.Generator(device="cpu")
    rng.manual_seed(789005)

    prompt = "Instruct: You are a leading role play gamer. You have seen thousands of different characters and their attributes.\nPlease return a JSON object with common attributes of an RPG character. Give me a character description\nOutput:"

    sequence = generator(prompt, rng=rng)
    print(sequence)
3 changes: 2 additions & 1 deletion mkdocs.yml
@@ -126,7 +126,8 @@ nav:
- Prompt templating: reference/prompting.md
- Outlines functions: reference/functions.md
- Models:
- OpenAI: reference/openai_text_generation.md
- OpenAI: reference/models/openai.md
- Llama.cpp: reference/models/llamacpp.md

- API Reference:
- api/index.md
1 change: 1 addition & 0 deletions outlines/__init__.py
@@ -11,6 +11,7 @@
"clear_cache",
"disable_cache",
"get_cache",
"Function",
"prompt",
"vectorize",
]