Merge branch 'main' into cache_regexfsm_init
RobinPicard authored Jan 10, 2024
2 parents aeef8cf + 1916392 commit 542d30b
Showing 24 changed files with 757 additions and 259 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -3,3 +3,4 @@ __pycache__
*_version.py
docs/build
.coverage
.idea/
2 changes: 1 addition & 1 deletion docs/cookbook/index.md
@@ -1,6 +1,6 @@
# Examples

- [Classification](classification): Classify customer requests.
- [Classification](classification.md): Classify customer requests.
- [Named Entity Extraction](extraction.md): Extract information from pizza orders.
- [Dating Profile](dating_profiles.md): Build dating profiles from descriptions using prompt templating and JSON-guided generation.
- [Chain Of Density](chain_of_density.md): Summarize documents using chain of density prompting and JSON-guided generation.
28 changes: 18 additions & 10 deletions docs/cookbook/models_playing_chess.md
@@ -1,30 +1,35 @@
# Large language models playing chess

In this example we will make a quantized version of Mistral-7B play chess against itself. On its own the model easily generates invalid moves, so we will give it a little help. At each step we will generate a regex that matches only valid moves, and use it to ensure the model generates only valid moves.
In this example we will make a Phi-2 model play chess against itself. On its own the model easily generates invalid moves, so we will give it a little help. At each step we will generate a regex that matches only valid moves, and use it to ensure the model generates only valid moves.

## The chessboard

The game will be played on a standard chessboard. We will use the `chess` [library](https://github.com/niklasf/python-chess) to track the opponents' moves and to check that the moves are valid.

```python
%pip install outlines -q
%pip install chess -q
%pip install transformers accelerate einops -q

import chess

board = chess.Board("rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1")
```

## The opponents

Mistral-7B quantized will be playing against itself:
Phi-2 will be playing against itself:

```python
from outlines import models

board_state = models.transformers("TheBloke/Mistral-7B-OpenOrca-AWQ", device="cuda")
model = models.transformers("microsoft/phi-2")

```

## A little help for the language model

To make sure Mistral-7B generates valid chess moves we will use Outlines' regex-guided generation. We define a function that takes the current state of the board and returns a regex that matches all possible legal moves:
To make sure Phi-2 generates valid chess moves we will use Outlines' regex-guided generation. We define a function that takes the current state of the board and returns a regex that matches all possible legal moves:

```python
import re

# ... (the body of `legal_moves_regex` is collapsed in the diff view: @@ -44,22 +49,22 @@)
```
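The definition of `legal_moves_regex` is collapsed above. A minimal sketch of such a function, assuming it enumerates the board's legal moves in Standard Algebraic Notation (SAN) via `python-chess` and joins them into an escaped alternation (the committed implementation may differ):

```python
import re

import chess


def legal_moves_regex(board: chess.Board) -> str:
    # List every legal move in SAN, escape regex metacharacters
    # (e.g. "+" in checks), and join them into a single alternation
    # that matches exactly one legal move.
    legal_moves = [board.san(move) for move in board.legal_moves]
    return "|".join(re.escape(move) for move in legal_moves)
```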
The prompt corresponds to the current state of the board, so we start with:

```python
prompt = "Score: 1-0 WhiteElo: 1600 BlackElo: 1600 Timecontrol: 1800+0 Moves: 1."
prompt = "Let's play Chess. Moves: "

```

We update the prompt at each step so it reflects the state of the board after the previous move.

## Let's play!

## Let's play

```python
from outlines import generate


board_state = " "
turn_number = 0
while not board.is_game_over():
    regex_pattern = legal_moves_regex(board)
    guided = generate.regex(model, regex_pattern)(board_state)
    guided = generate.regex(model, regex_pattern)(prompt + board_state)
    move = board.parse_san(guided)

    if turn_number % 2 == 0:  # It's White's turn
    # ... (rest of the loop body collapsed in the diff view: @@ -74,7 +79,10 @@)
print(board_state)
```
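The rest of the loop body is collapsed above. A sketch of what the remaining bookkeeping might look like, assuming the parsed move is simply pushed onto the board and appended to the running move list (hypothetical, not the committed code):

```python
# Hypothetical completion of the collapsed loop body; these lines
# would sit inside the `while not board.is_game_over()` loop.
board.push(move)  # apply the parsed move to the board
if turn_number % 2 == 0:  # White's move opens a numbered pair
    board_state += f"{turn_number // 2 + 1}. {guided} "
else:
    board_state += f"{guided} "
turn_number += 1
```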

It turns out Mistral-7B (quantized) is not very good at playing chess: the game systematically ends because of the threefold repetition rule.
Interestingly enough, Phi-2 hates capturing.

```pgn
e4 e5 1.Nf3 Ne7 3.b4 Nf5 5.Nc3 Ne7 7.Bb5 a6 9.Na4 b6 11.c3 Nec6 13.c4 a5 15.d4 Qg5 17.Nd2 Bb7 19.dxe5
```

*This example was originally authored by [@903124S](https://x.com/903124S) in [this gist](https://gist.github.com/903124/cfbefa24da95e2316e0d5e8ef8ed360d).*
15 changes: 15 additions & 0 deletions docs/reference/models/llamacpp.md
@@ -0,0 +1,15 @@
# Llama.cpp

!!! Installation

You need to install the `llama-cpp-python` library to be able to use these models in Outlines.

Outlines provides an integration with [Llama.cpp](https://github.com/ggerganov/llama.cpp) using the [llama-cpp-python library](https://github.com/abetlen/llama-cpp-python). Llama.cpp makes it possible to run quantized models on machines with limited compute.

Assuming [Phi-2's weights](https://huggingface.co/TheBloke/phi-2-GGUF) are in the current directory:

```python
from outlines import models, generate

model = models.llamacpp("./phi-2.Q4_K_M.gguf", device="cpu")
```
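Once loaded, the model can be used like any other Outlines model. A minimal sketch, assuming the weights file above and Outlines' standard `generate.text` API:

```python
from outlines import models, generate

model = models.llamacpp("./phi-2.Q4_K_M.gguf", device="cpu")
generator = generate.text(model)

# Sample a short completion from the quantized model.
answer = generator("Q: What is the capital of France?\nA:", max_tokens=16)
print(answer)
```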
docs/reference/models/openai.md (renamed from docs/reference/openai_text_generation.md)
@@ -1,5 +1,9 @@
# Generate text with the OpenAI API

!!! Installation

You need to install the `openai` and `tiktoken` libraries to be able to use the OpenAI API in Outlines.

Outlines supports models available via the OpenAI Chat API, e.g. ChatGPT and GPT-4. The following models can be used with Outlines:

```python
# ... (model list collapsed in the diff view: @@ -12,6 +16,7 @@)
print(type(model))
# OpenAI
```
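The list of supported model names is collapsed above. A sketch of what the call looks like, assuming a standard OpenAI model identifier:

```python
from outlines import models

# "gpt-4" is an illustrative model name; any chat model exposed by the
# OpenAI API should be usable the same way.
model = models.openai("gpt-4")
print(type(model))
# OpenAI
```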


It is possible to pass a system message to the model when initializing it:

```python
# ... (example collapsed in the diff view)
```
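The example itself is collapsed above. A sketch under the assumption that the constructor accepts a `system_prompt` keyword (hypothetical; check the current Outlines API for the exact parameter name):

```python
from outlines import models

# `system_prompt` is an assumed parameter name, not confirmed by this diff.
model = models.openai("gpt-4", system_prompt="You are a helpful assistant.")
```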
4 changes: 1 addition & 3 deletions docs/reference/vllm.md
@@ -49,9 +49,7 @@ curl http://127.0.0.1:8000/generate \

Instead of `curl`, you can also use the [requests][requests]{:target="_blank"} library from another Python program.
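
For example (a sketch, assuming the server is listening on 127.0.0.1:8000 and that the JSON body mirrors the collapsed curl example above):

```python
import requests

# The "prompt" key mirrors the curl example; constraint fields such as a
# JSON schema or regex would go in the same request body.
response = requests.post(
    "http://127.0.0.1:8000/generate",
    json={"prompt": "What is the capital of France?"},
)
print(response.json())
```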

Please consult the [vLLM documentation][vllm]{:target="_blank"} for details on additional request parameters.

You can also [read the code](https://github.com/outlines-dev/outlines/blob/main/outlines/serve/serve.py) in case you need to customize the solution to your needs.
Please consult the [vLLM documentation][vllm]{:target="_blank"} for details on additional request parameters. You can also [read the code](https://github.com/outlines-dev/outlines/blob/main/outlines/serve/serve.py) in case you need to customize the solution to your needs.

[requests]: https://requests.readthedocs.io/en/latest/
[vllm]: https://docs.vllm.ai/en/latest/index.html
10 changes: 1 addition & 9 deletions examples/cfg.py
@@ -1,5 +1,3 @@
from lark.exceptions import UnexpectedCharacters, UnexpectedToken

import outlines.generate as generate
import outlines.models as models

@@ -49,19 +47,13 @@

model = models.transformers("hf-internal-testing/tiny-random-gpt2")
batch_size = 10
max_tokens = 30
for grammar in [nlamb_grammar, calc_grammar]:
    generator = generate.cfg(model, grammar)
    sequences = generator([" "] * batch_size, max_tokens=max_tokens)
    sequences = generator([" "] * batch_size)
    for seq in sequences:
        try:
            parse = generator.fsm.parser.parse(seq)
            assert parse is not None
            print("SUCCESS", seq)
        except (UnexpectedCharacters, UnexpectedToken):
            if generator.fsm.num_tokens_generated == max_tokens:
                print("MAXTOKEN", seq)
            else:
                print("FAILURE", seq)
        except Exception:
            print("FAILURE", seq)
46 changes: 46 additions & 0 deletions examples/llamacpp_example.py
@@ -0,0 +1,46 @@
from enum import Enum

import torch
from pydantic import BaseModel, constr

import outlines


class Weapon(str, Enum):
    sword = "sword"
    axe = "axe"
    mace = "mace"
    spear = "spear"
    bow = "bow"
    crossbow = "crossbow"


class Armor(str, Enum):
    leather = "leather"
    chainmail = "chainmail"
    plate = "plate"


class Character(BaseModel):
    name: constr(max_length=10)
    age: int
    armor: Armor
    weapon: Weapon
    strength: int


if __name__ == "__main__":
    # Download model from https://huggingface.co/TheBloke/phi-2-GGUF
    model = outlines.models.llamacpp("./phi-2.Q3_K_M.gguf", device="cpu")

    # Construct guided sequence generator
    generator = outlines.generate.json(model, Character, max_tokens=512)

    # Draw a sample
    rng = torch.Generator(device="cpu")
    rng.manual_seed(789005)

    prompt = "Instruct: You are a leading role play gamer. You have seen thousands of different characters and their attributes.\nPlease return a JSON object with common attributes of an RPG character. Give me a character description\nOutput:"

    sequence = generator(prompt, rng=rng)
    print(sequence)
3 changes: 2 additions & 1 deletion mkdocs.yml
@@ -126,7 +126,8 @@ nav:
- Prompt templating: reference/prompting.md
- Outlines functions: reference/functions.md
- Models:
- OpenAI: reference/openai_text_generation.md
- OpenAI: reference/models/openai.md
- Llama.cpp: reference/models/llamacpp.md

- API Reference:
- api/index.md
1 change: 1 addition & 0 deletions outlines/__init__.py
@@ -11,6 +11,7 @@
"clear_cache",
"disable_cache",
"get_cache",
"Function",
"prompt",
"vectorize",
]