Remove the AutoAWQ and AutoGPTQ integrations
These were made obsolete by their integration into `transformers`.
rlouf committed Jan 11, 2024
1 parent 076bd98 commit 4a76ea2
Showing 9 changed files with 8 additions and 100 deletions.
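
For readers tracking the migration, here is a minimal sketch of what this commit means for user code. The AWQ checkpoint name is the one used in the cookbook diff below; the GPTQ name is purely illustrative:

```python
import outlines

# Before this commit, quantized checkpoints went through dedicated wrappers:
#   model = outlines.models.awq("TheBloke/Mistral-7B-OpenOrca-AWQ")
#   model = outlines.models.gptq("TheBloke/Mistral-7B-OpenOrca-GPTQ")  # illustrative name

# After this commit, `transformers` detects and loads quantized weights
# itself, so the generic loader covers both formats:
model = outlines.models.transformers("TheBloke/Mistral-7B-OpenOrca-AWQ")
```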
2 changes: 1 addition & 1 deletion README.md
@@ -24,7 +24,7 @@ First time here? Go to our [setup guide](https://outlines-dev.github.io/outlines

## Features

- - [x] 🤖 [Multiple model integrations](https://outlines-dev.github.io/outlines/installation): OpenAI, transformers, AutoGPTQ, AutoAWQ
+ - [x] 🤖 [Multiple model integrations](https://outlines-dev.github.io/outlines/installation): OpenAI, transformers, llama.cpp, exllama2, mamba
- [x] 🖍️ Simple and powerful prompting primitives based on the [Jinja templating engine](https://jinja.palletsprojects.com/)
- [x] 🚄 [Multiple choices](#multiple-choices), [type constraints](#type-constraint) and dynamic stopping
- [x] ⚡ Fast [regex-guided generation](#efficient-regex-guided-generation)
4 changes: 2 additions & 2 deletions docs/blog/posts/roadmap-2024.md
@@ -18,7 +18,7 @@ Before delving into [the detailed roadmap](#detailed-roadmap), let me share a fe

*Outlines currently differentiates itself* from other libraries with its efficient JSON- and regex-constrained generation. A user-facing interface for grammar-guided generation (it had been hidden in the repository) was also recently added. But there is much more we can do along these lines. In 2024 we will keep pushing in the direction of more accurate, faster constrained generation.

- Outlines also supports many model providers: `transformers`, `autoawq`, `autogptq`, `mamba`, `llama.cpp` and `exllama2`. Those *integrations represent a lot of maintenance*, and we will need to simplify them. For instance, `transformers` now supports quantized models, and we will soon deprecate the support for `autoawq` and `autogptq`.
+ Outlines also supports many model providers: `transformers`, `mamba`, `llama.cpp` and `exllama2`. Those *integrations represent a lot of maintenance*, and we will need to simplify them. For instance, `transformers` now supports quantized models, and we will soon deprecate the support for `autoawq` and `autogptq`.
Thanks to a refactor of the library, it is now possible to use our constrained generation method with all other libraries, except `mamba`, by passing a logits processor. We will look for libraries that provide state-space models and allow passing a logits processor during inference. We will interface with `llama.cpp` and `exllama2` using logits processors.

*We would like to expand our work to the whole sampling layer*, and add new sampling methods that should make guided generation more accurate. This means we will keep the `transformers` integration as it is today and will expand our text generation logic around this library.
@@ -44,7 +44,7 @@ Let's be honest, Outlines is lacking clear and thorough examples. We want to cha

We want to keep the current integrations but lower the maintenance cost so we can focus on what we bring to the table.

- * Deprecate every obsolete integration: `transformers` has recently integrated `autoawq` and `autogptq`, for instance.
+ - [x] Deprecate every obsolete integration: `transformers` has recently integrated `autoawq` and `autogptq`, for instance. ([PR]())
* Integrate via logits processors as much as we can (see the sketch after this list):
    * See if we can integrate with a library that provides state-space models via a logit-processing function;
    * Integrate with llama.cpp via a logits processor;
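
For illustration, a minimal, library-agnostic sketch of the logits-processor pattern these items rely on. The class name and the `allowed_token_ids` argument are invented for the example and are not part of the Outlines API:

```python
import torch

class AllowedTokensLogitsProcessor:
    """Mask every token that is not currently allowed by the constraint.

    Any inference engine that accepts a callable of this shape can host
    constrained generation, which is why integrating via logits processors
    keeps the maintenance cost low.
    """

    def __init__(self, allowed_token_ids: list[int]):
        self.allowed_token_ids = allowed_token_ids

    def __call__(self, input_ids: torch.Tensor, logits: torch.Tensor) -> torch.Tensor:
        # Start from -inf everywhere, then re-enable the allowed tokens.
        mask = torch.full_like(logits, float("-inf"))
        mask[..., self.allowed_token_ids] = 0.0
        return logits + mask
```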
2 changes: 1 addition & 1 deletion docs/cookbook/chain_of_density.md
@@ -86,7 +86,7 @@ class Summaries(BaseModel):
We now generate the prompt by passing the article we want to summarize to the template. We load a quantized version of Mistral-7B (an AWQ checkpoint, now loaded through `transformers`), and then use JSON-guided generation to generate the summaries:

```python
- model = outlines.models.awq("TheBloke/Mistral-7B-OpenOrca-AWQ")
+ model = outlines.models.transformers("TheBloke/Mistral-7B-OpenOrca-AWQ")

prompt = chain_of_density(article)
result = outlines.generate.json(model, Summaries)(prompt)
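
If the quantized checkpoint needs to sit on a GPU, the same loader takes a device and forwards extra keyword arguments to `from_pretrained`. A sketch, assuming the `device` and `model_kwargs` parameters of `outlines.models.transformers` at the time of this commit (check the installed version's signature):

```python
import outlines

# Sketch under assumptions: `device` and `model_kwargs` are forwarded to
# transformers' AutoModelForCausalLM.from_pretrained; verify against the
# installed version before relying on them.
model = outlines.models.transformers(
    "TheBloke/Mistral-7B-OpenOrca-AWQ",
    device="cuda",
    model_kwargs={"low_cpu_mem_usage": True},
)
```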
24 changes: 3 additions & 21 deletions docs/installation.md
@@ -10,34 +10,16 @@ You can install Outlines with `pip`:
pip install outlines
```

- Outlines supports OpenAI, transformers, Mamba, AutoGPTQ and AutoAWQ, but **you will need to install them manually**:
+ Outlines supports OpenAI, transformers, Mamba, llama.cpp and exllama2, but **you will need to install them manually**:

```bash
pip install openai
pip install transformers datasets accelerate
- pip install autoawq
- pip install auto-gptq
pip install llama-cpp-python
pip install mamba_ssm
```

- If you encounter any problem using Outlines with these libraries, take a look at their installation instructions. The installation of `openai` and `transformers` should be straightforward, but other libraries have specific hardware requirements. We summarize them below:
-
- ### AutoGPTQ
-
- - `pip install auto-gptq` works with CUDA 12.1
- - For CUDA 11.8, see the [documentation](https://github.com/PanQiWei/AutoGPTQ?tab=readme-ov-file#installation)
- - `pip install auto-gptq[triton]` to use the Triton backend
-
- Still encounter an issue? See the [documentation](https://github.com/PanQiWei/AutoGPTQ?tab=readme-ov-file#installation) for up-to-date information.
-
- ### AutoAWQ
-
- - Your GPU(s) must be of Compute Capability 7.5 or later; Turing and later architectures are supported.
- - Your CUDA version must be 11.8 or later.
-
- Still encounter an issue? See the [documentation](https://github.com/casper-hansen/AutoAWQ?tab=readme-ov-file#install) for up-to-date information.
+ If you encounter any problem using Outlines with these libraries, take a look at their installation instructions. The installation of `openai` and `transformers` should be straightforward, but other libraries have specific hardware requirements.

## Installing for development

2 changes: 1 addition & 1 deletion docs/welcome.md
@@ -6,7 +6,7 @@ Outlines〰 is a Python library that allows you to use Large Language Model in a

## What models do you support?

- We support OpenAI, but the true power of Outlines〰 is unleashed with open-source models available via the Transformers, AutoAWQ and AutoGPTQ libraries. If you want to build and maintain an integration with another library, [get in touch][discord].
+ We support OpenAI, but the true power of Outlines〰 is unleashed with open-source models available via the Transformers, llama.cpp, exllama2 and mamba_ssm libraries. If you want to build and maintain an integration with another library, [get in touch][discord].

## What are the main features?

2 changes: 0 additions & 2 deletions outlines/models/__init__.py
@@ -5,10 +5,8 @@
codebase.
"""
- from .awq import awq
from .azure import AzureOpenAI, azure_openai
from .exllamav2 import exl2
- from .gptq import gptq
from .llamacpp import LlamaCpp, llamacpp
from .mamba import Mamba, mamba
from .openai import OpenAI, openai
45 changes: 0 additions & 45 deletions outlines/models/awq.py

This file was deleted.

25 changes: 0 additions & 25 deletions outlines/models/gptq.py

This file was deleted.

2 changes: 0 additions & 2 deletions pyproject.toml
@@ -96,8 +96,6 @@ exclude=["examples"]

[[tool.mypy.overrides]]
module = [
"awq.*",
"auto_gptq.*",
"exllamav2.*",
"jinja2",
"joblib.*",
