Remove the AutoAWQ and AutoGPTQ integrations
These were made obsolete by their integration into `transformers`.
rlouf committed Jan 11, 2024
1 parent 076bd98 commit 4a76ea2
Showing 9 changed files with 8 additions and 100 deletions.
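
For readers tracking the migration, here is a minimal sketch of what this commit means for user code. The AWQ checkpoint name is the one used in the cookbook diff below; the GPTQ name is purely illustrative:

```python
import outlines

# Before this commit, quantized checkpoints went through dedicated wrappers:
#   model = outlines.models.awq("TheBloke/Mistral-7B-OpenOrca-AWQ")
#   model = outlines.models.gptq("TheBloke/Mistral-7B-OpenOrca-GPTQ")  # illustrative name

# After this commit, `transformers` detects and loads quantized weights
# itself, so the generic loader covers both formats:
model = outlines.models.transformers("TheBloke/Mistral-7B-OpenOrca-AWQ")
```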
2 changes: 1 addition & 1 deletion README.md
@@ -24,7 +24,7 @@ First time here? Go to our [setup guide](https://outlines-dev.github.io/outlines

## Features

- - [x] 🤖 [Multiple model integrations](https://outlines-dev.github.io/outlines/installation): OpenAI, transformers, AutoGPTQ, AutoAWQ
+ - [x] 🤖 [Multiple model integrations](https://outlines-dev.github.io/outlines/installation): OpenAI, transformers, llama.cpp, exllama2, mamba
- [x] 🖍️ Simple and powerful prompting primitives based on the [Jinja templating engine](https://jinja.palletsprojects.com/)
- [x] 🚄 [Multiple choices](#multiple-choices), [type constraints](#type-constraint) and dynamic stopping
- [x] ⚡ Fast [regex-guided generation](#efficient-regex-guided-generation)
4 changes: 2 additions & 2 deletions docs/blog/posts/roadmap-2024.md
@@ -18,7 +18,7 @@ Before delving into [the detailed roadmap](#detailed-roadmap), let me share a fe

*Outlines currently differentiates itself* from other libraries with its efficient JSON- and regex-constrained generation. A user-facing interface for grammar-guided generation (it had been hidden in the repository) was also recently added. But there is much more we can do along these lines. In 2024 we will keep pushing in the direction of more accurate, faster constrained generation.

- Outlines also supports many model providers: `transformers`, `autoawq`, `autogptq`, `mamba`, `llama.cpp` and `exllama2`. Those *integrations represent a lot of maintenance*, and we will need to simplify them. For instance, `transformers` now supports quantized models, and we will soon deprecate the support for `autoawq` and `autogptq`.
+ Outlines also supports many model providers: `transformers`, `mamba`, `llama.cpp` and `exllama2`. Those *integrations represent a lot of maintenance*, and we will need to simplify them. For instance, `transformers` now supports quantized models, and we will soon deprecate the support for `autoawq` and `autogptq`.
Thanks to a refactor of the library, it is now possible to use our constrained generation method with all other libraries, except `mamba`, by passing a logits processor. We will look for libraries that provide state-space models and allow passing a logits processor during inference. We will interface with `llama.cpp` and `exllama2` using logits processors.

*We would like to expand our work to the whole sampling layer*, and add new sampling methods that should make guided generation more accurate. This means we will keep the `transformers` integration as it is today and will expand our text generation logic around this library.
@@ -44,7 +44,7 @@ Let's be honest, Outlines is lacking clear and thorough examples. We want to cha

We want to keep the current integrations but lower the maintenance cost so we can focus on what we bring to the table.

- * Deprecate every obsolete integration: `transformers` has recently integrated `autoawq` and `autogptq`, for instance.
+ - [x] Deprecate every obsolete integration: `transformers` has recently integrated `autoawq` and `autogptq`, for instance. ([PR]())
* Integrate via logits processors as much as we can (see the sketch after this list):
    * See if we can integrate with a library that provides state-space models via a logit-processing function;
    * Integrate with llama.cpp via a logits processor;
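
For illustration, a minimal, library-agnostic sketch of the logits-processor pattern these items rely on. The class name and the `allowed_token_ids` argument are invented for the example and are not part of the Outlines API:

```python
import torch

class AllowedTokensLogitsProcessor:
    """Mask every token that is not currently allowed by the constraint.

    Any inference engine that accepts a callable of this shape can host
    constrained generation, which is why integrating via logits processors
    keeps the maintenance cost low.
    """

    def __init__(self, allowed_token_ids: list[int]):
        self.allowed_token_ids = allowed_token_ids

    def __call__(self, input_ids: torch.Tensor, logits: torch.Tensor) -> torch.Tensor:
        # Start from -inf everywhere, then re-enable the allowed tokens.
        mask = torch.full_like(logits, float("-inf"))
        mask[..., self.allowed_token_ids] = 0.0
        return logits + mask
```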
2 changes: 1 addition & 1 deletion docs/cookbook/chain_of_density.md
@@ -86,7 +86,7 @@ class Summaries(BaseModel):
We now generate the prompt by passing the article we want to summarize to the template. We load a quantized version of Mistral-7B (an AWQ checkpoint, now loaded through `transformers`), and then use JSON-guided generation to generate the summaries:

```python
- model = outlines.models.awq("TheBloke/Mistral-7B-OpenOrca-AWQ")
+ model = outlines.models.transformers("TheBloke/Mistral-7B-OpenOrca-AWQ")

prompt = chain_of_density(article)
result = outlines.generate.json(model, Summaries)(prompt)
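
If the quantized checkpoint needs to sit on a GPU, the same loader takes a device and forwards extra keyword arguments to `from_pretrained`. A sketch, assuming the `device` and `model_kwargs` parameters of `outlines.models.transformers` at the time of this commit (check the installed version's signature):

```python
import outlines

# Sketch under assumptions: `device` and `model_kwargs` are forwarded to
# transformers' AutoModelForCausalLM.from_pretrained; verify against the
# installed version before relying on them.
model = outlines.models.transformers(
    "TheBloke/Mistral-7B-OpenOrca-AWQ",
    device="cuda",
    model_kwargs={"low_cpu_mem_usage": True},
)
```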
24 changes: 3 additions & 21 deletions docs/installation.md
@@ -10,34 +10,16 @@ You can install Outlines with `pip`:
pip install outlines
```

- Outlines supports OpenAI, transformers, Mamba, AutoGPTQ and AutoAWQ, but **you will need to install them manually**:
+ Outlines supports OpenAI, transformers, Mamba, llama.cpp and exllama2, but **you will need to install them manually**:

```bash
pip install openai
pip install transformers datasets accelerate
- pip install autoawq
- pip install auto-gptq
pip install llama-cpp-python
pip install mamba_ssm
```

- If you encounter any problem using Outlines with these libraries, take a look at their installation instructions. The installation of `openai` and `transformers` should be straightforward, but other libraries have specific hardware requirements. We summarize them below:
-
- ### AutoGPTQ
-
- - `pip install auto-gptq` works with CUDA 12.1
- - For CUDA 11.8, see the [documentation](https://github.com/PanQiWei/AutoGPTQ?tab=readme-ov-file#installation)
- - `pip install auto-gptq[triton]` to use the Triton backend
-
- Still encounter an issue? See the [documentation](https://github.com/PanQiWei/AutoGPTQ?tab=readme-ov-file#installation) for up-to-date information.
-
- ### AutoAWQ
-
- - Your GPU(s) must be of Compute Capability 7.5 or later; Turing and later architectures are supported.
- - Your CUDA version must be 11.8 or later.
-
- Still encounter an issue? See the [documentation](https://github.com/casper-hansen/AutoAWQ?tab=readme-ov-file#install) for up-to-date information.
+ If you encounter any problem using Outlines with these libraries, take a look at their installation instructions. The installation of `openai` and `transformers` should be straightforward, but other libraries have specific hardware requirements.

## Installing for development

2 changes: 1 addition & 1 deletion docs/welcome.md
@@ -6,7 +6,7 @@ Outlines〰 is a Python library that allows you to use Large Language Model in a

## What models do you support?

- We support OpenAI, but the true power of Outlines〰 is unleashed with open-source models available via the Transformers, AutoAWQ and AutoGPTQ libraries. If you want to build and maintain an integration with another library, [get in touch][discord].
+ We support OpenAI, but the true power of Outlines〰 is unleashed with open-source models available via the Transformers, llama.cpp, exllama2 and mamba_ssm libraries. If you want to build and maintain an integration with another library, [get in touch][discord].

## What are the main features?

2 changes: 0 additions & 2 deletions outlines/models/__init__.py
@@ -5,10 +5,8 @@
codebase.
"""
- from .awq import awq
from .azure import AzureOpenAI, azure_openai
from .exllamav2 import exl2
- from .gptq import gptq
from .llamacpp import LlamaCpp, llamacpp
from .mamba import Mamba, mamba
from .openai import OpenAI, openai
45 changes: 0 additions & 45 deletions outlines/models/awq.py

This file was deleted.

25 changes: 0 additions & 25 deletions outlines/models/gptq.py

This file was deleted.

2 changes: 0 additions & 2 deletions pyproject.toml
@@ -96,8 +96,6 @@ exclude=["examples"]

[[tool.mypy.overrides]]
module = [
"awq.*",
"auto_gptq.*",
"exllamav2.*",
"jinja2",
"joblib.*",
