Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Showing
3 changed files
with
52 additions
and
0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
# Outlines blog | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,49 @@ | ||
--- | ||
draft: true | ||
date: 2024-01-10 | ||
categories: | ||
- Roadmap | ||
--- | ||
|
||
# Roadmap for 2024 | ||
|
||
Outlines is not even one year old and it has already come a long way! As we just reached 4,000 stars, and before laying out the roadmap for the following year, we would like to pause and thank all of you for supporting us, and for using and contributing to the library! | ||
|
||
Outlines currently differentiates itself from other libraries with its efficient JSON- and regex-constrained generation. Grammar-constrained generation was also recently added. But there is much more we can do along these lines. In 2024 we will keep pushing in the direction of more accurate, faster constrained generation. | ||
|
||
Outlines also supports many model providers: `transformers`, `autoawq`, `autogptq`, `mamba`, `llama.cpp` and `exllamav2`. These integrations represent a lot of maintenance, and we will need to simplify them. For instance, it seems that `transformers` can now run quantized models directly, so we may soon deprecate support for `autoawq` and `autogptq`. Thanks to a refactor of the library, our constrained generation method can now be used through a logits processor with all the other libraries, except `mamba`. We will look for libraries that provide state-space models and allow passing a logits processor during inference. We will interface with `llama.cpp` and `exllamav2` using logits processors. | ||
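To make the logits processor idea concrete, here is a minimal, library-free sketch. It is not Outlines' implementation: the names `make_allowed_tokens_processor`, `digits_then_eos`, `fake_model` and `greedy_decode`, as well as the toy vocabulary, are assumptions made up for illustration. It only shows the general pattern: a callable receives the tokens generated so far together with the model's logits, and masks every token the constraint forbids.

```python
import math
from typing import Callable, List, Sequence, Set

# Hypothetical signature: (token ids generated so far, raw logits) -> masked logits.
LogitsProcessor = Callable[[Sequence[int], List[float]], List[float]]

VOCAB = ["0", "1", "a", "b", "<eos>"]  # toy vocabulary, for illustration only
EOS_ID = 4


def make_allowed_tokens_processor(
    allowed_next: Callable[[Sequence[int]], Set[int]],
) -> LogitsProcessor:
    """Build a processor that sets the logit of every disallowed token to -inf."""

    def processor(token_ids: Sequence[int], logits: List[float]) -> List[float]:
        allowed = allowed_next(token_ids)
        return [x if i in allowed else -math.inf for i, x in enumerate(logits)]

    return processor


def digits_then_eos(token_ids: Sequence[int]) -> Set[int]:
    """Toy constraint: one or more binary digits, then end of sequence."""
    digits = {0, 1}
    return digits if len(token_ids) == 0 else digits | {EOS_ID}


def fake_model(token_ids: Sequence[int]) -> List[float]:
    """Stand-in for a real model: always prefers "a", and likes <eos> after 3 tokens."""
    eos_logit = 2.0 if len(token_ids) >= 3 else -1.0
    return [0.5, 1.0, 3.0, 0.0, eos_logit]


def greedy_decode(model, processor: LogitsProcessor, max_tokens: int = 8) -> List[int]:
    """Pick the highest-scoring allowed token at each step until <eos>."""
    token_ids: List[int] = []
    for _ in range(max_tokens):
        logits = processor(token_ids, model(token_ids))
        next_id = max(range(len(logits)), key=logits.__getitem__)
        if next_id == EOS_ID:
            break
        token_ids.append(next_id)
    return token_ids


processor = make_allowed_tokens_processor(digits_then_eos)
generated = greedy_decode(fake_model, processor)
print("".join(VOCAB[i] for i in generated))
```

Because the constraint lives entirely in the processor, the same masking logic could in principle be plugged into any inference library that accepts such a callable, which is the motivation for the simplification described above.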
|
||
We want to expand from constrained generation to adding more sampling methods, and therefore extend our work to the whole sampling layer. This means we will keep the `transformers` integration as is and build our text generation logic around this library. | ||
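As a rough illustration of what a separate sampling layer could look like, the sketch below defines two interchangeable samplers that operate on logits, whether or not a constraint has masked some of them. The function names `greedy` and `multinomial` and their signatures are assumptions for this example, not the Outlines API.

```python
import math
import random
from typing import List, Optional


def greedy(logits: List[float]) -> int:
    """Return the index of the largest logit."""
    return max(range(len(logits)), key=logits.__getitem__)


def multinomial(
    logits: List[float],
    temperature: float = 1.0,
    rng: Optional[random.Random] = None,
) -> int:
    """Sample an index from softmax(logits / temperature).

    Tokens masked with -inf get weight 0 and are never drawn.
    Assumes at least one logit is finite.
    """
    rng = rng or random.Random()
    top = max(logits)  # subtract the max for numerical stability
    weights = [math.exp((x - top) / temperature) for x in logits]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


# Logits as they might look after a constraint masked token 1.
masked_logits = [1.0, -math.inf, 0.5]
rng = random.Random(0)
draws = {multinomial(masked_logits, temperature=0.7, rng=rng) for _ in range(200)}
print(greedy(masked_logits), sorted(draws))
```

Keeping samplers as plain functions over logits means a new sampling method can be added without touching the constraint logic.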
|
||
Finally, we want to add a CLI tool, `outlines serve`. This will allow you to either serve an API that does general constrained generation, or to serve Outlines functions. We are very excited about Outlines functions. Sharing workflows is hard. Etc | ||
|
||
## TL;DR | ||
|
||
### Many more examples and tutorials | ||
|
||
* Tutorials about how Outlines works | ||
* What can you do with Outlines that is harder or impossible to do with other libraries? | ||
* How you can perform standard LLM workflows, for instance Chain of Thought, Tree of Thoughts, etc.
* Show how it integrates with other libraries like LangChain and LlamaIndex | ||
|
||
### Simplify the integrations | ||
|
||
* Deprecate every integration we can: it seems that the inference part of autoawq and autogptq has now been integrated into transformers. | ||
* Integrate via logits processors as much as we can:
    * See if we can integrate via a logits processor with a library that provides state-space models
    * Integrate with llama.cpp via a logits processor
    * Integrate with exllamav2 via a logits processor
|
||
Implement the `outlines serve` CLI, which allows serving both Outlines functions and plain APIs, possibly using the integrations done in the downstream libraries. | ||
|
||
## Improving the generation layer | ||
|
||
* Use the private API of `transformers` to prepare inputs for generation inside the `Transformers` class.
* Differentiate by supporting the succession of model generation and text infilling for methods like Beam Search. | ||
* Differentiate by adding new caching methods: attention sink, trie-based caching, etc. | ||
* Differentiate by implementing SMC (Sequential Monte Carlo).
* Add token healing | ||
|
||
## We need your help! | ||
|
||
Outlines is a [community](https://discord.gg/ZxBxyWmW5n) effort |