v1.0.1
Notable changes:
- More GPTQ support
- Rope scaling (linear + dynamic)
- Bitsandbytes 4-bit quantization (both modes, nf4 and fp4)
- Added more documentation
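
For readers unfamiliar with the two rope scaling modes named above, the following is a minimal sketch of the usual formulations, not the code shipped in this release. The function names and default values are hypothetical and chosen for illustration. Linear scaling is assumed to divide the position index by a factor. Dynamic scaling is assumed to follow the common NTK-aware rule that enlarges the rotary base once the sequence exceeds the trained context length.

```python
# Illustrative sketch only: names and defaults are assumptions, not this project's API.

def rotary_inv_freq(dim, base=10000.0):
    """Inverse frequencies for rotary embeddings, one per pair of dimensions."""
    return [1.0 / (base ** (i / dim)) for i in range(0, dim, 2)]


def linear_scaled_angles(position, dim, factor, base=10000.0):
    """Linear scaling: the position is divided by `factor` before rotation,
    so position 8 with factor 2 is rotated like position 4."""
    return [(position / factor) * f for f in rotary_inv_freq(dim, base)]


def dynamic_ntk_base(seq_len, max_position, dim, factor, base=10000.0):
    """Dynamic (NTK-aware) scaling: the base is left unchanged within the
    trained context and grows with sequence length beyond it.
    Requires dim > 2."""
    if seq_len <= max_position:
        return base
    scale = (factor * seq_len / max_position) - (factor - 1)
    return base * scale ** (dim / (dim - 2))


if __name__ == "__main__":
    print(linear_scaled_angles(8, 4, 2.0))
    print(dynamic_ntk_base(4096, 2048, 128, 2.0))
```

In this sketch, linear scaling applies the same compression at every sequence length, while the dynamic variant leaves sequences within the trained context untouched and only adjusts the base for longer ones.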
What's Changed
- Local gptq support. by @Narsil in #738
- Fix typing in Model.generate_token by @jaywonchung in #733
- Adding Rope scaling. by @Narsil in #741
- chore: fix typo in mpt_modeling.py by @eltociear in #737
- fix(server): Failing quantize config after local read. by @Narsil in #743
- Typo fix. by @Narsil in #746
- fix typo for dynamic rotary by @flozi00 in #745
- add FastLinear import by @zspo in #750
- Automatically map deduplicated safetensors weights to their original values (#501) by @Narsil in #761
- feat(server): Add native support for PEFT Lora models by @Narsil in #762
- This should prevent the PyTorch overriding. by @Narsil in #767
- fix build tokenizer in quantize and remove duplicate import by @zspo in #768
- Merge BNB 4bit. by @Narsil in #770
- Fix dynamic rope. by @Narsil in #783
- Fixing non 4bits quantization. by @Narsil in #785
- Update __init__.py by @Narsil in #794
- Llama change. by @Narsil in #793
- Setup for doc-builder and docs for TGI by @merveenoyan in #740
- Use destructuring in router arguments to avoid '.0' by @ivarflakstad in #798
- Fix gated docs by @osanseviero in #805
- Minor docs style fixes by @osanseviero in #806
- Added CLI docs and rename docker launch by @merveenoyan in #799
- [docs] Build docs only when doc files change by @mishig25 in #812
- Added ChatUI Screenshot to Docs by @merveenoyan in #823
- Upgrade transformers (fix protobuf==3.20 issue) by @Narsil in #795
- Added streaming for InferenceClient by @merveenoyan in #821
- Version 1.0.1 by @Narsil in #836
New Contributors
- @jaywonchung made their first contribution in #733
- @eltociear made their first contribution in #737
- @flozi00 made their first contribution in #745
- @zspo made their first contribution in #750
- @ivarflakstad made their first contribution in #798
- @osanseviero made their first contribution in #805
- @mishig25 made their first contribution in #812
Full Changelog: v1.0.0...v1.0.1