DeepSpeed MII v0.0.5
What's Changed
- Tests use transformers cache for model storage by @mrwyattii in #122
- Decouple conversions for gRPC from server/client code by @tohtana in #138
- Separate server and client by @tohtana in #142
- Add max_tokens option to mii_config by @mallorbc in #129 (usage sketched after this list)
- Fix for CPU device error by @mrwyattii in #148
- Load balancing and multiple replicas by @tohtana in #147
- RESTful API support by @tohtana in #154
- Add Apache 2.0 License by @mrwyattii in #165
- Fix condition to terminate RESTful API gateway by @tohtana in #175
- Add lock to serialize pipeline execution by @tohtana in #176
- Add session to enable multi-turn conversation by @tohtana in #177
- Update CI by @mrwyattii in #196
- Fix hostfile generation for replicas by @tohtana in #192
- Fix deployment name in AML examples by @novaturient95 in #193
- Refactor all gRPC methods in method_table by @TosinSeg in #202
- Add non-persistent deployment type by @TosinSeg in #197 (usage sketched after this list)
- Add LLaMA support and update README model counts by @jeffra in #206
- Generalize meta tensor pipeline by @mrwyattii in #199
- Always enable load balancing by @TosinSeg in #205
- Improve unit tests by @mrwyattii in #209
- Add trust_remote_code support by @msinha251 in #203
- Update AML Deployment by @mrwyattii in #211
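Several of the items above add deployment-time options: the max_tokens setting in mii_config (#129), multiple replicas behind the load balancer (#147, #205), the RESTful API gateway (#154), and trust_remote_code passthrough (#203). The sketch below shows how these could be combined with the existing mii.deploy / mii.mii_query_handle flow; the specific mii_config key names used here are assumptions and should be checked against the v0.0.5 source.

```python
import mii

# Assumed mii_config keys for features added in this release; the key names
# are illustrative and should be verified against the v0.0.5 source.
mii_config = {
    "tensor_parallel": 1,
    "dtype": "fp16",
    "max_tokens": 1024,          # max_tokens option (#129)
    "replica_num": 2,            # multiple replicas behind the load balancer (#147, #205)
    "enable_restful_api": True,  # RESTful API gateway (#154)
    "restful_api_port": 28080,
    "trust_remote_code": True,   # forwarded to Hugging Face model loading (#203)
}

mii.deploy(
    task="text-generation",
    model="bigscience/bloom-560m",
    deployment_name="bloom560m_deployment",
    mii_config=mii_config,
)

# Query through the gRPC client; requests are distributed across replicas.
generator = mii.mii_query_handle("bloom560m_deployment")
result = generator.query({"query": ["DeepSpeed is", "Seattle is"]}, max_new_tokens=30)
print(result)

mii.terminate("bloom560m_deployment")
```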
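PR #197 also adds a non-persistent deployment type, which runs inference in the current process rather than standing up a persistent gRPC server. A minimal sketch follows, assuming the enum is exposed as mii.constants.DeploymentType.NON_PERSISTENT; the exact location of that constant is an assumption.

```python
import mii

# Hypothetical use of the non-persistent deployment type from #197; the enum
# path (mii.constants.DeploymentType.NON_PERSISTENT) is an assumption.
mii.deploy(
    task="text-generation",
    model="bigscience/bloom-560m",
    deployment_name="bloom560m_local",
    deployment_type=mii.constants.DeploymentType.NON_PERSISTENT,
)

generator = mii.mii_query_handle("bloom560m_local")
result = generator.query({"query": ["DeepSpeed is"]}, max_new_tokens=30)
print(result)
```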
New Contributors
- @tohtana made their first contribution in #138
- @mallorbc made their first contribution in #129
- @novaturient95 made their first contribution in #193
- @TosinSeg made their first contribution in #202
- @msinha251 made their first contribution in #203
Full Changelog: v0.0.4...v0.0.5