DeepSpeed MII v0.0.5
What's Changed
- Tests use transformers cache for model storage by @mrwyattii in #122
- Decouple conversions for gRPC from server/client code by @tohtana in #138
- Separate server and client by @tohtana in #142
- Add max_tokens option to mii_config by @mallorbc in #129 (usage sketched after this list)
- Fix for CPU device error by @mrwyattii in #148
- Load balancing and multiple replicas by @tohtana in #147
- RESTful API support by @tohtana in #154
- Add Apache 2.0 License by @mrwyattii in #165
- Fix condition to terminate RESTful API gateway by @tohtana in #175
- Add lock to serialize pipeline execution by @tohtana in #176
- Add session to enable multi-turn conversation by @tohtana in #177
- Update CI by @mrwyattii in #196
- Fix hostfile generation for replicas by @tohtana in #192
- Fix deployment name in AML examples by @novaturient95 in #193
- Refactor all gRPC methods in method_table by @TosinSeg in #202
- Add non-persistent deployment type by @TosinSeg in #197 (usage sketched after this list)
- Add LLaMA support and update README model counts by @jeffra in #206
- Generalize meta tensor pipeline by @mrwyattii in #199
- Always enable load balancing by @TosinSeg in #205
- Improve unit tests by @mrwyattii in #209
- Add trust_remote_code support by @msinha251 in #203
- Update AML Deployment by @mrwyattii in #211
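Several of the items above add deployment-time options: the max_tokens setting in mii_config (#129), multiple replicas behind the load balancer (#147, #205), the RESTful API gateway (#154), and trust_remote_code passthrough (#203). The sketch below shows how these could be combined with the existing mii.deploy / mii.mii_query_handle flow; the specific mii_config key names used here are assumptions and should be checked against the v0.0.5 source.

```python
import mii

# Assumed mii_config keys for features added in this release; the key names
# are illustrative and should be verified against the v0.0.5 source.
mii_config = {
    "tensor_parallel": 1,
    "dtype": "fp16",
    "max_tokens": 1024,          # max_tokens option (#129)
    "replica_num": 2,            # multiple replicas behind the load balancer (#147, #205)
    "enable_restful_api": True,  # RESTful API gateway (#154)
    "restful_api_port": 28080,
    "trust_remote_code": True,   # forwarded to Hugging Face model loading (#203)
}

mii.deploy(
    task="text-generation",
    model="bigscience/bloom-560m",
    deployment_name="bloom560m_deployment",
    mii_config=mii_config,
)

# Query through the gRPC client; requests are distributed across replicas.
generator = mii.mii_query_handle("bloom560m_deployment")
result = generator.query({"query": ["DeepSpeed is", "Seattle is"]}, max_new_tokens=30)
print(result)

mii.terminate("bloom560m_deployment")
```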
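PR #197 also adds a non-persistent deployment type, which runs inference in the current process rather than standing up a persistent gRPC server. A minimal sketch follows, assuming the enum is exposed as mii.constants.DeploymentType.NON_PERSISTENT; the exact location of that constant is an assumption.

```python
import mii

# Hypothetical use of the non-persistent deployment type from #197; the enum
# path (mii.constants.DeploymentType.NON_PERSISTENT) is an assumption.
mii.deploy(
    task="text-generation",
    model="bigscience/bloom-560m",
    deployment_name="bloom560m_local",
    deployment_type=mii.constants.DeploymentType.NON_PERSISTENT,
)

generator = mii.mii_query_handle("bloom560m_local")
result = generator.query({"query": ["DeepSpeed is"]}, max_new_tokens=30)
print(result)
```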
New Contributors
- @tohtana made their first contribution in #138
- @mallorbc made their first contribution in #129
- @novaturient95 made their first contribution in #193
- @TosinSeg made their first contribution in #202
- @msinha251 made their first contribution in #203
Full Changelog: v0.0.4...v0.0.5