Releases: hezarai/hezar
0.38.4
:bug: Fix minor bugs
0.38.3
:bookmark: Release v0.38.3
0.38.2
:bookmark: Release v0.38.2
0.38.1
:bookmark: Release v0.38.1
0.38.0
Main changes
- Resuming from checkpoint moved to the `TrainerConfig` (setting `resume_from_checkpoint` in `Trainer.train()` is now deprecated and raises an error); see the sketch after this list
- Resuming from checkpoints now supports inner-loop steps instead of only epochs
- Add data sampler for slicing data loaders (mainly used for training resumption)
- Re-order object initialization in the Trainer's init function
- Add support for optimizer checkpointing in `Trainer`
- Add option to disable preprocessing and post-processing in `Model.predict()`
- Separate generation config in Whisper's model config into its own data class
- Drop support for Python 3.9
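A rough sketch of the new resumption flow; only `resume_from_checkpoint` is the field discussed here, the other config fields are illustrative and may differ between versions:

```python
from hezar.trainer import TrainerConfig

# resume_from_checkpoint now lives on the config; passing it to Trainer.train() raises an error.
train_config = TrainerConfig(
    output_dir="my-model",        # illustrative output directory
    task="text_classification",   # illustrative task
    num_epochs=5,
    batch_size=8,
    resume_from_checkpoint=True,  # resumes from the latest saved checkpoint, including inner-loop steps
)
```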
0.37.0
:bookmark: Release v0.37.0
0.36.1
:heavy_plus_sign: Add permissions to `version-release.yml`
0.36.0
v0.36.0
1. Add gradient accumulation support to the Trainer
Now, you can set `gradient_accumulation_steps` (defaults to 1, which is the same as regular training) in the `TrainerConfig` to enable this feature. This technique can mimic having larger batch sizes without changing the actual batch size! For example, a batch size of 16 with 4 gradient accumulation steps is equivalent to a batch size of 64! This can lead to faster convergence.
2. Implement tools for training speech recognition models
In this release we added `SpeechRecognitionDataset`, `SpeechRecognitionDataCollator` and `SpeechRecognitionMetricsHandler` so that you can easily train or fine-tune a Whisper model. Take a look at this example.
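A hedged sketch of how these pieces could fit together for Whisper fine-tuning; the hub paths, config fields, and exact `load()`/constructor signatures below are assumptions, not the verified API:

```python
from hezar.data import Dataset
from hezar.models import Model
from hezar.preprocessors import Preprocessor
from hezar.trainer import Trainer, TrainerConfig

# Hub paths are illustrative; loading a speech dataset from the Hub is assumed to
# return a SpeechRecognitionDataset (with SpeechRecognitionDataCollator wired in)
# based on the dataset's config.
model = Model.load("hezarai/whisper-small-fa")
preprocessor = Preprocessor.load("hezarai/whisper-small-fa")
train_dataset = Dataset.load("hezarai/common-voice-13-fa", split="train")

trainer = Trainer(
    config=TrainerConfig(
        output_dir="whisper-small-fa-commonvoice",  # illustrative
        task="speech_recognition",
        num_epochs=3,
        batch_size=8,
    ),
    model=model,
    train_dataset=train_dataset,
    preprocessor=preprocessor,
)
trainer.train()
```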
3. Split and refactor `Trainer` for better subclassing
We split the `training_step` function of the `Trainer` so that it now only takes care of the forward/backward pass; the optimization step has moved to its own method called `optimization_step`. We also added `lr_scheduler_step` for customizing the LR scheduling step.
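For example, a subclass can now hook into these steps individually (a sketch; the method names come from this release, but their exact signatures are assumptions):

```python
import torch
from hezar.trainer import Trainer

class ClippedTrainer(Trainer):
    def optimization_step(self):
        # Example customization: clip gradients right before the optimizer update
        torch.nn.utils.clip_grad_norm_(self.model.parameters(), max_norm=1.0)
        super().optimization_step()

    def lr_scheduler_step(self):
        # Customize how/when the LR scheduler steps
        super().lr_scheduler_step()
```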
4. Add support for more LR schedulers
5. Other bug fixes and improvements
0.35.1
:bookmark: Release `v0.35.1`
0.35.0
This is a big one! We made a lot of changes and improvements in Hezar.
Improvements
- Add support for `accelerate` for distributed training
- Add resume from checkpoint feature to `Trainer`
- Improve saving/logging capabilities in `Trainer`
- Improve `print_info()`
- Add `ImageCaptioningDataset` and `ImageCaptioningDataCollator`
- Enhance padding in tokenizers
- Rewrite contribution docs
- Add tests workflow to actions
- Add `cache_dir` parameter to all `load()` methods (see the sketch after this list)
- Improve `OCRDataset` and bug fixes
- Add training scripts for image captioning
- Add training script for CRNN training
- Clean `registry.py`
- Change license from MIT to Apache 2.0
- Some improvements and bug fixes in `ViTRobertaImage2Text`
- Bug fixes in tests
- Safe class var handling in configs
- Add `return_scores` to `CRNNImage2Text`
- Add `get_state_dict_from_hub` to support loading from any (non-Hezar) model on the Hub
- Set default LR scheduler (reduce on plateau) in `Trainer`
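A minimal sketch of the new `cache_dir` parameter (the hub path is illustrative):

```python
from hezar.models import Model

# Downloads/caches the model files under the given directory instead of the default cache.
model = Model.load("hezarai/bert-base-fa", cache_dir="./hezar_cache")
```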
Bug fixes
- Fix image captioning decoding bug
- Fix mixed precision bug on CPU
- Fix embedding config bug
Deletions
- Delete empty models modules
- Remove all `Union` annotations and replace with `|`
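For illustration, the annotation style change looks like this (a generic example, not a specific hezar function; the `|` syntax needs Python 3.10+ or `from __future__ import annotations`):

```python
from typing import Union

def pad_old(max_length: Union[int, None] = None): ...  # old style

def pad_new(max_length: int | None = None): ...        # new PEP 604 style
```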