Releases: hezarai/hezar
0.38.4
:bug: Fix minor bugs
0.38.3
:bookmark: Release v0.38.3
0.38.2
:bookmark: Release v0.38.2
0.38.1
:bookmark: Release v0.38.1
0.38.0
Main changes
- Resuming from checkpoint moved to the `TrainerConfig` (setting `resume_from_checkpoint` in `Trainer.train()` is now deprecated and raises an error); see the sketch after this list
- Resuming from checkpoints now supports inner-loop steps instead of only epochs
- Add data sampler for slicing data loaders (mainly used for training resumption)
- Re-order object initialization in the Trainer's init function
- Add support for optimizer checkpointing in `Trainer`
- Add option to disable preprocessing and post-processing in `Model.predict()`
- Separate generation config in Whisper's model config into its own data class
- Drop support for Python 3.9
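A rough sketch of the new resumption flow; only `resume_from_checkpoint` is the field discussed here, the other config fields are illustrative and may differ between versions:

```python
from hezar.trainer import TrainerConfig

# resume_from_checkpoint now lives on the config; passing it to Trainer.train() raises an error.
train_config = TrainerConfig(
    output_dir="my-model",        # illustrative output directory
    task="text_classification",   # illustrative task
    num_epochs=5,
    batch_size=8,
    resume_from_checkpoint=True,  # resumes from the latest saved checkpoint, including inner-loop steps
)
```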
0.37.0
:bookmark: Release v0.37.0
0.36.1
:heavy_plus_sign: Add permissions to `version-release.yml`
0.36.0
v0.36.0
1. Add gradient accumulation support to the Trainer
Now, you can set `gradient_accumulation_steps` (defaults to 1, which is the same as regular training) in the `TrainerConfig` to enable this feature. This technique can mimic having larger batch sizes without changing the actual batch size! For example, a batch size of 16 with 4 gradient accumulation steps is equivalent to a batch size of 64! This can lead to faster convergence.
2. Implement tools for training speech recognition models
In this release we added `SpeechRecognitionDataset`, `SpeechRecognitionDataCollator` and `SpeechRecognitionMetricsHandler` so that you can easily train or fine-tune a Whisper model. Take a look at this example.
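A hedged sketch of how these pieces could fit together for Whisper fine-tuning; the hub paths, config fields, and exact `load()`/constructor signatures below are assumptions, not the verified API:

```python
from hezar.data import Dataset
from hezar.models import Model
from hezar.preprocessors import Preprocessor
from hezar.trainer import Trainer, TrainerConfig

# Hub paths are illustrative; loading a speech dataset from the Hub is assumed to
# return a SpeechRecognitionDataset (with SpeechRecognitionDataCollator wired in)
# based on the dataset's config.
model = Model.load("hezarai/whisper-small-fa")
preprocessor = Preprocessor.load("hezarai/whisper-small-fa")
train_dataset = Dataset.load("hezarai/common-voice-13-fa", split="train")

trainer = Trainer(
    config=TrainerConfig(
        output_dir="whisper-small-fa-commonvoice",  # illustrative
        task="speech_recognition",
        num_epochs=3,
        batch_size=8,
    ),
    model=model,
    train_dataset=train_dataset,
    preprocessor=preprocessor,
)
trainer.train()
```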
3. Split and refactor `Trainer` for better subclassing
We split the `training_step` function of the `Trainer` so that it now only takes care of the forward/backward pass; the optimization step has moved to its own method called `optimization_step`. We also added `lr_scheduler_step` for customizing the LR scheduling step.
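For example, a subclass can now hook into these steps individually (a sketch; the method names come from this release, but their exact signatures are assumptions):

```python
import torch
from hezar.trainer import Trainer

class ClippedTrainer(Trainer):
    def optimization_step(self):
        # Example customization: clip gradients right before the optimizer update
        torch.nn.utils.clip_grad_norm_(self.model.parameters(), max_norm=1.0)
        super().optimization_step()

    def lr_scheduler_step(self):
        # Customize how/when the LR scheduler steps
        super().lr_scheduler_step()
```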
4. Add support for more LR schedulers
5. Other bug fixes and improvements
0.35.1
:bookmark: Release `v0.35.1`
0.35.0
This is a big one! We made a lot of changes and improvements in Hezar.
Improvements
- Add support for `accelerate` for distributed training
- Add resume from checkpoint feature to `Trainer`
- Improve saving/logging capabilities in `Trainer`
- Improve `print_info()`
- Add `ImageCaptioningDataset` and `ImageCaptioningDataCollator`
- Enhance padding in tokenizers
- Rewrite contribution docs
- Add tests workflow to actions
- Add `cache_dir` parameter to all `load()` methods (see the sketch after this list)
- Improve `OCRDataset` and bug fixes
- Add training scripts for image captioning
- Add training script for CRNN training
- Clean `registry.py`
- Change license from MIT to Apache 2.0
- Some improvements and bug fixes in `ViTRobertaImage2Text`
- Bug fixes in tests
- Safe class var handling in configs
- Add `return_scores` to `CRNNImage2Text`
- Add `get_state_dict_from_hub` to support loading from any (non-Hezar) model on the Hub
- Set default LR scheduler (reduce on plateau) in `Trainer`
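A minimal sketch of the new `cache_dir` parameter (the hub path is illustrative):

```python
from hezar.models import Model

# Downloads/caches the model files under the given directory instead of the default cache.
model = Model.load("hezarai/bert-base-fa", cache_dir="./hezar_cache")
```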
Bug fixes
- Fix image captioning decoding bug
- Fix mixed precision bug on CPU
- Fix embedding config bug
Deletions
- Delete empty models modules
- Remove all `Union` annotations and replace with `|`
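For illustration, the annotation style change looks like this (a generic example, not a specific hezar function; the `|` syntax needs Python 3.10+ or `from __future__ import annotations`):

```python
from typing import Union

def pad_old(max_length: Union[int, None] = None): ...  # old style

def pad_new(max_length: int | None = None): ...        # new PEP 604 style
```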