v0.36.0
1. Add gradient accumulation support to the Trainer
You can now set `gradient_accumulation_steps` in the `TrainerConfig` (it defaults to 1, which is the same as regular training) to enable this feature. Gradient accumulation mimics a larger batch size without changing the actual batch size: for example, a batch size of 16 with 4 gradient accumulation steps gives an effective batch size of 64. This can lead to faster convergence.
2. Implement tools for training speech recognition models
In this release we added `SpeechRecognitionDataset`, `SpeechRecognitionDataCollator`, and `SpeechRecognitionMetricsHandler` so that you can easily train or finetune a Whisper model. Take a look at this example.
3. Split and refactor `training_step` in `Trainer` for better subclassing
We split the `training_step` method of the `Trainer` so that it now only handles the forward/backward pass; the optimization step has moved to its own method, `optimization_step`. We also added `lr_scheduler_step` for customizing the LR scheduling step.
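The split makes it easier to customize just one part of the training loop by subclassing. In the hedged sketch below, the method names come from this release, but their exact signatures and the `self.model` / `self.lr_scheduler` attributes are assumptions for illustration.

```python
# Hedged sketch: optimization_step and lr_scheduler_step are named in this
# release, but the signatures and self.* attributes used here are assumptions.
import torch
from yourlib.trainer import Trainer  # placeholder import path


class ClippedTrainer(Trainer):
    def optimization_step(self):
        # Add gradient clipping before applying the accumulated gradients
        torch.nn.utils.clip_grad_norm_(self.model.parameters(), max_norm=1.0)
        super().optimization_step()

    def lr_scheduler_step(self):
        # Step the LR scheduler on every optimization step instead of once per epoch
        self.lr_scheduler.step()
```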