I've been getting a lot of feedback from users on how long MMvec takes to run. I'll integrate these comments into the README eventually.
Below are some tips on how to speed up training:
Try to get GPUs. You're likely to get roughly a 10x speedup right off the bat. These days, you can use Google Colab to get free GPUs (first come, first served, and you may get booted), or you can rent GPUs on AWS. See the --arm-the-gpu flag.
Increase your batch size to as large as you can. I'm talking 100,000 reads at a time or more. The larger it is, the faster your iterations will complete, and the less noisy your training will be. This is particularly true for GPUs, although there you'll probably have a tight upper cap on how much you can load.
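To see why a larger batch cuts down the number of iterations, here is a rough back-of-the-envelope sketch. It assumes one epoch is a single pass over all reads and that each iteration consumes exactly one batch; the helper `iterations_needed` and the read counts are made up for illustration and are not part of MMvec.

```python
def iterations_needed(total_reads, batch_size, epochs):
    """Number of batches needed to pass over `total_reads` reads `epochs` times.

    Assumption: one iteration consumes exactly one batch of reads.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    # Integer ceiling division, so a final partial batch still counts.
    return (total_reads * epochs + batch_size - 1) // batch_size


# Hypothetical dataset with 5 million reads, trained for 100 epochs.
for batch_size in (1_000, 10_000, 100_000):
    print(batch_size, iterations_needed(5_000_000, batch_size, epochs=100))
```

Fewer iterations only turns into less wall-clock time if the cost of one iteration grows more slowly than the batch size, which is typically where a GPU helps.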
Once you have settled on your batch size, bump up your learning rate (e.g. to about 0.1). This lets the optimizer take larger gradient descent steps, which also speeds up convergence. Having a large batch size will help with this.
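As a toy illustration of the learning rate tip, the sketch below runs plain gradient descent on f(x) = x**2 and counts the steps needed to get close to the minimum. This is not MMvec's model or optimizer; the function name, starting point, and tolerance are assumptions chosen only to show the trend.

```python
def steps_to_converge(learning_rate, start=10.0, tol=1e-6, max_steps=100_000):
    """Count plain gradient descent steps on f(x) = x**2 until |x| < tol."""
    x = start
    for step in range(1, max_steps + 1):
        x -= learning_rate * 2.0 * x  # the gradient of x**2 is 2x
        if abs(x) < tol:
            return step
    return max_steps  # did not converge within the step budget


for lr in (0.001, 0.01, 0.1):
    print(lr, steps_to_converge(lr))
```

On this toy problem each 10x increase in the learning rate cuts the step count by roughly 10x, but the benefit has a limit: a rate of 1.0 or more never converges here, so it is worth increasing the rate gradually and watching the loss.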
Watch your epochs. You probably don't need more than 100 epochs. You can check this against the cross-validation plots.
Reduce the number of summary intervals. You probably don't need to record summaries every second, and having too many summaries will bog down your training time. Recording every minute or so should be fine. Setting --p-summary-interval 60 will record a summary every 60 seconds.
Unfortunately, this type of tuning is on a case-by-case basis, since hardware and datasets come in all sorts of shapes and sizes. We'll try to make this easier in future iterations.
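The effect of the summary interval can be sketched with a time-throttled loop like the one below. This is a minimal illustration, not MMvec's actual training loop: `train`, `step_fn`, and the simulated clock are hypothetical names, and the clock advances one second per call so the demo runs instantly.

```python
import time


def train(num_iterations, summary_interval, step_fn, clock=time.time):
    """Run `step_fn` repeatedly, recording a summary at most once per interval."""
    summaries = []
    last_summary = clock()
    for i in range(num_iterations):
        loss = step_fn(i)
        now = clock()
        if now - last_summary >= summary_interval:
            summaries.append((i, loss))  # stand-in for writing a summary to disk
            last_summary = now
    return summaries


def make_fake_clock():
    """Simulated clock that advances by one second on every call."""
    t = [0.0]

    def clock():
        t[0] += 1.0
        return t[0]

    return clock


def toy_step(i):
    # Placeholder for one training iteration; returns a fake loss value.
    return 1.0 / (i + 1)


frequent = train(600, summary_interval=1, step_fn=toy_step, clock=make_fake_clock())
sparse = train(600, summary_interval=60, step_fn=toy_step, clock=make_fake_clock())
print(len(frequent), len(sparse))
```

With the longer interval, the loop spends far fewer iterations on bookkeeping, which is the saving the tip above is after.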