EXAMPLE Lora training and EXAMPLE lora #7

Open
Maelstrom2014 opened this issue Jun 13, 2024 · 8 comments

@Maelstrom2014

Maelstrom2014 commented Jun 13, 2024

Hi!

  1. Please add an example of LoRA training with a dataset, and an example LoRA trained on it. Thanks a lot.
  2. Will train.py train only the LoRA?
  3. What is the size of a LoRA, and how many epochs should it be trained for?
  4. How much GPU VRAM does it need?
@NeuralNotW0rk
Owner

Hi there,

Please add an example of LoRA training with a dataset, and an example LoRA trained on it. Thanks a lot.

I'm still experimenting a bunch with this myself, but I'll prioritize adding some examples. As for the dataset preparation, I would defer to the examples in stable-audio-tools for the time being. I'll add some links to those.

Will train.py train only the LoRA?

train.py has all the same functionality as train.py in stable-audio-tools. Adding the --use-lora argument will instruct it to freeze the base model and train lora weights only.
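
For readers unfamiliar with what "freeze the base model and train lora weights only" means, here is a minimal pure-Python sketch of the general LoRA idea. It is not LoRAW's implementation: the class name `LoRALinear` and its fields are hypothetical, and a real training setup would use PyTorch tensors rather than nested lists.

```python
import random


class LoRALinear:
    """Toy linear layer y = W x + (alpha / rank) * B (A x).

    W is the frozen base weight; only A and B would be updated in training.
    Hypothetical illustration, not LoRAW's actual code.
    """

    def __init__(self, base_weight, rank, alpha=None, seed=0):
        rng = random.Random(seed)
        self.W = base_weight  # frozen, shape d_out x d_in
        d_out, d_in = len(base_weight), len(base_weight[0])
        self.rank = rank
        self.scale = (alpha if alpha is not None else rank) / rank
        # A starts random and B starts at zero, so the adapter is a no-op at init.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]

    @staticmethod
    def _matvec(m, v):
        return [sum(a * b for a, b in zip(row, v)) for row in m]

    def forward(self, x):
        base = self._matvec(self.W, x)
        delta = self._matvec(self.B, self._matvec(self.A, x))
        return [b + self.scale * d for b, d in zip(base, delta)]

    def trainable_params(self):
        # Only A (rank x d_in) and B (d_out x rank) count as trainable.
        return sum(len(r) for r in self.A) + sum(len(r) for r in self.B)


layer = LoRALinear([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]], rank=2)
print(layer.forward([1.0, 0.0, -1.0]))  # equals the base output at init
print(layer.trainable_params())         # rank * (d_in + d_out)
```

The trainable parameter count grows as rank * (d_in + d_out) per adapted layer, which is why a higher rank gives a proportionally larger LoRA file.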

What is the size of a LoRA, and how many epochs should it be trained for?

I'm still figuring out the best hyperparameters to use. With Stable Audio Open, a rank 16 lora is ~59 MB and a rank 128 lora is ~472 MB. With a batch size of 32 and a sample length of 10 s, I am getting good results in the several-thousand-step range (a few hours on my setup).
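
The two sizes quoted above are consistent with file size growing linearly with rank (472 / 59 = 8 = 128 / 16). Assuming that relationship also holds for other ranks, which is an extrapolation and not something measured in this thread, a rough size estimate and a steps-to-epochs conversion might look like the sketch below. The function names and the 640-clip dataset size are hypothetical.

```python
import math

# Figure reported in this thread for Stable Audio Open at rank 16.
MB_AT_RANK_16 = 59.0
MB_PER_RANK = MB_AT_RANK_16 / 16  # about 3.69 MB per unit of rank


def estimate_lora_size_mb(rank):
    """Linear extrapolation from the rank-16 figure (an assumption)."""
    return MB_PER_RANK * rank


def steps_to_epochs(steps, dataset_size, batch_size=32):
    """Convert optimizer steps into passes over the dataset."""
    steps_per_epoch = math.ceil(dataset_size / batch_size)
    return steps / steps_per_epoch


print(estimate_lora_size_mb(128))               # 472.0, matches the reported rank-128 size
print(estimate_lora_size_mb(64))                # 236.0, extrapolated
print(steps_to_epochs(3000, dataset_size=640))  # 150.0 for a made-up 640-clip dataset
```

Since the answer is given in steps rather than epochs, the epoch count depends entirely on how many clips the dataset contains.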

How much GPU VRAM does it need?

I am fine-tuning Stable Audio Open with a batch size of 32 and a sample length of 10 seconds. On a 4070 Ti Super in Windows 11, it takes a bit less than 10 GB for rank 16 and 13 GB for rank 128. It may be more efficient on Linux: I was getting less than 8 GB while experimenting with rank 16 on Colab.

@Maelstrom2014
Author

Any news about the examples?

@benbowler

The sticking point in the README is this line:

https://github.com/NeuralNotW0rk/LoRAW/blob/main/README.md?plain=1#L21

I have tried the configs here with the Stable Audio Open 1.0 checkpoint as the starting point with the command:

python ./train.py  --dataset-config ./datasets.json --model-config ./lorawfinetune-config.json --pretrained-ckpt-path ./stable-audio-open-1.0/model.ckpt --use-lora true

But it fails to run at all in this repo. In the main repo, the fine-tuned model returns noise when using the laion_clap checkpoint.

@NeuralNotW0rk
Owner

Any news about the examples?

I think I can share some audio examples in my next commit. In your original question are you asking for an actual dataset and a lora checkpoint?

@NeuralNotW0rk
Owner

The sticking point in the README is this line:

https://github.com/NeuralNotW0rk/LoRAW/blob/main/README.md?plain=1#L21

I have tried the configs here with the Stable Audio Open 1.0 checkpoint as the starting point with the command:

python ./train.py  --dataset-config ./datasets.json --model-config ./lorawfinetune-config.json --pretrained-ckpt-path ./stable-audio-open-1.0/model.ckpt --use-lora true

But it fails to run at all in this repo. In the main repo, the fine-tuned model returns noise when using the laion_clap checkpoint.

The config in https://github.com/NeuralNotW0rk/LoRAW/blob/main/examples/model_config.json is what I use with Stable Audio Open. You should be able to add a "lora" section to the end of any model config. The line "// ... args, model, training, etc. ..." was meant as a placeholder for the rest of the model config, but if it is confusing, I could just point the reader to the full example config instead.
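
As a rough illustration of adding a "lora" section to the end of a model config, the sketch below appends one to an existing config dictionary. The only LoRA setting mentioned in this thread is the rank, so the `"rank"` key name is an assumption, and any other keys the project expects are not shown; the README and examples/model_config.json are the place to check the real schema. The helper name `add_lora_section` and the stand-in base config values are hypothetical.

```python
import json


def add_lora_section(model_config, lora_settings):
    """Return a copy of model_config with a trailing "lora" section."""
    updated = dict(model_config)   # shallow copy; the input is left untouched
    updated.pop("lora", None)      # re-adding guarantees "lora" is the last key
    updated["lora"] = dict(lora_settings)
    return updated


# Stand-in for the rest of a real model config (args, model, training, etc.).
base_config = {"model_type": "diffusion_cond", "sample_rate": 44100}

# "rank" is an assumed key name; consult the repo's example config for actual keys.
config = add_lora_section(base_config, {"rank": 16})
print(json.dumps(config, indent=4))
```

In practice the base config would be loaded from an existing model config file with `json.load` and the result written back with `json.dump`.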

@benbowler

Ah yes, it wasn't originally clear to me which config was used to train Stable Audio Open, but I actually found the model_config.json file on Hugging Face independently just now!

@GoombaProgrammer

What is the recommended maximum length for each WAV file in the dataset?

@Maelstrom2014
Author

Where can I find trained LoRAs for testing? Thanks, all!
