Some minor bugs inside Hifi-Codec code #7
Comments
@rishikksh20 I guess similar issues also exist for the other shared codecs?
Thanks for your help. I cleaned up some code before pushing to GitHub, so some errors may exist.
As this repo is still a work in progress, some minor bugs are understandable. My focus is currently on HiFi-Codec, as that is what I am testing.
In AcademiCodec/Encodec_16k_320/test.py (lines 119 and 121 in 3ee7baf), the code should refer to `model`, because at AcademiCodec/Encodec_16k_320/test.py line 108 in 3ee7baf the SoundStream model is assigned to the variable `model`, not `soundstream`.
While analyzing the new HiFi-Codec code I encountered three small bugs:

1. `MelSpectrogram` is not imported before use. Here: AcademiCodec/HiFi-Codec/train.py, line 31 in 3ee7baf.
2. `modules` is not present inside the HiFi-Codec folder. Here: AcademiCodec/HiFi-Codec/msstftd.py, line 16 in 3ee7baf. It needs to be copied in, or the reference changed to point at another model's `modules` implementation.
3. Shape of the input tensor `x`. Here: AcademiCodec/HiFi-Codec/vqvae.py, line 33 in 3ee7baf. When testing with a 24 kHz mono-channel WAV, the shape of `x` before line 33 is `[batch, samples, 1]`, and after the `.unsqueeze(1)` operation at line 33 it becomes `[batch, 1, samples, 1]`, a 4D tensor where a 3D tensor is expected. So the shape of `x` needs to be checked before line 33: if it has three dimensions and the last one is 1, the last dimension needs to be squeezed.

After correcting the shape of `x`, the code works without errors and I get the desired output.

Thanks @yangdongchao.
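For anyone hitting the same shape problem in `vqvae.py`, the check described above might look roughly like the sketch below. The helper name `ensure_batch_channel_time` is hypothetical and not part of the repo, and it assumes the model expects input shaped `[batch, 1, samples]`.

```python
import torch


def ensure_batch_channel_time(x: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper (not from the repo): return x as [batch, 1, samples]."""
    # Drop a trailing singleton axis: [batch, samples, 1] -> [batch, samples]
    if x.dim() == 3 and x.shape[-1] == 1:
        x = x.squeeze(-1)
    # Add the channel axis only when it is missing: [batch, samples] -> [batch, 1, samples]
    if x.dim() == 2:
        x = x.unsqueeze(1)
    return x


# One second of 24 kHz mono audio per item, with the extra trailing axis
batch = torch.zeros(4, 24000, 1)
print(ensure_batch_channel_time(batch).shape)  # torch.Size([4, 1, 24000])
```

Input that is already `[batch, 1, samples]` passes through unchanged, so a guard like this could sit in front of the existing `.unsqueeze(1)` call or replace it.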