Running training.py results in an error #8

Open
BanTheNons opened this issue Aug 5, 2022 · 1 comment

Comments

@BanTheNons

[Image: AsPowerBar_HE2lvkQ6uf]
For some reason it's telling me that num_samples is 0 even though there are thousands of samples.

@horenbergerb
Owner

horenbergerb commented Sep 21, 2022

Have you verified that

train_data = TextDataset(tokenizer=tokenizer, file_path=train_dir, block_size=block_size, overwrite_cache=False)

loaded properly? A wrong file path here could cause this issue.

Difficult for me to investigate this with the information available. I'll leave this open for a bit in case the same problem comes up again (or you have new information) and then close it.
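
The file-path check suggested above can be sketched roughly as follows. This is a minimal sketch, not code from the repository: `check_training_file` is a hypothetical helper, and the placeholder path stands in for whatever value is actually passed as `file_path` (`train_dir` in the call above).

```python
import os


def check_training_file(file_path):
    """Return a list of problems with the training text file (empty list if none).

    Hypothetical helper for illustration; it is not part of the repository.
    """
    if not os.path.exists(file_path):
        return [f"path does not exist: {file_path!r}"]
    if os.path.isdir(file_path):
        return [f"path is a directory, not a text file: {file_path!r}"]
    if os.path.getsize(file_path) == 0:
        return [f"file is empty: {file_path!r}"]
    return []


if __name__ == "__main__":
    # Assumption: replace this placeholder with the real value of train_dir.
    train_dir = "path/to/train.txt"
    for problem in check_training_file(train_dir):
        print("Problem:", problem)
```

If the path checks out, printing `len(train_data)` right after the dataset is constructed should show whether any samples were actually loaded. A length of 0 there would be consistent with the `num_samples` error, though the cause could still lie elsewhere.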
