Process Worker<AsyncVectorEnv>-1: #19
Comments
Hey. Could you include a bit more detail so we can work out what might be going on, including:
|
Sure, this is my command:
and these are my main modifications in fbcpr_train_humenv.py:
This is the whole script:
My GPU is a 4090, I have 32 GB of RAM, and my CPU has 24 processors. |
I don't currently have an A100 GPU, so I want to make sure this process works first and then train on an A100. |
Is there anything else in the command output? Multiprocessing errors usually mean that a subprocess crashed (which should also produce some output), so having the full output log would help here. |
|
It seems one of the many multiprocessing worker processes was killed during the eval process (the "Killed" message usually comes from the underlying Python process when it receives a kill signal). I would check the memory usage to see whether the system is running out of RAM, and try it on your A100 machine if possible, in case the problem is specific to your machine. |
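For anyone who wants to check memory headroom before or during evaluation, here is a minimal sketch that reads the `MemAvailable` field from `/proc/meminfo`. It assumes a Linux machine; the helper names `parse_meminfo` and `available_gib` are made up for this example and are not part of this repository.

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: size in kB}.

    Only lines that carry a "kB" unit are kept; plain counters
    such as HugePages_Total are skipped.
    """
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and len(parts) == 2 and parts[1] == "kB" and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def available_gib(text):
    """Return MemAvailable converted to GiB, or None if the field is missing."""
    kib = parse_meminfo(text).get("MemAvailable")
    return None if kib is None else kib / (1024 ** 2)


if __name__ == "__main__" and os.path.exists("/proc/meminfo"):
    # Linux only: /proc/meminfo does not exist on macOS or Windows.
    with open("/proc/meminfo") as f:
        gib = available_gib(f.read())
    print("available RAM:", "unknown" if gib is None else f"{gib:.1f} GiB")
```

Running this in a second terminal while the evaluation starts should show whether available memory drops toward zero just before the "Killed" message appears.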
Thanks, I will give it a try. |
I have also run into the same problem; the process was killed during evaluation. |
@TheKnight-Z How much RAM does your machine have? The "Killed" message usually comes from the OS killing the process because the system has run out of memory. |
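One way to tell a crash from an external kill is the worker's exit code: on POSIX systems, Python's `multiprocessing.Process.exitcode` (and `subprocess` return codes) are negative when the child was terminated by a signal, so `-9` means `SIGKILL`. The sketch below only decodes such a code; `describe_exitcode` is a hypothetical helper, not something this project provides. Note that `SIGKILL` alone does not prove an out-of-memory kill; the kernel log is where that is normally confirmed.

```python
import signal


def describe_exitcode(code):
    """Turn a child process exit code into a short human-readable string."""
    if code is None:
        return "still running"
    if code == 0:
        return "exited normally"
    if code > 0:
        return f"exited with status {code}"
    # Negative codes mean "terminated by signal -code" on POSIX.
    try:
        name = signal.Signals(-code).name
    except ValueError:
        # Signal number not known on this platform.
        name = f"signal {-code}"
    return f"killed by {name}"


if __name__ == "__main__":
    for code in (0, 1, -9):
        print(code, "->", describe_exitcode(code))
```

A worker reporting `killed by SIGKILL` without any Python traceback is consistent with the OS stepping in, which matches the memory explanation above.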
Hello, just curious: how much RAM do I need? |
I have not checked this in a while (and it can depend on what machine you run exactly). With full settings, I would expect to need at least 32GB, but probably more. |
Thank you, I can run some tests on this problem. |
An error during training