RuntimeError: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling `cublasSgemm( handle, opa, opb, m, n, k, &alpha, a, lda, b, ldb, &beta, c, ldc)` #1

furqon3009 · 2023-02-17T05:13:38Z

Hi,

I followed the setup instructions, but I get an error when I run this command:
python -m experiments.ppo_minigrid_lifelong --algo comp-ppo --learning-rate 1e-3 --steps-per-proc 256 --batch-size 64 --procs 16 --num-tasks 64 --num-steps 1000000 --max-modules 4

This is the error that I get:

Traceback (most recent call last):
  File "/home/user/anaconda3/envs/modcrl/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/user/anaconda3/envs/modcrl/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/user/Reinfocement Learning/modcrl/Mendez2022ModularLifelongRL/Discrete2D/torch-ac-composable/torch_ac_composable/experiments/ppo_minigrid_lifelong.py", line 233, in <module>
    eval_episodes=eval_episodes,
  File "/home/user/Reinfocement Learning/modcrl/Mendez2022ModularLifelongRL/Discrete2D/torch-ac-composable/torch_ac_composable/algos/agent_wrappers.py", line 467, in train
    exps, logs1 = self.agent.collect_experiences(task_id)
  File "/home/user/Reinfocement Learning/modcrl/Mendez2022ModularLifelongRL/Discrete2D/torch-ac-composable/torch_ac_composable/algos/base.py", line 180, in collect_experiences
    dist, value = self.acmodel(preprocessed_obs, task_id)
  File "/home/user/anaconda3/envs/modcrl/lib/python3.6/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/user/Reinfocement Learning/modcrl/Mendez2022ModularLifelongRL/Discrete2D/torch-ac-composable/torch_ac_composable/models/acmodel_modular_fixed.py", line 156, in forward
    x = self.fc(features, task_id, return_bc)
  File "/home/user/Reinfocement Learning/modcrl/Mendez2022ModularLifelongRL/Discrete2D/torch-ac-composable/torch_ac_composable/models/acmodel_modular_fixed.py", line 145, in fc
    x_actor = self.actor_layersself.agent_dyn_dict[task_id]
  File "/home/user/anaconda3/envs/modcrl/lib/python3.6/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/user/anaconda3/envs/modcrl/lib/python3.6/site-packages/torch/nn/modules/container.py", line 100, in forward
    input = module(input)
  File "/home/user/anaconda3/envs/modcrl/lib/python3.6/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/user/anaconda3/envs/modcrl/lib/python3.6/site-packages/torch/nn/modules/linear.py", line 87, in forward
    return F.linear(input, self.weight, self.bias)
  File "/home/user/anaconda3/envs/modcrl/lib/python3.6/site-packages/torch/nn/functional.py", line 1610, in linear
    ret = torch.addmm(bias, input, weight.t())
RuntimeError: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling cublasSgemm( handle, opa, opb, m, n, k, &alpha, a, lda, b, ldb, &beta, c, ldc)

Is there something that I missed? Thank you.
