I wrote a new environment (navigation on a deterministic map):
(1) I run "python train.py --config xxxx" and get config.json and policy.th.
(2) I run "python test.py --config xxxx" and get results.npz.
But the rewards in results.npz are still very low.
What should I do to use policy.th to fast-adapt to a new task?
You should use --policy policy.th in test.py to use your trained policy.
It's surprising that you didn't get an error when running test.py without --policy, since this is a required parameter.
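For reference, the full test invocation would then look like this (keeping the "xxxx" placeholder from above; only --config and --policy are taken from this thread):

```
python test.py --config xxxx --policy policy.th
```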
I get it. I ran test.py with --policy policy.th, but the valid_return rewards are equal to or even lower than the train_return rewards.
Maybe our environment is not suitable. Thanks.
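A quick way to check the gap between pre- and post-adaptation returns is to load results.npz directly. This is a minimal sketch; the key names train_returns and valid_returns are assumptions based on the fields mentioned above, so check data.files for the actual names:

```python
import numpy as np

data = np.load('results.npz')
print(data.files)  # list the array names actually stored in the file

# Assumed key names (train_returns / valid_returns); adjust to
# whatever data.files reports for your version of test.py.
train_returns = data['train_returns']
valid_returns = data['valid_returns']

# Returns before adaptation (train) vs. after adaptation (valid).
# Successful fast adaptation should show valid above train on average.
print('mean train return:', train_returns.mean())
print('mean valid return:', valid_returns.mean())
```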