Implement "Curious Replay" into Dreamer? #301
Comments
Hi @defrag-bambino, that is something we have always had in mind to implement but never did.
I'm in the process of implementing this now. Right now, dreamer_v3 uses a SequentialReplayBuffer. This will have to change. Does the implementation rely on the sampled data being sequential? Or could I use the regular ReplayBuffer class?
Hi @defrag-bambino. Yes, Dreamer-V3 needs sequential data, so you should modify the SequentialReplayBuffer.
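To illustrate what "sequential" means once sampling is prioritized, here is a minimal standalone sketch in NumPy. It does not use the SheepRL buffer API: the function names `sample_sequence_starts` and `gather_sequences` are hypothetical, and it assumes a flat, non-circular buffer with one priority per stored step. It also assumes a window's priority is the priority of its first step, which is a simplification of this sketch and not something stated in this thread. Episode boundaries and ring-buffer wrap-around are ignored.

```python
import numpy as np


def sample_sequence_starts(priorities, seq_len, batch_size, rng):
    """Draw start indices of contiguous windows, weighted by per-step priority.

    Hypothetical helper: assumes a flat buffer with no wrap-around.
    """
    n_starts = len(priorities) - seq_len + 1
    if n_starts <= 0:
        raise ValueError("buffer holds fewer steps than one sequence")
    # Only indices that leave room for a full window are valid starts.
    weights = np.asarray(priorities[:n_starts], dtype=np.float64)
    probs = weights / weights.sum()
    return rng.choice(n_starts, size=batch_size, p=probs)


def gather_sequences(data, starts, seq_len):
    """Return an array of shape (batch, seq_len, ...) of contiguous steps."""
    idx = starts[:, None] + np.arange(seq_len)[None, :]
    return data[idx]
```

The point of the sketch is that only the choice of start index becomes non-uniform. Each sampled item is still a contiguous run of `seq_len` steps, which is what the world model needs.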
Ok. I've implemented something now and it seems to work. It's not very clean yet, e.g. just hardcoded changes rather than something configurable. Here is the fork.
Vanilla DreamerV3 is blue, Curious Replay is orange. On Quadruped-Walk and Hopper-Hop it performs better than the baseline, as in the paper. Interestingly, on Pendulum-Swingup it does not perform worse, as it does in the paper, but better. The gap is even larger on Cartpole-Swingup-Sparse, where the baseline fails to make any progress. However, this may be due to only running one experiment each. Also, these swingup tasks are probably too easy and not what Dreamer is made for. We'll definitely have to test it on more complex envs, such as Crafter.
This is awesome! I'll try it out on lightning.ai with Crafter in the next few days!
Hi @defrag-bambino, sorry for the laaate response. Have you managed to implement this in the buffers.py? I would create a new buffer class extending from the SequentialReplayBuffer by changing the
I think so, yeah. I don't quite remember. But it's probably not a separate class, rather just a proof-of-concept edited into the SequentialReplayBuffer. You can check it out in my fork. I'll probably get back to this once I finish something else, but that'll be another 2-3 months.
Hi,
I recently saw the paper Curious Replay for Model-based Adaptation, which proposes a fairly straightforward curiosity-based sampling scheme for the replay buffer.
Are there any plans to potentially integrate this into the SheepRL Dreamer implementations?
Thanks
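For context, a rough sketch of the prioritization as I read it in the paper: each stored step gets a priority that adds a count-based term, which decays every time the step is sampled, to a term that grows with the world-model loss on that step. The function name `curious_replay_priority` is hypothetical, and the default values for `c`, `beta`, `alpha` and `eps` are what I recall from the paper, so treat them as assumptions and check them against the original before relying on them.

```python
import numpy as np


def curious_replay_priority(visit_count, model_loss,
                            c=1e4, beta=0.7, alpha=0.7, eps=0.01):
    """Priority = c * beta**visits + (|loss| + eps)**alpha.

    Defaults are recalled from the paper and are assumptions of this sketch.
    """
    visits = np.asarray(visit_count, dtype=np.float64)
    loss = np.asarray(model_loss, dtype=np.float64)
    count_term = c * beta ** visits            # favors rarely sampled steps
    loss_term = (np.abs(loss) + eps) ** alpha  # favors poorly modeled steps
    return count_term + loss_term
```

Sampling probabilities would then be these priorities normalized over the buffer, with `visit_count` incremented and `model_loss` refreshed for each step that appears in a training batch.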