Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Retrace and a QNetwork abstraction #615

Closed
wants to merge 4 commits into from
Closed

Add Retrace and a QNetwork abstraction #615

wants to merge 4 commits into from

Conversation

HenriDeh
Copy link
Member

@HenriDeh HenriDeh commented Apr 5, 2022

Following our discussion of yesterday at #613, I'm creating this draft PR to show how I went on implementing the Retrace Algorithm. More precisely, I wanted to implement Retrace as a plug-in that can optionally be used by an algorithm (I have #604 in mind mainly) but such a level of abstraction was not implemented yet. So here's my attempt. I assume that this may not be what you have in mind @findmyway but it can be a useful discussion even if this is never merged.

The main idea is that Retrace is just a different way than TD(n) to compute update targets for a QNetwork. So I created a QNetwork abstraction that can be called to be updated given a batch of action, states and targets : update!(qnetwork::AbstractQNetwork, states, actions, targets).
The main goal here was the introduction of the function q_targets, called by update!.

Now, why did I do this? Because then I could implement retrace in way that is reusable by ab algorithm that uses the QNetwork abstraction. RetraceTrajectory is an extension of a Trajectory, using a new type allows to overload q_targets. So an algorithm that uses a RetraceTrajectory will automatically use this target to update its AbstractQNetwork.

I'll leave this PR as a draft and only use it locally for now. When we move to 0.11 I'll adapt to make Retrace work with it.

@harwiltz
Copy link
Contributor

harwiltz commented Apr 5, 2022

I like where this is going. Would it make sense to make something like QNetworkWithTarget <: AbstractQNetwork to abstract away the target network and soft updates (not necessarily in this PR, just in general)? Also, can you please add a reference to the Retrace paper in a comment somewhere? I believe I read this paper a while ago, but it would be nice to have a reference for convenience.

@HenriDeh
Copy link
Member Author

HenriDeh commented Apr 5, 2022

Would it make sense to make something like QNetworkWithTarget <: AbstractQNetwork to abstract away the target network and soft updates

Yes, I think it's a choice that would make sense. Now that I think of it, it would make more sense to go this way and even define two QNetworkWithTarget, one that uses polyak averaging and one that copies the weights every k updates, since both approaches are common.

For the ref, you can find the paper here.
Vtrace and Impala would be nice additions in the future (https://arxiv.org/abs/1802.01561) too. These pertain a lot to distributed RL that I think is about to undergo major changes in this package soon.

@findmyway
Copy link
Member

Thanks! This is very helpful!

@HenriDeh HenriDeh closed this by deleting the head repository Mar 15, 2023
@HenriDeh HenriDeh mentioned this pull request Aug 11, 2023
5 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants