Add Retrace and a QNetwork abstraction #615
Closed
Following yesterday's discussion in #613, I'm opening this draft PR to show how I went about implementing the Retrace algorithm. More precisely, I wanted to implement Retrace as a plug-in that an algorithm can optionally use (I mainly have #604 in mind), but that level of abstraction did not exist yet. So here's my attempt. This may not be what you have in mind, @findmyway, but it can be a useful discussion even if it is never merged.
The main idea is that Retrace is just a different way than TD(n) of computing update targets for a QNetwork. So I created a QNetwork abstraction that can be updated given a batch of states, actions and targets: `update!(qnetwork::AbstractQNetwork, states, actions, targets)`. The main goal here was to introduce the function `q_targets`, which is called by `update!`.
Now, why did I do this? Because I could then implement Retrace in a way that is reusable by any algorithm that uses the QNetwork abstraction.
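To make the shape of this abstraction concrete, here is a minimal sketch. It is not the code from this PR: `TabularQNetwork`, the named-tuple batch layout, the two-argument `update!` and the one-step TD target (standing in for the n-step one) are all assumptions made for illustration.

```julia
# Hypothetical sketch: only the names AbstractQNetwork, update! and q_targets
# come from the PR description; everything else is assumed for illustration.
abstract type AbstractQNetwork end

# A toy tabular "network": Q-values stored as an (n_actions x n_states) matrix.
struct TabularQNetwork <: AbstractQNetwork
    table::Matrix{Float64}
    lr::Float64
end

# Calling the network on a state returns the Q-values of all actions.
(q::TabularQNetwork)(state::Int) = q.table[:, state]

# Move Q(s, a) toward its target for every sample of the batch.
function update!(q::TabularQNetwork, states, actions, targets)
    for (s, a, y) in zip(states, actions, targets)
        q.table[a, s] += q.lr * (y - q.table[a, s])
    end
    return q
end

# Default targets: one-step TD, y = r + γ * max_a Q(s_next, a),
# with no bootstrapping after a terminal step.
function q_targets(q::AbstractQNetwork, batch; γ = 0.99)
    return [r + γ * (1 - done) * maximum(q(s_next))
            for (r, done, s_next) in zip(batch.rewards, batch.terminals, batch.next_states)]
end

# Higher-level update: compute the targets first, then apply them.
function update!(q::AbstractQNetwork, batch; γ = 0.99)
    targets = q_targets(q, batch; γ = γ)
    return update!(q, batch.states, batch.actions, targets)
end

# Usage on a two-sample batch.
q = TabularQNetwork(zeros(2, 3), 0.5)
batch = (states = [1, 2], actions = [1, 2], rewards = [1.0, 0.0],
         terminals = [false, true], next_states = [2, 3])
update!(q, batch; γ = 0.9)
```

With this split, changing how targets are computed only requires a new `q_targets` method; the update rule itself stays untouched.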
`RetraceTrajectory` is an extension of a `Trajectory`; using a new type makes it possible to overload `q_targets`. So an algorithm that uses a `RetraceTrajectory` will automatically use this target to update its `AbstractQNetwork`.
I'll leave this PR as a draft and only use it locally for now. When we move to 0.11, I'll adapt it to make Retrace work with it.
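The `RetraceTrajectory` dispatch described above could look roughly like the sketch below. The struct fields are hypothetical and are not the PR's actual types. To keep the example short, it assumes a single non-empty episode segment, with the Q-values already computed and stored alongside the rewards. The Retrace method uses the usual backward recursion with truncated importance weights `c = λ * min(1, π/μ)`.

```julia
# Hypothetical sketch: field names and layout are assumptions, not the PR's types.
abstract type AbstractTrajectory end

# One episode segment with precomputed quantities for each step t.
struct Trajectory <: AbstractTrajectory
    rewards::Vector{Float64}   # r_t
    q_taken::Vector{Float64}   # Q(x_t, a_t)
    v_next::Vector{Float64}    # E_π[Q(x_{t+1}, ·)], 0 after a terminal step
end

# Same data plus what Retrace needs: the ratios π(a_t|x_t) / μ(a_t|x_t) and λ.
struct RetraceTrajectory <: AbstractTrajectory
    inner::Trajectory
    ratios::Vector{Float64}
    λ::Float64
end

# Default method: one-step targets.
q_targets(t::Trajectory; γ = 0.99) = t.rewards .+ γ .* t.v_next

# Retrace method, selected purely by the trajectory type.
function q_targets(t::RetraceTrajectory; γ = 0.99)
    tr = t.inner
    T = length(tr.rewards)
    targets = similar(tr.rewards)
    # Nothing follows the last step, so it keeps the one-step target.
    targets[T] = tr.rewards[T] + γ * tr.v_next[T]
    for i in (T - 1):-1:1
        c = t.λ * min(1.0, t.ratios[i + 1])               # truncated weight c_{i+1}
        correction = targets[i + 1] - tr.q_taken[i + 1]   # off-policy correction
        targets[i] = tr.rewards[i] + γ * tr.v_next[i] + γ * c * correction
    end
    return targets
end

# Usage: the calling code is identical, the type picks the target.
plain = Trajectory([1.0, 2.0], [0.0, 0.5], [0.5, 0.0])
retrace = RetraceTrajectory(plain, [1.0, 2.0], 1.0)
plain_targets = q_targets(plain; γ = 0.5)
retrace_targets = q_targets(retrace; γ = 0.5)
```

With `λ = 0` every trace weight is zero, so the Retrace method falls back to the one-step targets, which makes for an easy sanity check.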