
Suggested to-do items #1

Open

zdx3578 opened this issue Dec 29, 2018 · 0 comments

Member

zdx3578 commented Dec 29, 2018

Unsupervised pre-training:
Variational Option Discovery Algorithms: truly hierarchical.
Diversity Is All You Need: SAC-based code is available.

Model-Ensemble Trust-Region Policy Optimization, Kurutach et al, 2018. Algorithm: ME-TRPO.
Code is available.

Model-Based Reinforcement Learning via Meta-Policy Optimization, Clavera et al, 2018. Algorithm: MB-MPO.

EMI: Exploration with Mutual Information Maximizing State and Action Embeddings

POLO exploration: Randomized Prior Functions for Deep Reinforcement Learning (related to RND).
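
One way to read the RND connection: both methods rely on a fixed, randomly initialized network that is never trained. RND uses the error of a trained predictor against that fixed target as an exploration bonus, while randomized prior functions add the fixed network's output to a trainable one. Below is a minimal sketch of that shared idea, assuming plain linear maps in place of neural networks. All names (`W_fixed`, `rnd_bonus`, `prior_value`, `train_predictor`) are hypothetical and not taken from either paper's code.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 4  # hypothetical state dimension

# Fixed, randomly initialised linear map; never trained in either method.
W_fixed = rng.normal(size=(DIM, DIM))


def rnd_bonus(state, W_pred):
    """RND-style bonus: squared error between predictor and fixed random target."""
    return float(np.sum((state @ W_pred - state @ W_fixed) ** 2))


def prior_value(state, W_learned, beta=1.0):
    """Randomized-prior-style output: trainable part plus a scaled fixed prior."""
    return state @ W_learned + beta * (state @ W_fixed)


def train_predictor(state, W_pred, lr=0.05, steps=200):
    """Gradient descent on 0.5 * ||prediction error||^2 for a single state."""
    W = W_pred.copy()
    for _ in range(steps):
        err = state @ W - state @ W_fixed
        W -= lr * np.outer(state, err)
    return W


state = rng.normal(size=DIM)
state /= np.linalg.norm(state)  # unit norm keeps the step size stable

W_untrained = np.zeros((DIM, DIM))
bonus_before = rnd_bonus(state, W_untrained)
W_trained = train_predictor(state, W_untrained)
bonus_after = rnd_bonus(state, W_trained)

print("bonus before training:", bonus_before)
print("bonus after training:", bonus_after)
```

The bonus shrinks for a state the predictor has been trained on, which is the novelty signal in RND. In the randomized-prior setting the same kind of fixed map keeps contributing to the output wherever the trainable part has not been fitted.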
