This repo contains code accompanying the manuscript: Zhi Wang, Chunlin Chen, and Daoyi Dong, "Lifelong incremental reinforcement learning with online Bayesian inference", IEEE Transactions on Neural Networks and Learning Systems, 2021. It contains code for running the incremental learning tasks in the 2D navigation, Hopper, HalfCheetah, and Ant domains.
This code requires the following:
- Python 3.5+
- PyTorch 1.0+
- gym
- a MuJoCo license
- For the 2D navigation domains, data is generated from `myrllib/envs/navigation.py`.
- For the Hopper/HalfCheetah/Ant MuJoCo domains, the modified MuJoCo environments are in `myrllib/envs/mujoco/*`.
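For intuition, a dynamic 2D navigation environment of this kind can be sketched as below. This is a minimal illustration, not the repo's implementation: the class name, the `reset_task` method, and the reward shaping are assumptions, and the actual environment lives in `myrllib/envs/navigation.py`.

```python
import numpy as np

class Navigation2D:
    """Minimal sketch of a 2D navigation environment with a movable goal.

    The agent is a point mass taking small 2D displacement actions; the
    reward is the negative distance to the current goal. Moving the goal
    between learning sessions yields a sequence of dynamically changing
    environments, as described above.
    """

    def __init__(self, goal=(0.5, 0.5)):
        self.goal = np.asarray(goal, dtype=np.float64)
        self.state = np.zeros(2)

    def reset_task(self, goal):
        # Construct a "new" environment by changing the goal point.
        self.goal = np.asarray(goal, dtype=np.float64)

    def reset(self):
        self.state = np.zeros(2)
        return self.state.copy()

    def step(self, action):
        # Clip the displacement, move the agent, and reward proximity to goal.
        action = np.clip(np.asarray(action, dtype=np.float64), -0.1, 0.1)
        self.state = self.state + action
        dist = np.linalg.norm(self.state - self.goal)
        return self.state.copy(), -dist, dist < 0.01, {}
```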
- For example, to run the code in the navi_v1 domain, where the dynamic environment is constructed by changing the goal points, just run the bash script `navi_v1_llirl.sh` for LLIRL, and run the bash script `navi_v1_baselines.sh` for the baseline approaches including CA, Robust, Adaptive, and MAML. Also see the usage instructions in the Python scripts `env_clustering.py`, `policy_training.py`, and `baselines.py`.
- The task information is saved in `saves/*/task_info.npy` files. For visualization of the clustering results, plot the task information using the `task_clustering` function in `data_process.py`.
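The exact contents of `task_info.npy` depend on the scripts above; as a rough sketch, assuming it stores one integer cluster label per sequentially arriving task, the saved file could be inspected like this (the file path and array layout here are hypothetical):

```python
import numpy as np

# Hypothetical stand-in for a saved task_info.npy: one cluster label
# assigned to each sequentially arriving task.
labels = np.array([0, 0, 1, 0, 2, 1, 2, 2])
np.save('/tmp/task_info_demo.npy', labels)

# Load the task information back and summarize the clustering result.
task_info = np.load('/tmp/task_info_demo.npy')
clusters, counts = np.unique(task_info, return_counts=True)
for c, n in zip(clusters, counts):
    print(f'cluster {c}: {n} tasks')
```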
- The performance of all methods is recorded in `output/*/*.npy` files. For visualization of the performance comparison, plot the learning curves using the `performance_comparison` function in `data_process.py`.
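The layout of the recorded `.npy` arrays depends on `policy_training.py` and `baselines.py`; as one hedged sketch, assuming each file holds per-run episode returns of shape `(num_runs, num_episodes)`, a learning curve can be averaged across runs and smoothed before plotting:

```python
import numpy as np

# Hypothetical stand-in for one method's output/*/*.npy file:
# per-run episode returns with shape (num_runs, num_episodes).
rewards = np.random.RandomState(0).normal(size=(5, 100)).cumsum(axis=1)

# Average across runs, then smooth with a moving average for plotting.
mean_curve = rewards.mean(axis=0)
window = 10
kernel = np.ones(window) / window
smoothed = np.convolve(mean_curve, kernel, mode='valid')

print(smoothed.shape)  # 100 - 10 + 1 = 91 points
```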
- The demo experimental results for the navi_v1 domain are shown below:

task clustering | performance comparison
---|---
![]() | ![]()
To ask questions or report issues, please open an issue on the issue tracker, or email [email protected].