This repo contains code accompanying the manuscript: Zhi Wang, Chunlin Chen, and Daoyi Dong, "Lifelong incremental reinforcement learning with online Bayesian inference", IEEE Transactions on Neural Networks and Learning Systems, 2021. It contains code for running the incremental learning tasks in the 2D navigation, Hopper, HalfCheetah, and Ant domains.
This code requires the following:
- Python 3.5+
- PyTorch 1.0+
- gym
- a MuJoCo license
- For the 2D navigation domains, data is generated from `myrllib/envs/navigation.py`.
- For the Hopper/HalfCheetah/Ant MuJoCo domains, the modified MuJoCo environments are in `myrllib/envs/mujoco/*`.
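For intuition, a dynamic 2D navigation environment of this kind can be sketched as below. This is a minimal illustration, not the repo's implementation: the class name, the `reset_task` method, and the reward shaping are assumptions, and the actual environment lives in `myrllib/envs/navigation.py`.

```python
import numpy as np

class Navigation2D:
    """Minimal sketch of a 2D navigation environment with a movable goal.

    The agent is a point mass taking small 2D displacement actions; the
    reward is the negative distance to the current goal. Moving the goal
    between learning sessions yields a sequence of dynamically changing
    environments, as described above.
    """

    def __init__(self, goal=(0.5, 0.5)):
        self.goal = np.asarray(goal, dtype=np.float64)
        self.state = np.zeros(2)

    def reset_task(self, goal):
        # Construct a "new" environment by changing the goal point.
        self.goal = np.asarray(goal, dtype=np.float64)

    def reset(self):
        self.state = np.zeros(2)
        return self.state.copy()

    def step(self, action):
        # Clip the displacement, move the agent, and reward proximity to goal.
        action = np.clip(np.asarray(action, dtype=np.float64), -0.1, 0.1)
        self.state = self.state + action
        dist = np.linalg.norm(self.state - self.goal)
        return self.state.copy(), -dist, dist < 0.01, {}
```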
- For example, to run the code in the navi_v1 domain, where the dynamic environment is constructed by changing the goal points, just run the bash script `navi_v1_llirl.sh` for LLIRL, and run the bash script `navi_v1_baselines.sh` for the baseline approaches including CA, Robust, Adaptive, and MAML. Also see the usage instructions in the Python scripts `env_clustering.py`, `policy_training.py`, and `baselines.py`.
- The task information is saved in `saves/*/task_info.npy` files. For visualization of the clustering results, plot the task information using the `task_clustering` function in `data_process.py`.
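The exact contents of `task_info.npy` depend on the scripts above; as a rough sketch, assuming it stores one integer cluster label per sequentially arriving task, the saved file could be inspected like this (the file path and array layout here are hypothetical):

```python
import numpy as np

# Hypothetical stand-in for a saved task_info.npy: one cluster label
# assigned to each sequentially arriving task.
labels = np.array([0, 0, 1, 0, 2, 1, 2, 2])
np.save('/tmp/task_info_demo.npy', labels)

# Load the task information back and summarize the clustering result.
task_info = np.load('/tmp/task_info_demo.npy')
clusters, counts = np.unique(task_info, return_counts=True)
for c, n in zip(clusters, counts):
    print(f'cluster {c}: {n} tasks')
```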
- The performance of all methods is recorded in `output/*/*.npy` files. For visualization of the performance comparison, plot the learning curves using the `performance_comparison` function in `data_process.py`.
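The layout of the recorded `.npy` arrays depends on `policy_training.py` and `baselines.py`; as one hedged sketch, assuming each file holds per-run episode returns of shape `(num_runs, num_episodes)`, a learning curve can be averaged across runs and smoothed before plotting:

```python
import numpy as np

# Hypothetical stand-in for one method's output/*/*.npy file:
# per-run episode returns with shape (num_runs, num_episodes).
rewards = np.random.RandomState(0).normal(size=(5, 100)).cumsum(axis=1)

# Average across runs, then smooth with a moving average for plotting.
mean_curve = rewards.mean(axis=0)
window = 10
kernel = np.ones(window) / window
smoothed = np.convolve(mean_curve, kernel, mode='valid')

print(smoothed.shape)  # 100 - 10 + 1 = 91 points
```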
- The demo experimental results for the navi_v1 domain are shown below:

task clustering | performance comparison
---|---
![]() | ![]()
To ask questions or report issues, please open an issue on the issue tracker, or email [email protected].