GitHub - tgangwani/SelfImitationDiverse: Tensorflow code for "Learning Self-Imitating Diverse Policies" (ICLR 2019)

This repo contains code for our paper "Learning Self-Imitating Diverse Policies", published at ICLR 2019.

The code was tested with the following packages:

python 3.5.2
tensorflow 1.4.0
gym 0.9.2

Running command

To train a self-imitation agent in an episodic reward environment, use:

python main.py --env_id HalfCheetah-v1 --seed=$(echo $RANDOM) --mu=0.8 --episodic

The --mu flag sets the parameter 'mu' as defined in the paper (Equation 5).
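To repeat the run above over several random seeds, the command line can be assembled programmatically. The following is only a sketch and not part of this repo: the helper names `build_command` and `seed_sweep` are hypothetical, it uses only the flags shown in the command above, and it assumes that leaving out `--episodic` is a valid invocation. It prints the commands rather than launching them, and avoids f-strings so it runs on the Python 3.5 version listed above.

```python
import random
import shlex


def build_command(env_id, seed, mu, episodic=True):
    """Return the argument list for one run of main.py (hypothetical helper)."""
    cmd = ["python", "main.py", "--env_id", env_id,
           "--seed={}".format(seed), "--mu={}".format(mu)]
    if episodic:
        # Flag taken from the README command; omitting it is an assumption.
        cmd.append("--episodic")
    return cmd


def seed_sweep(env_id, mu, num_seeds, rng=None):
    """Build one command per seed, drawing seeds from the same range as bash's $RANDOM."""
    rng = rng or random.Random()
    seeds = [rng.randint(0, 32767) for _ in range(num_seeds)]
    return [build_command(env_id, seed, mu) for seed in seeds]


# Print three shell-ready commands; pass them to a shell or subprocess to run them.
for cmd in seed_sweep("HalfCheetah-v1", 0.8, num_seeds=3):
    print(" ".join(shlex.quote(part) for part in cmd))
```

Passing a seeded `random.Random` instance as `rng` makes the set of seeds itself reproducible.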

SVPG for diverse multi-agent training

This functionality is provided as part of a separate codebase. To use it, set the following configuration in that codebase's default_config.yaml file: divergence: js, dre_type: nce
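Before launching a multi-agent run, it can help to confirm that the two settings above are actually present in the config. The sketch below is an assumption-laden illustration: the function name `find_mismatches` is hypothetical, the layout of default_config.yaml is not known here, and the check only scans for simple `key: value` lines using the standard library instead of a real YAML parser.

```python
# Settings named in the README for SVPG diverse multi-agent training.
REQUIRED = {"divergence": "js", "dre_type": "nce"}


def find_mismatches(config_text, required=REQUIRED):
    """Return {key: found_value_or_None} for required keys that are missing or differ."""
    found = {}
    for raw in config_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and indentation
        if ":" not in line:
            continue
        key, value = line.split(":", 1)
        found[key.strip()] = value.strip().strip("'\"")
    return {k: found.get(k) for k, v in required.items() if found.get(k) != v}


# Example with an inline config string; in practice, read the file's text instead.
example = "divergence: js\ndre_type: nce\n"
print(find_mismatches(example))
```

An empty dictionary means both settings match; any other result lists the keys to fix together with the values that were found.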

Credits

The code is built on, and uses many utilities from, OpenAI Baselines.
