KMIFQE

This repository is the official implementation of KMIFQE, introduced in "Kernel Metric Learning for In-Sample Off-Policy Evaluation of Deterministic RL Policies" (presented at ICLR 2024 as a Spotlight paper).

Install Conda Environment

conda env create --name envname --file=environment.yml

How to Run

Train behavior and target policies:

python train_policy.py \
  --env=Hopper-v2

Collect data:

python save_replay_buffer.py \
  --env=Hopper-v2 \
  --policy_idx=[behavior policy file name] \
  --max_timesteps=1000000 \
  --random=0 \
  --behav_bias=0 \
  --behav_std=0.3

Train and evaluate KMIFQE:

python main.py \
  --env=Hopper-v2 \
  --target_policy_idx=[target policy file name] \
  --behavior_policy_idx=[behavior policy file name] \
  --buffer_size=1000000 \
  --behav_bias=0 \
  --behav_std=0.3
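
The data-collection and evaluation commands above share several settings (environment, behavior bias, behavior standard deviation). The sketch below wraps both commands in shell functions so the shared values are defined once. It is not part of the repository: the function names `run`, `collect_data`, and `evaluate` and the variables `ENV_NAME`, `NUM_TRANSITIONS`, `BEHAV_BIAS`, `BEHAV_STD`, and `DRY_RUN` are hypothetical. It also assumes that `--max_timesteps` and `--buffer_size` should be equal, which the example values suggest but this README does not state. Only flags that appear in the commands above are used.

```shell
#!/usr/bin/env bash
# Sketch only: wraps the two commands from this README so both stages
# receive the same environment and behavior-noise settings.
set -euo pipefail

ENV_NAME="${ENV_NAME:-Hopper-v2}"
# Assumption: the same count is used for --max_timesteps and --buffer_size.
NUM_TRANSITIONS="${NUM_TRANSITIONS:-1000000}"
BEHAV_BIAS="${BEHAV_BIAS:-0}"
BEHAV_STD="${BEHAV_STD:-0.3}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the command, 0 = execute it

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# $1: behavior policy file name
collect_data() {
  run python save_replay_buffer.py \
    --env="$ENV_NAME" \
    --policy_idx="$1" \
    --max_timesteps="$NUM_TRANSITIONS" \
    --random=0 \
    --behav_bias="$BEHAV_BIAS" \
    --behav_std="$BEHAV_STD"
}

# $1: target policy file name, $2: behavior policy file name
evaluate() {
  run python main.py \
    --env="$ENV_NAME" \
    --target_policy_idx="$1" \
    --behavior_policy_idx="$2" \
    --buffer_size="$NUM_TRANSITIONS" \
    --behav_bias="$BEHAV_BIAS" \
    --behav_std="$BEHAV_STD"
}
```

After sourcing the script, call `collect_data` with the behavior policy file name, then `evaluate` with the target and behavior policy file names. By default the functions only print the commands; set `DRY_RUN=0` to execute them.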

BibTeX

If you use this code, please cite our paper:

@inproceedings{lee2024kmifqe,
  title={Kernel Metric Learning for In-Sample Off-Policy Evaluation of Deterministic {RL} Policies},
  author={Haanvid Lee and Tri Wahyu Guntara and Jongmin Lee and Yung-Kyun Noh and Kee-Eung Kim},
  booktitle={The Twelfth International Conference on Learning Representations (ICLR)},
  year={2024}
}
