Thesis Vanderlande

This repository contains all the code necessary to reproduce the RL agent and the simulation used during training.

To run, set up a virtual environment with:

  • Python 3.6.8
  • the packages listed in requirements.txt

(install with: pip install -r requirements.txt)

Folders in repository

  • evaluation_results
    Folder with the evaluations of the DRL agent and the heuristic agent on the idle time and the recirculation of carriers.
  • figures
    All the generated figures, which are also used in the report.
  • Notebooks
    Jupyter notebooks used for generating intermediate results, for testing the heuristic agent, and for tinkering with the environment.
  • Renders
    Folder with generated renders of the simulation for visual inspection.
  • rl
    All the reinforcement learning scripts, including the environments, the configuration, helper scripts, the results of the hyperparameter optimization with Bayesian optimization, and a folder with the relevant trained models.

Explanation of the scripts in the repository

  • check_env.py
    A script to check if an environment in rl/environements adheres to the format of the OpenAI Gym framework. Run this script with:
    python check_env.py -e [ENVIRONMENT]
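    For reference, a minimal sketch of such a check, assuming the project uses the Stable Baselines 2.x environment checker (an assumption based on the Python 3.6 / PPO setup; check_env.py itself may differ). MyEnv is a hypothetical placeholder for one of the environments:

    from stable_baselines.common.env_checker import check_env
    from my_env import MyEnv  # hypothetical import of one of the environments

    env = MyEnv()
    check_env(env, warn=True)  # warns or raises if the Gym interface is violated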

  • check_performance.py
    Runs a test on a given environment, with the option to render. Usage:
    python check_performance.py -e [ENVIRONMENT_NAME] -s [SUBDIR] -n [RUNNR]
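    As an illustration of what such a performance check boils down to, a minimal rollout sketch using the plain Gym API (MyEnv is a hypothetical placeholder and a random policy stands in for the agent; the actual script has more options):

    from my_env import MyEnv  # hypothetical import

    env = MyEnv()
    obs = env.reset()
    done, episode_return = False, 0.0
    while not done:
        action = env.action_space.sample()          # random policy as a stand-in
        obs, reward, done, info = env.step(action)
        episode_return += reward
        env.render()                                # optional rendering
    print('episode return:', episode_return)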

  • expert_trajectories.py
    Script to generate expert trajectories for a given environment. The heuristic agent is used as the expert to generate the trajectories. The number of episodes to generate can be given as an input parameter. Run as:
    python expert_trajectories.py -e [ENVIRONMENT_NAME] -s [SUBDIR] -c [CONFIG] -n [NUMEPISODES]
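    A minimal sketch of generating expert trajectories, assuming Stable Baselines 2.x and a heuristic policy implemented as a callable; heuristic_policy, MyEnv, and the output file name are hypothetical:

    from stable_baselines.gail import generate_expert_traj
    from my_env import MyEnv                  # hypothetical import
    from heuristic import heuristic_policy    # hypothetical: maps observation -> action

    env = MyEnv()
    # Records the observations and actions of the expert and saves them as an .npz file
    generate_expert_traj(heuristic_policy, 'expert_heuristic', env=env, n_episodes=100)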

  • hyperparameter.py
    Script that uses the hyperdrive package to perform Bayesian optimization on the weights of the reward-shaping function. Can be run by:
    python hyperparameter.py -e [ENVIRONMENT_NAME]
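    For context, Bayesian optimization over reward-shaping weights can be sketched generically with scikit-optimize; this only illustrates the idea and is not the hyperdrive API used by hyperparameter.py. evaluate_agent, the weight names, and their ranges are hypothetical:

    from skopt import gp_minimize
    from skopt.space import Real

    def evaluate_agent(weights):
        # Placeholder: in the real script this would train and evaluate the agent
        # with the given reward-shaping weights and return its mean reward.
        w_idle, w_recirc = weights
        return -(w_idle - 0.3) ** 2 - (w_recirc - 0.7) ** 2

    def objective(weights):
        return -evaluate_agent(weights)  # gp_minimize minimizes, so negate the reward

    space = [Real(0.0, 1.0, name='w_idle'), Real(0.0, 1.0, name='w_recirculation')]
    result = gp_minimize(objective, space, n_calls=30, random_state=0)
    print('best weights found:', result.x)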

  • plotcombiner_x.py
    Script to generate and combine plots of the rewards and the loss extracted from the TensorBoard logs. Figures are stored in /figures/.
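    A minimal sketch of how scalar series can be pulled from TensorBoard event files and plotted; the log directory and the scalar tag are assumptions, as the actual tags depend on the training run:

    import matplotlib.pyplot as plt
    from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

    log_dir = 'rl/logs/example_run'   # hypothetical log directory
    acc = EventAccumulator(log_dir)
    acc.Reload()

    # 'episode_reward' is an assumed tag; acc.Tags() lists the available scalars
    events = acc.Scalars('episode_reward')
    steps = [e.step for e in events]
    values = [e.value for e in events]

    plt.plot(steps, values)
    plt.xlabel('timesteps')
    plt.ylabel('episode reward')
    plt.savefig('figures/reward_plot.png')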

  • plotmaker.py
    Script to make a reward and loss plot for a specific result in the trained_models folder. Usage of this script:
    python plotmaker.py -e [ENVIRONMENT_NAME] -s [SUBDIR]

  • pretrain.py
    Script used to pre-train a PPO agent with behavioral cloning, based on the data generated with the expert_trajectories.py script. Retraining after the pre-training phase is also possible with the -r argument. Run with:
    python pretrain.py -e [ENVIRONMENT_NAME] -s [SUBDIR] -n [LOGNAME] -c [CONFIG] -t [TENSORBOARD] -r [RETRAIN]
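    A minimal behavioral-cloning sketch, assuming Stable Baselines 2.x (PPO2) and an expert .npz file produced by expert_trajectories.py; MyEnv, the file names, and the hyperparameters are illustrative:

    from stable_baselines import PPO2
    from stable_baselines.gail import ExpertDataset
    from my_env import MyEnv  # hypothetical import

    # Expert data as produced by expert_trajectories.py (file name is an assumption)
    dataset = ExpertDataset(expert_path='expert_heuristic.npz',
                            traj_limitation=-1, batch_size=128)

    model = PPO2('MlpPolicy', MyEnv(), verbose=1)
    model.pretrain(dataset, n_epochs=1000)           # behavioral cloning
    model.save('rl/trained_models/pretrained_ppo')   # hypothetical save path

    # Optional retraining after the pre-training phase (roughly what -r enables)
    model.learn(total_timesteps=1000000)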

  • resultmaker.py
    Script to generate the results on the performance objectives for the trained DRL agents. Runs 100 episodes and averages the results over them to obtain the mean performance. Results are stored in /evaluation_results/. Can be run by:
    python resultmaker_x.py -t [TERM] -n [NUM_EPISODES]
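    Conceptually, this evaluation amounts to averaging episode returns over a number of rollouts. A minimal sketch assuming Stable Baselines 2.x, with MyEnv and the model path as hypothetical placeholders:

    import numpy as np
    from stable_baselines import PPO2
    from my_env import MyEnv  # hypothetical import

    env = MyEnv()
    model = PPO2.load('rl/trained_models/example/best_model')  # hypothetical path

    returns = []
    for _ in range(100):
        obs, done, episode_return = env.reset(), False, 0.0
        while not done:
            action, _ = model.predict(obs, deterministic=True)
            obs, reward, done, info = env.step(action)
            episode_return += reward
        returns.append(episode_return)

    print('mean return over 100 episodes:', np.mean(returns))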

  • retrain_callback.py
    Script to retrain a model in a specific subfolder of the /trained_models/ directory. Uses an evaluation callback during training to evaluate the model every 10 000 steps. Can be run by:
    python retrain_callback.py -e [ENVIRONMENT_NAME] -s [SUBDIR] -n [LOGNAME] -c [CONFIG] -t [TENSORBOARD]
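    A minimal sketch of retraining a saved model with an evaluation callback every 10 000 steps, assuming Stable Baselines 2.x; MyEnv, the paths, and the step count are placeholders:

    from stable_baselines import PPO2
    from stable_baselines.common.callbacks import EvalCallback
    from my_env import MyEnv  # hypothetical import

    env, eval_env = MyEnv(), MyEnv()
    model = PPO2.load('rl/trained_models/example/best_model', env=env)  # hypothetical path

    eval_callback = EvalCallback(eval_env, eval_freq=10000,
                                 best_model_save_path='rl/trained_models/example')
    model.learn(total_timesteps=500000, callback=eval_callback)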

  • test_all.py
    Script that reads all the models from a subdirectory in trained_models and tests each model, with the option to render the output. Actions can be selected deterministically or stochastically, depending on the -d parameter. Can be run by:
    python test_all.py -e [ENVIRONMENT_NAME] -s [SUBDIR] -r -d

  • train_callback.py
    Script to train an agent on a given environment. Uses a callback to store the best model; the model is evaluated every 10 000 steps and is also saved at a pre-defined interval. Can be run with:
    python train_callback.py -e [ENVIRONMENT_NAME] -s [SUBDIR] -n [LOGNAME] -c [CONFIG] -t [TENSORBOARD]
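    A minimal training sketch with a best-model evaluation callback (every 10 000 steps) and a periodic checkpoint callback, assuming Stable Baselines 2.x; MyEnv, the paths, and the step counts are illustrative:

    from stable_baselines import PPO2
    from stable_baselines.common.callbacks import CallbackList, CheckpointCallback, EvalCallback
    from my_env import MyEnv  # hypothetical import

    env, eval_env = MyEnv(), MyEnv()

    # Evaluate every 10 000 steps and keep the best model; checkpoint periodically
    eval_callback = EvalCallback(eval_env, eval_freq=10000,
                                 best_model_save_path='rl/trained_models/example')
    checkpoint_callback = CheckpointCallback(save_freq=50000,
                                             save_path='rl/trained_models/example',
                                             name_prefix='ppo_checkpoint')

    model = PPO2('MlpPolicy', env, verbose=1, tensorboard_log='rl/logs')
    model.learn(total_timesteps=1000000,
                callback=CallbackList([eval_callback, checkpoint_callback]))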

  • train_on.py
    Script to train the agent with two callbacks: one for saving the best model and one for terminating training when the evaluation result exceeds a threshold of 95% of the maximum reward that can be obtained. Run by:
    python train_on.py -e [ENVIRONMENT_NAME] -s [SUBDIR] -n [LOGNAME] -c [CONFIG] -t [TENSORBOARD]
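    A minimal sketch of stopping training at a reward threshold, assuming Stable Baselines 2.x; MyEnv, the paths, and the threshold value (standing in for 95% of the maximum obtainable reward) are placeholders:

    from stable_baselines import PPO2
    from stable_baselines.common.callbacks import EvalCallback, StopTrainingOnRewardThreshold
    from my_env import MyEnv  # hypothetical import

    env, eval_env = MyEnv(), MyEnv()

    # Stop once the evaluation reward passes the threshold (placeholder value)
    stop_callback = StopTrainingOnRewardThreshold(reward_threshold=0.95 * 100.0, verbose=1)
    eval_callback = EvalCallback(eval_env, callback_on_new_best=stop_callback,
                                 eval_freq=10000,
                                 best_model_save_path='rl/trained_models/example')

    model = PPO2('MlpPolicy', env, verbose=1)
    model.learn(total_timesteps=2000000, callback=eval_callback)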
