We present a modification to the Monte Carlo Tree Search (MCTS) procedure used in AlphaZero that incorporates rewards observed at every step. This extends the applicability of the AlphaZero algorithm to tasks in which per-step rewards are observed and accumulate into the final return. We apply the algorithm to one representative of this class, the elevator transportation task, and show that our method trains successfully on it, reaching performance close to the collective-control heuristic.
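The core change can be illustrated with a minimal, hedged sketch of the backup phase: instead of propagating only a leaf or network value, the reward observed on each traversed edge is folded into the return on the way back up. The `Node` class, `backup` function, and `gamma` parameter below are illustrative assumptions, not the repository's actual API; the node's mean value is then `value_sum / visit_count`, as in standard AlphaZero statistics.

```python
from dataclasses import dataclass

@dataclass
class Node:
    visit_count: int = 0
    value_sum: float = 0.0

def backup(path, leaf_value, gamma=0.99):
    """Propagate a value up the search path, folding in the reward observed
    on each traversed edge instead of relying only on the final outcome.

    `path` is a list of (node, observed_reward) pairs collected on the way
    down; `leaf_value` is the value estimate at the expanded leaf.
    """
    value = leaf_value
    for node, reward in reversed(path):
        value = reward + gamma * value  # discounted return from this node onward
        node.visit_count += 1
        node.value_sum += value
    return value
```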
- More information about the problem can be found in the Final Report
- The main training procedure can be found in train.py
- Hyper-parameters can be set in config.yaml; use the environment variable "CONFIG_NAME" to choose a configuration (a usage sketch follows this list)
- The implementation of the model and learning algorithm can be found in the alphazero folder
- A detailed description of the environment we used can be found in the Documentation; the implementation is in the environment folder
- In the baseline folder you can find baselines such as a random policy, pure MCTS, and the collective-control heuristic
- If you want to experiment and control some elevators yourself, you can run the interactive environment
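As referenced in the configuration bullet above, here is a hypothetical sketch of how "CONFIG_NAME" could select a configuration block from config.yaml; the assumed layout (a top-level mapping keyed by configuration name, with a "default" fallback) and the function name are illustrative, and the actual loading logic in train.py may differ.

```python
import os
import yaml  # PyYAML

def load_config(path="config.yaml"):
    """Return the configuration block named by the CONFIG_NAME environment
    variable, falling back to an assumed 'default' entry if it is unset."""
    name = os.environ.get("CONFIG_NAME", "default")
    with open(path) as f:
        configs = yaml.safe_load(f)
    return configs[name]
```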
We have implemented an elevator environment, visualized below, that closely resembles the real-world scenario: the agent only observes the passenger requests (up or down at a specific floor), the floors requested inside each elevator, and the total number of passengers in an elevator. Unobserved are, for example, how many people are waiting at floors with a pending request and how many passengers want to exit at requested floors. To prevent these hidden quantities from leaking through the MCTS exploration, we represent them stochastically in our state.
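For illustration, the observation described above could be structured roughly as follows; the class and field names are assumptions for this sketch, not the environment's actual interface.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class ElevatorObservation:
    up_requests: List[bool]          # per floor: an "up" call is pending
    down_requests: List[bool]        # per floor: a "down" call is pending
    cabin_targets: List[List[bool]]  # per elevator: floors requested from inside
    passenger_counts: List[int]      # per elevator: total passengers on board

# Quantities such as how many people wait behind a call, or how many passengers
# leave at a requested floor, remain hidden and are represented stochastically.
```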