DiscoveryWorld is a top-down, turn-based 2D game containing scenarios that model some aspects of scientific discovery. The purpose is to help benchmark how well AI models can solve tasks that involve discovery scenarios.
The purpose of the user study is to have a control to see how well humans solve these same scenarios.
While we plan to publicly release this game in the next few months, we ask that you do not share the experiment materials (including but not limited to the game code for this alpha version) with anyone.
The procedure for the user study is:
- Download and install DiscoveryWorld.
- Play the DiscoveryWorld scenarios.
- Send the play log files, and any notes you have made.
Download DiscoveryWorld (using the link provided to you), and unzip it somewhere you're comfortable with.
Enter the directory:
cd discoveryworld
Create a conda environment:
conda create --name discoveryworld python=3.9
conda activate discoveryworld
Install the dependencies:
pip install -r requirements.txt
pip install -e .
The graphical interface can be run with the following command from the discoveryworld root:
python scripts/userstudy.py
To familiarize yourself with DiscoveryWorld, you should complete the first scenario, called "Tutorial". In that scenario, you will learn how to move your character around, open containers, manipulate items, interact with devices, and talk with another character.
Completing the tutorial is mandatory. While playing it, watch out for a completion code that you need to give to the experimenter.
DiscoveryWorld contains a number of scenarios, which are shown on a menu when you run the user study. The scenarios listed below are currently complete, and are the ones we'd like you to work through:
Tutorial
Archaeology Dig -- Normal and Challenge
Combinatorial Chemistry -- Normal and Challenge (note that Challenge may take a while)
Plant Nutrients -- Normal and Challenge
Proteomics -- Normal and Challenge
Reactor Lab -- Normal and Challenge
Lost in Translation -- Normal and Challenge
Space Sick -- Normal and Challenge
It's (not) Rocket Science -- Normal (*Challenge* is not yet ready, and should not be played)
Scenarios not on this list are not yet complete, and should not be played.
Scenarios can be played in any order -- you don't need to start with "Archaeology Dig" first.
For scenarios with both Normal and Challenge versions, we would ask that you work through both.
For all scenarios, we ask that you select a task variation of 1 (the first option).
When you start a game, it will present the instructions for your task. These can be pulled up again at any time by pressing the TAB key.
Making these discoveries will require methodically exploring different hypotheses and keeping track of data and your experimental results, just as in real science. You are strongly encouraged to use external tools -- for example, a spreadsheet, and your favorite statistics program.
Please keep a notebook (in a text editor of your choice, as long as it's saved in something we can easily open -- e.g. .txt, .pdf, .doc) where you write down your notes/work in solving each scenario.
Please treat writing this notebook as a mini (and very brief) lab report. We would like to know what data you collected, what hypothesis/hypotheses you tested, what solution you came up with, why you think it's correct, and what model(s) or analyses (mathematical or otherwise) you used to arrive at that solution.
At the start of your notes for a given scenario, please write down the scenario name, difficulty, and variation. For example:
START OF NOTES FOR:
Scenario: Reactor Lab
Difficulty: Normal
Variation: 1
Please include a list of any tools you used:
TOOLS I USED WHEN SOLVING THIS SCENARIO:
Spreadsheet: Excel
Statistics: R
At the end of the notes for a given scenario, it would be helpful if you clearly included your hypothesis for the solution to the task, as well as the supporting evidence that led you to believe the hypothesis is correct:
HYPOTHESIS: <Your hypothesis for what the solution to the task is>
SUPPORTING EVIDENCE:
- <a bullet form list>
- <of evidence that you used>
- <to arrive at that hypothesis>
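If you would rather not retype these headings for every scenario, the layout above can be generated. The following is only a minimal sketch and is not part of DiscoveryWorld: the names `ScenarioNotes` and `render_template` are hypothetical, and the placeholder line for the body of the notes is our own addition.

```python
from dataclasses import dataclass, field


@dataclass
class ScenarioNotes:
    """Header fields requested at the start of each scenario's notes."""
    scenario: str
    difficulty: str                            # "Normal" or "Challenge"
    variation: int = 1                         # the study asks for variation 1
    tools: dict = field(default_factory=dict)  # e.g. {"Spreadsheet": "Excel"}


def render_template(notes: ScenarioNotes) -> str:
    """Return a plain-text skeleton in the layout shown above."""
    lines = [
        "START OF NOTES FOR:",
        f"Scenario: {notes.scenario}",
        f"Difficulty: {notes.difficulty}",
        f"Variation: {notes.variation}",
        "",
        "TOOLS I USED WHEN SOLVING THIS SCENARIO:",
    ]
    lines += [f"{kind}: {name}" for kind, name in notes.tools.items()]
    lines += [
        "",
        "(notes, data, and analyses go here)",  # placeholder, not a required heading
        "",
        "HYPOTHESIS: ",
        "SUPPORTING EVIDENCE:",
        "- ",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    example = ScenarioNotes("Reactor Lab", "Normal", 1,
                            {"Spreadsheet": "Excel", "Statistics": "R"})
    print(render_template(example))
```

Paste the printed text into your notes file and fill it in as you play.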
When a game scenario is completed, it will automatically notify you, and then begin saving log files.
If you reach a time limit or decide to quit early, the game will also save log files before you exit.
If you successfully complete a game scenario, it will offer you the opportunity to play for a bit longer by pressing F5, in case you need to gather a bit more evidence for your notes.
If, for whatever reason, you feel you need to try a scenario again (say, from making an error in your first attempt), you're welcome to. We ask you to send the log files from each attempt, so we can record that it took multiple attempts to solve successfully.
The userstudy.py user interface saves extensive logs after each run, including the full game state at each step, the user actions, and frame captures of the game at each step (to assemble a video). These are stored in the logs subdirectory. A new subdirectory is made each time you play a scenario. After the log files are saved, the entire log file directory will be automatically archived (zipped up).
The following controls are supported. Note that these are also accessible in-game by pressing the ? key:
- Arrow keys for movement: left/right keys rotate the agent, up/down move forward/backward
- Arguments: The objects the agent interacts with are specified by the argument boxes, at the bottom. Use number keys to select specific inventory item of the top argument box, hold shift + number keys for the bottom argument box. Alternatively, [ and ] cycle the selection through the top argument box, and ; and ' for the bottom argument box.
- TAB: View the current task information.
- Pick up object: Space will attempt to pick up the object in arg1
- Drop object: d will drop the object in arg1
- Put object in container: p will attempt to put the object in arg1 in the container in arg2
- Give object to another character: p will attempt to give the object in arg1 to the character in arg2
- Open/Close: o and c will attempt to open/close arg1
- Activate/Deactivate: a and s will attempt to activate/deactivate arg1
- Use: u will attempt to use arg1 on arg2 (e.g. use shovel on soil)
- Talk: t will attempt to talk to the agent in arg1
- Read: r will read the object in arg1
- Eat: e will eat arg1
- Wait: w will do nothing.
- DiscoveryFeed: v will view the most recent posts on the Discovery Feed.
- Help: ? or F1 to display help message.
- Quit: ESC will exit.
The log files are located in the /logs/ subdirectory of DiscoveryWorld.
Each time you play a scenario, it will make a new directory for the logs. When you're finished playing (either by completing the scenario, or stopping playing), it will save the logs. This may take a minute or two, as the logs can be rather large. The logs are then ZIPPED up into a single file (per game session you played), to make them easy to transmit.
Note that if you played a scenario more than once (e.g. played the reactor lab scenario several times), there will be one ZIP file for each time you played.
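To check that you have one archive per session before sending, you can list the ZIP files under the logs directory. This is a minimal sketch under stated assumptions: the archives end in `.zip`, and you run it from the discoveryworld root so that `logs` resolves correctly. The exact archive names and where they sit inside `logs` are not specified here, so the helper (`find_session_archives`, a hypothetical name) looks at the top level by default and can optionally search subdirectories.

```python
from pathlib import Path


def find_session_archives(logs_dir="logs", recursive=False):
    """Return the .zip files under logs_dir, oldest first.

    Assumption: session archives use the ".zip" extension. With
    recursive=True, any ZIP files nested inside the per-session
    directories are returned as well.
    """
    root = Path(logs_dir)
    if not root.is_dir():
        return []
    candidates = root.rglob("*.zip") if recursive else root.glob("*.zip")
    archives = [p for p in candidates if p.is_file()]
    return sorted(archives, key=lambda p: p.stat().st_mtime)


if __name__ == "__main__":
    for archive in find_session_archives():
        size_mb = archive.stat().st_size / 1e6
        print(f"{archive}  ({size_mb:.1f} MB)")
```

If nothing is printed, try `find_session_archives(recursive=True)`, or check that you are in the directory that contains `logs`.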
The log files contain near-complete recordings of the game state at each turn, to keep track of what objects were present, what actions you took, and how long it took to complete each task. The logs also contain the renderings (i.e. screenshots) of each frame in the game (in the /frames/ subdirectory of each log directory).
The log includes the time you started the task, and the amount of time it took to complete the task.
To save space (the logs can be quite large), they are also archived within their respective directories in 100-turn increments. You're welcome to inspect their contents -- they should be viewable with any text editor that can view large JSON files.
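If your text editor struggles with the larger files, a few lines of Python can show the overall shape of a JSON log instead. The sketch below deliberately assumes nothing about the log schema: it reports only whether the top level is an object or an array, plus its keys or length. `summarize_json` is a hypothetical helper, not part of DiscoveryWorld, and it loads the whole file into memory, so very large logs may take a moment.

```python
import json
import sys
from pathlib import Path


def summarize_json(path):
    """Describe the top-level shape of a JSON file without assuming a schema."""
    with Path(path).open(encoding="utf-8") as fh:
        data = json.load(fh)
    if isinstance(data, dict):
        return {"type": "object", "keys": sorted(data)}
    if isinstance(data, list):
        return {"type": "array", "length": len(data)}
    # Bare scalars (string, number, true/false, null) are also valid JSON.
    return {"type": type(data).__name__}


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python summarize_log.py path/to/extracted_log.json
    print(summarize_json(sys.argv[1]))
```

Extract a file from its archive first, then pass its path on the command line.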
Please include the notes you made in the text editor for each scenario.
Please send:
- The ZIP files for each scenario you played
- The text notes you made for each scenario
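Bundling is optional, but if you prefer to send a single file, the session ZIPs and your notes can be copied into one archive. This is a minimal sketch only: `bundle_submission`, the `logs/` and `notes/` folder names inside the bundle, and the `submission.zip` file name are all our own assumptions, not something the study requires.

```python
import zipfile
from pathlib import Path


def bundle_submission(log_archives, notes_files, out_path="submission.zip"):
    """Copy session ZIPs and notes files into one archive; return its entry names."""
    entries = [(Path(p), f"logs/{Path(p).name}") for p in log_archives]
    entries += [(Path(p), f"notes/{Path(p).name}") for p in notes_files]
    names = [arcname for _, arcname in entries]
    if len(names) != len(set(names)):
        raise ValueError("two input files share a name; rename one before bundling")
    # ZIP_STORED: the session logs are already compressed, so store them as-is.
    with zipfile.ZipFile(out_path, "w", compression=zipfile.ZIP_STORED) as bundle:
        for source, arcname in entries:
            bundle.write(source, arcname)
    return names
```

Call it with the paths of the session ZIP files and of your notes files, then send the resulting archive.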
For any questions, please contact Erin Bransom ([email protected]).