Deep Reinforcement Learning with a Natural Language Action Space

ghtwht · Aug 11, 2016 · 397f656
1 parent c541128
commit 397f656
Showing 2 changed files with 4 additions and 1 deletion.
diff --git a/README.md b/README.md
@@ -184,7 +184,7 @@ Vision
 
 NLP
 
-- Deep Reinforcement Learning with a Natural Language Action Space [[arXiv](https://arxiv.org/abs/1511.04636)]
+- [Deep Reinforcement Learning with a Natural Language Action Space](notes/drl-nlp-action.md) [[arXiv](https://arxiv.org/abs/1511.04636)]
 - Sequence Level Training with Recurrent Neural Networks [[arXiv](http://arxiv.org/abs/1511.06732)]
 - [Teaching Machines to Read and Comprehend](notes/teaching-machines-to-read-and-comprehend.md) [[arxiv](http://arxiv.org/abs/1506.03340)]
 - [Semi-supervised Sequence Learning](notes/semi-supervised-sequence-learning.md) [[arXiv](http://arxiv.org/abs/1511.01432)]

diff --git a/notes/drl-nlp-action.md b/notes/drl-nlp-action.md
@@ -0,0 +1,3 @@
+## [Deep Reinforcement Learning with a Natural Language Action Space](notes/drl-nlp-action.md) [[arXiv](https://arxiv.org/abs/1511.04636)]
+
+TLDR; The authors train a DQN on text-based games. The main difference from a standard DQN is that their Q-value function embeds the state (textual context) and the action (text-based choice) separately and then takes the dot product of the two embeddings. The authors call this a Deep Reinforcement Relevance Network (DRRN). In essence, it is a different Q-function implementation. Empirically, the authors show that their network can learn to solve the "Saving John" and "Machine of Death" text games.
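
Review note: the Q-function described in the TLDR (embed state and action separately, then take a dot product) can be illustrated with a minimal sketch. This is not the authors' implementation. The bag-of-words featurizer, the single tanh layer per side, the toy vocabulary, and all names (`bag_of_words`, `embed`, `q_value`, `best_action`) are assumptions made for illustration, and the weights are random and untrained.

```python
import math
import random

def bag_of_words(text, vocab):
    """Count vector over a fixed vocabulary (hypothetical featurizer)."""
    tokens = text.lower().split()
    return [float(tokens.count(word)) for word in vocab]

def make_layer(n_in, n_out, rng):
    """Random weight matrix with n_out rows and n_in columns (untrained)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]

def embed(features, weights):
    """One tanh layer mapping a feature vector to an embedding."""
    return [math.tanh(sum(w * x for w, x in zip(row, features))) for row in weights]

def q_value(state_text, action_text, vocab, w_state, w_action):
    """Q(s, a) as the dot product of separately embedded state and action."""
    s = embed(bag_of_words(state_text, vocab), w_state)
    a = embed(bag_of_words(action_text, vocab), w_action)
    return sum(si * ai for si, ai in zip(s, a))

def best_action(state_text, actions, vocab, w_state, w_action):
    """Greedy choice: the candidate action with the highest Q-value."""
    return max(actions, key=lambda act: q_value(state_text, act, vocab, w_state, w_action))

# Toy setup: separate weights for the state side and the action side.
rng = random.Random(0)
vocab = ["you", "see", "a", "door", "open", "the", "run", "away"]
w_state = make_layer(len(vocab), 4, rng)
w_action = make_layer(len(vocab), 4, rng)

state = "You see a door"
actions = ["open the door", "run away"]
scores = {a: q_value(state, a, vocab, w_state, w_action) for a in actions}
choice = best_action(state, actions, vocab, w_state, w_action)
```

Because each action is scored against the state independently, the set of candidate actions can vary in size and wording from one step to the next, which is what makes this form convenient for text-based choices. In a real agent the two weight matrices would be trained with a Q-learning loss rather than left random.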