Commit e885454
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Signed-off-by: Andrej Orsula <[email protected]>
1 parent: 8f33c16
Showing 5 changed files with 16 additions and 3 deletions.
@@ -1,2 +1,12 @@
\chapter*{Summary}
\addcontentsline{toc}{chapter}{Summary}

In this work, deep reinforcement learning is applied to the task of vision-based robotic grasping, with a focus on generalisation to diverse objects in varying scenes. Model-free reinforcement learning is employed to learn an end-to-end policy that directly maps visual observations to continuous actions in Cartesian space. For observations, octrees are utilised in a novel approach that provides an efficient representation of the 3D scene. To allow the agent to generalise over spatial positions and orientations, a 3D convolutional neural network is designed to extract abstract features. An agent is then trained by combining such a feature extractor with off-policy actor-critic reinforcement learning algorithms.
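
The pairing of a 3D convolutional feature extractor with an RL policy can be sketched as follows. This is a minimal illustration in PyTorch that uses a dense voxel occupancy grid as a simplified stand-in for the sparse octree convolutions of the actual work; the class name, layer sizes, and input resolution are all illustrative assumptions, not the thesis's architecture.

```python
# Minimal sketch of a 3D convolutional feature extractor (PyTorch).
# NOTE: the thesis operates on sparse octrees; a dense voxel grid is
# used here only as a simplified stand-in, and all shapes are assumed.
import torch
import torch.nn as nn


class VoxelFeatureExtractor(nn.Module):
    """Maps a voxel occupancy grid to a flat feature vector for an RL policy."""

    def __init__(self, in_channels: int = 1, feature_dim: int = 256):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv3d(in_channels, 16, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv3d(16, 32, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv3d(32, 64, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool3d(1),  # global pooling over the 3D volume
        )
        self.fc = nn.Linear(64, feature_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, depth, height, width)
        return self.fc(self.conv(x).flatten(start_dim=1))


# Example: a single 64^3 occupancy grid mapped to a 256-dim feature vector.
features = VoxelFeatureExtractor()(torch.zeros(1, 1, 64, 64, 64))
print(features.shape)  # torch.Size([1, 256])
```

In an actor-critic setup, both the actor and the critic would consume this flat feature vector, which is what lets the same extractor be shared across the compared off-policy algorithms.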

As training robotic agents in the real world is expensive and potentially unsafe, a new simulation environment for robotic grasping is created. This environment is developed on top of the open-source Ignition Gazebo robotics simulator in order to provide high-fidelity physics and photorealistic rendering. Sim-to-real transfer of a learned policy is made possible by combining a dataset of realistic 3D-scanned objects and textures with domain randomisation. Among others, this includes randomising the pose of a virtual RGB-D camera, with the aim of simplifying the transfer of the simulated setup to the real-world domain.
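
Camera pose randomisation of this kind can be sketched as sampling a new viewpoint on a spherical shell around the workspace at each episode reset. The sketch below is a standalone illustration, not the simulator's API; the radius and elevation ranges are assumed values, not those used in the thesis.

```python
# Illustrative camera pose randomisation: sample a viewpoint on a
# spherical shell around the workspace centre at each episode reset.
# Radius and elevation ranges are assumptions, not the thesis's values.
import numpy as np

rng = np.random.default_rng()


def sample_camera_position(
    centre: np.ndarray,
    radius_range: tuple[float, float] = (0.8, 1.2),
    elevation_range: tuple[float, float] = (np.pi / 6, np.pi / 3),
) -> np.ndarray:
    """Return a camera position at a random pose looking towards `centre`."""
    r = rng.uniform(*radius_range)             # distance from the workspace
    azimuth = rng.uniform(0.0, 2.0 * np.pi)    # full circle around the scene
    elevation = rng.uniform(*elevation_range)  # angle above the table plane
    offset = r * np.array([
        np.cos(elevation) * np.cos(azimuth),
        np.cos(elevation) * np.sin(azimuth),
        np.sin(elevation),
    ])
    return centre + offset


print(sample_camera_position(np.array([0.5, 0.0, 0.0])))
```

Because the policy never sees the same viewpoint twice during training, it cannot overfit to a fixed camera, which is what later permits a real-world camera placement that does not match the simulated one.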

The results of the experimental evaluation indicate that deep reinforcement learning can be applied to learn an end-to-end policy with octree-based observations, while providing noteworthy advantages over the traditionally used RGB and RGB-D images. On novel scenes with a static camera pose, the agent with octree observations reaches a success rate of~81.5\%, whereas an agent with RGB-D observations and an analogous feature extractor achieves~59\%. However, the advantage of 3D observations emerges with invariance to the camera pose: both RGB and RGB-D observations struggle to learn a policy under varying viewpoints, while octrees retain a success rate of~77\%.

The same policy can be successfully transferred to a real robot without any need for retraining. On scenes with previously unseen everyday objects, a policy trained solely inside the simulation achieves a success rate of~68.3\%. The invariance to camera pose enables a simple transfer without requiring the real-world setup to match its digital counterpart. In some cases, octree-based observations furthermore allow the transfer of a policy trained on one robot to another robot with a different gripper design and kinematic chain, while achieving nearly identical performance to a policy trained on the target robot.

Besides the aforementioned experiments, this work compares the actor-critic algorithms TD3, SAC and TQC for continuous control, and studies the benefits of several ablations and configurations, such as the use of demonstrations, curriculum learning and proprioceptive observations.
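
A comparison of these three algorithms is commonly run with off-the-shelf implementations. As an assumed sketch using stable-baselines3 and sb3-contrib (the source does not confirm which implementations were used): the environment below is a stand-in continuous-control task, and the step budget is a placeholder far below a real training run.

```python
# Hedged sketch: training the compared off-policy actor-critic algorithms
# (TD3, SAC, TQC) with stable-baselines3 / sb3-contrib. The environment
# and step count are placeholders; the thesis trains on a grasping task.
import gymnasium as gym
from stable_baselines3 import SAC, TD3
from sb3_contrib import TQC

env_id = "Pendulum-v1"  # stand-in continuous-control task

for algo in (TD3, SAC, TQC):
    model = algo("MlpPolicy", gym.make(env_id), verbose=0)
    model.learn(total_timesteps=10_000)  # far fewer steps than a real run
    print(f"{algo.__name__} finished training")
```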
Binary file not shown.