This repository contains the code and data associated with the paper "Learning Generalizable Vision-Tactile Robotic Grasping Strategy for Deformable Objects via Transformer". Our paper is available at: https://arxiv.org/abs/2112.06374. If you are interested in our work and would like to cite our paper, please use the following citation:
@misc{Han2021Grasping,
title={Learning Generalizable Vision-Tactile Robotic Grasping Strategy for Deformable Objects via Transformer},
author={Yunhai Han and Rahul Batra and Nathan Boyd and Tuo Zhao and Yu She and Seth Hutchinson and Ye Zhao},
year={2021},
eprint={2112.06374},
archivePrefix={arXiv},
primaryClass={cs.RO}
}
Slip Detection
Work done as part of the paper "Slip Detection with Combined Tactile and Vision Information". The authors collected the slip dataset and trained a CNN+LSTM model to detect slip. We showed that our Transformer models outperform their CNN+LSTM model in terms of both accuracy and efficiency. Their paper: https://arxiv.org/abs/1802.10153; a related GitHub project: https://github.com/wkoa/slip_detection (not released by the original authors).
Grasping Framework
Original work done as part of our paper. We propose a Transformer-based robotic grasping framework for rigid grippers that leverages tactile and visual information for safe object grasping. Specifically, our framework can predict the grasping outcome given a grasping force threshold and estimate the force threshold for safe grasping through inference.
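The threshold-estimation step can be viewed as running the trained outcome predictor over candidate force thresholds at inference time. The sketch below is illustrative only; `model`, its inputs, and the candidate range are hypothetical placeholders, not this repository's actual API.

```python
import torch

# Minimal sketch (not this repository's API): given a trained Transformer that predicts
# the probability of a safe, successful grasp under a candidate force threshold, pick the
# smallest candidate predicted to be safe. All names and values here are placeholders.
def estimate_safe_threshold(model, tactile_frames, visual_frames,
                            candidates=None, min_conf=0.9):
    candidates = candidates if candidates is not None else torch.linspace(0.5, 10.0, 20)
    model.eval()
    with torch.no_grad():
        for f in candidates:                                  # smallest threshold first
            p_success = model(tactile_frames, visual_frames, f.view(1, 1))
            if float(p_success) >= min_conf:                  # predicted safe at this force
                return float(f)
    return None                                               # no candidate predicted safe
```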
Fruit Classification
Original work done as part of our paper. Our framework can also classify the grasped fruit type using the same input data but with a different MLP network for the final fruit classification task. During training, the weights of the Transformer models are frozen and only the MLP network is trained. Therefore, in order to train the model for fruit classification, make sure that you have already trained a grasping model and specify its path in the yaml file, because the Transformer weights will be loaded.
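A minimal sketch of this training setup, assuming placeholder module names (`backbone`, `fruit_mlp`) and sizes rather than the classes actually defined under src/:

```python
import torch
import torch.nn as nn

# Sketch only: freeze the pretrained grasping Transformer and train just the MLP head
# for the 6-way fruit classification. Module names, sizes, and the checkpoint path are
# placeholders, not this repository's actual classes.
backbone = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=128, nhead=8, batch_first=True), num_layers=2)
# backbone.load_state_dict(torch.load("path/to/trained_grasping_model.pt"))  # path set in the yaml file
for p in backbone.parameters():
    p.requires_grad = False                      # Transformer weights stay frozen

fruit_mlp = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 6))
optimizer = torch.optim.Adam(fruit_mlp.parameters(), lr=1e-4)  # only the MLP is updated
criterion = nn.CrossEntropyLoss()

tokens = torch.randn(4, 16, 128)                 # dummy batch: 4 samples, 16 tokens each
features = backbone(tokens).mean(dim=1)          # pooled Transformer features
loss = criterion(fruit_mlp(features), torch.randint(0, 6, (4,)))
loss.backward()
optimizer.step()
```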
There are two datasets: 1) the Slip Detection Dataset and 2) the Grasping Dataset.
Download the whole dataset from: https://drive.google.com/file/d/1_iTPxl8TENznXVh-82I26kW9qXrsNsv-/view. Due to legacy reasons, we now only maintain a small clip of the dataset, which is not enough for training a new model. The link to this part of the data is https://drive.google.com/file/d/1NPcZYStp2pLPyeWLwv3-jbltz04RUuSp/view?usp=drive_link.
The Slip Dataset consists of video frames captured by a Gelsight tactile sensor and a Realsense camera, both mounted on a robotic arm. Each robot arm action is labeled as either slipping or not slipping. The dataset contains 89 everyday objects such as bottles, a torch, etc. As described in our paper, we randomly split the whole dataset into training, validation, and test subsets five times.
(Sample frames: Gelsight | Camera)
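The five random splits mentioned above can be reproduced with a generic helper like the one below; the 70/15/15 ratio is an assumed example for illustration, not necessarily the ratio used in the paper.

```python
import random

# Illustrative only: shuffle the sample IDs with five different seeds and cut each
# shuffle into train/val/test subsets. The 70/15/15 ratio below is an assumption.
def make_splits(sample_ids, n_splits=5, ratios=(0.7, 0.15, 0.15)):
    splits = []
    for seed in range(n_splits):
        ids = list(sample_ids)
        random.Random(seed).shuffle(ids)
        n_train = int(ratios[0] * len(ids))
        n_val = int(ratios[1] * len(ids))
        splits.append({"train": ids[:n_train],
                       "val": ids[n_train:n_train + n_val],
                       "test": ids[n_train + n_val:]})
    return splits

splits = make_splits([f"sample_{i:04d}" for i in range(500)])  # placeholder IDs
```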
Download the data from: https://drive.google.com/file/d/144cLi-LkPZVHD_JHfJSC8mYAk6gISCNI/view and https://drive.google.com/file/d/1qBGmeEmLYGI4gPBAbp3y8d3_Yl4u01RH/view?usp=drive_link.
The dataset consists of video frames captured by a Gelsight tactile sensor and a Realsense camera, both mounted on a robotic arm. The robotic arm performs two actions, pinching and sliding, and framesets are captured for both actions on different fruits. A fruit label is also provided for each frameset. The dataset contains 6 categories of fruits. Check the whole dataset for more details.
(Sample frames: Pinching Gelsight | Pinching Camera | Sliding Gelsight | Sliding Camera)
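Each grasping sample therefore pairs four frame streams (pinching/sliding, Gelsight/camera) with a fruit label. A hypothetical in-memory layout is sketched below; the field names and frame shapes are illustrative only, not the dataset's actual file format.

```python
import numpy as np

# Hypothetical layout of one grasping sample; names and shapes are illustrative only.
n_frames, h, w = 20, 480, 640
sample = {
    "pinch_gelsight": np.zeros((n_frames, h, w, 3), dtype=np.uint8),
    "pinch_camera":   np.zeros((n_frames, h, w, 3), dtype=np.uint8),
    "slide_gelsight": np.zeros((n_frames, h, w, 3), dtype=np.uint8),
    "slide_camera":   np.zeros((n_frames, h, w, 3), dtype=np.uint8),
    "fruit_label": 0,  # integer index into the 6 fruit categories
}
```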
* python3 >= 3.8.10
* numpy >= 1.21.1
* pytorch >= 1.9.0+cpu (CPU training & testing) or 1.9.0+cu102 (CUDA training & testing)
* opencv >= 4.5.3
* yaml >= 5.4.1
* json >= 2.0.9
* matplotlib >= 3.4.2
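Note that the yaml requirement refers to the PyYAML package and json ships with the Python standard library. A quick way to check that the listed dependencies are importable and to print their versions:

```python
# Print the installed versions of the listed dependencies
# (yaml is provided by the PyYAML package; json is part of the standard library).
import json, cv2, matplotlib, numpy, torch, yaml

for name, mod in [("numpy", numpy), ("pytorch", torch), ("opencv", cv2),
                  ("yaml", yaml), ("json", json), ("matplotlib", matplotlib)]:
    print(f"{name}: {mod.__version__}")
```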
1. Download the slip detection data (see the download instructions above)
2. cd src/slip_detection
3. python main_cluster.py config_cluster.yaml # specify the data path and the model type in the yaml file (a config-loading sketch follows below)
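For reference, a minimal sketch of how a script like main_cluster.py can consume its yaml argument is shown below; the key names (`data_path`, `model_type`) are hypothetical, so use the keys that actually appear in config_cluster.yaml.

```python
import sys
import yaml

# Illustrative sketch only: read the yaml config passed on the command line.
# The key names "data_path" and "model_type" are hypothetical placeholders; edit
# the keys that actually exist in config_cluster.yaml before training.
with open(sys.argv[1]) as f:
    cfg = yaml.safe_load(f)

print("data path :", cfg.get("data_path"))    # where the downloaded dataset lives
print("model type:", cfg.get("model_type"))   # e.g. a TimeSformer or ViViT variant
```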
1. Download the fruit grasping data (see the download instructions above)
2. cd src/grasping_framework
3. python main_cluster.py config_cluster.yaml # specify the data path and the model type in the yaml file
1. Download the fruit grasping data (see the download instructions above)
2. You can either download one of the pre-trained models below or train a grasping model on your end (see the Grasping Training instructions); a checkpoint-loading sketch follows these steps.
* Download a pre-trained TimeSformer model from: https://www.dropbox.com/s/1vk8gy5e43y3xkz/timeSformer_orig_two_fruit190.pt?dl=0
* Download a pre-trained ViViT model from: https://www.dropbox.com/s/k8su3ayzolnoxyj/vivit_fdp_two_fruit_last.pt?dl=0
3. cd src/fruit_classification
4. python main.py config_fruit.yaml # specify the data path and the model type in the yaml file
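The pre-trained checkpoints above are regular PyTorch .pt files. A loading sketch is shown below; whether a checkpoint holds a full pickled module or a state_dict depends on how it was saved, and the model classes that consume it live under src/.

```python
import torch

# Sketch: inspect one of the pre-trained checkpoints (file name as downloaded above).
# Depending on how it was saved, torch.load returns either a pickled module or a state_dict.
ckpt = torch.load("timeSformer_orig_two_fruit190.pt", map_location="cpu")
if isinstance(ckpt, dict):
    print(f"state_dict with {len(ckpt)} entries")   # pass to model.load_state_dict(...)
else:
    ckpt.eval()                                     # a full nn.Module was pickled
    print(type(ckpt).__name__)
```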