From 3d8d71fff518483ad5eb442399118f40bf1af780 Mon Sep 17 00:00:00 2001
From: Ravi Pandya

Method Overview
- (left) Before solving the reach-avoid game, we specify the target set (goal locations), failure set (collisions), and a conditional behavior prediction (CBP) model that can predict the human's future trajectory conditioned on the robot's future plan. (center) During simulated gameplay, the SLIDE policy, , is trained against a simulated human adversary whose control bounds are informed by the CBP model. (right) Online, the robot uses its robust SLIDE policy to safely influence against any> human.
+ (left) Before solving the reach-avoid game, we specify the target set (goal locations), failure set (collisions), and a conditional behavior prediction (CBP) model that can predict the human's future trajectory conditioned on the robot's future plan. (center) During simulated gameplay, the SLIDE policy, , is trained against a simulated human adversary whose control bounds are informed by the CBP model. (right) Online, the robot uses its robust SLIDE policy to safely influence against any human.