About the action sequence for the Zero-Shot Generalizable Rearrangement #9

Open
GostInShell opened this issue Jan 11, 2025 · 1 comment

Comments

@GostInShell

Brilliant work!!!

I have some questions about how the action sequences are predicted for the Zero-Shot Generalizable Rearrangement task.
It says here that the action sequences are planned via MPC.

  1. I guess a grasp pose estimator is still required, though?
  2. How is the manipulation strategy determined? For example, sometimes it chooses to pick and place, and other times it pushes from the side.
  3. In this demo video, however, the actions seem to be generated by predicting several intermediate end-effector poses and connecting them with motion planning.

In general, although the learned field generalizes well across novel objects, I assume some priors for action planning are still required. I wonder whether you have any common strategies for making the actions suitable for your tasks.

Thanks in advance for any insights!

@WangYixuan12
Owner

Hi, thank you for your kind words! Here are answers to your questions:

  1. Yes, you still need a grasp pose estimator. It can be an off-the-shelf grasp planner or one you implement yourself.
  2. We assume that only one skill is used per scenario, so it is either push or pick/place.
  3. Yes, that's correct.

Your general comment is also correct: there are some priors for action planning. Currently, the skill is selected manually, which is also common practice in the community for long-horizon tasks. I think this could be improved with TAMP or a learned policy. I will shamelessly advertise our follow-up work GenDP here, which does not require manual skill selection and also uses D3Fields.
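
For illustration only, here is a minimal sketch of what manual skill selection followed by intermediate end-effector poses could look like. It is not the D3Fields code: the function names, the waypoint layouts, and the straight-line interpolation standing in for a real motion planner are all assumptions, and grasp pose estimation is left out entirely.

```python
from typing import Callable, Dict, List, Tuple

# End-effector position only; orientation is omitted to keep the sketch short.
Position = Tuple[float, float, float]


def pick_place_waypoints(obj: Position, goal: Position,
                         clearance: float = 0.10) -> List[Position]:
    """Hypothetical key poses: hover, descend to grasp, lift, carry, lower."""
    above_obj = (obj[0], obj[1], obj[2] + clearance)
    above_goal = (goal[0], goal[1], goal[2] + clearance)
    return [above_obj, obj, above_obj, above_goal, goal]


def push_waypoints(obj: Position, goal: Position,
                   standoff: float = 0.05) -> List[Position]:
    """Hypothetical planar push: start behind the object, move by (goal - obj)."""
    dx, dy = goal[0] - obj[0], goal[1] - obj[1]
    norm = (dx * dx + dy * dy) ** 0.5
    if norm == 0.0:
        return [obj]  # object is already at the goal in the plane
    ox, oy = standoff * dx / norm, standoff * dy / norm
    return [(obj[0] - ox, obj[1] - oy, obj[2]),
            (goal[0] - ox, goal[1] - oy, obj[2])]


# One skill per scenario, chosen by hand (assumed names).
SKILLS: Dict[str, Callable[..., List[Position]]] = {
    "pick_place": pick_place_waypoints,
    "push": push_waypoints,
}


def interpolate(waypoints: List[Position], steps: int = 5) -> List[Position]:
    """Stand-in for a motion planner: straight lines between key poses."""
    path = [waypoints[0]]
    for a, b in zip(waypoints, waypoints[1:]):
        for i in range(1, steps + 1):
            t = i / steps
            path.append((a[0] + t * (b[0] - a[0]),
                         a[1] + t * (b[1] - a[1]),
                         a[2] + t * (b[2] - a[2])))
    return path


def plan(skill: str, obj: Position, goal: Position) -> List[Position]:
    """Expand the manually selected skill into a dense end-effector path."""
    return interpolate(SKILLS[skill](obj, goal))


if __name__ == "__main__":
    for point in plan("push", (0.0, 0.0, 0.0), (0.3, 0.0, 0.0)):
        print(point)
```

Replacing the hand-written `SKILLS` lookup with a TAMP solver or a learned policy is where the manual prior would go away.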
