HaoranZhuExplorer/World-Models-Autonomous-Driving-Latest-Survey


World-Models-Autonomous-Driving-Latest-Survey

A curated list of world models for autonomous driving, kept up to date.

Announcement

In addition to the papers listed below, we are happy to announce that our group, the NYU Learning Systems Laboratory, recently released a preprint titled "AD-L-JEPA: Self-Supervised Spatial World Models with Joint Embedding Predictive Architecture for Autonomous Driving with LiDAR Data", the first joint-embedding predictive architecture (JEPA) based spatial world model for self-supervised representation learning of autonomous driving scenarios with LiDAR data. We will release the source code, pretrained models (both self-supervised and supervised, except for models trained on the Waymo dataset, due to Waymo's licensing constraints), and training logs at AD-L-JEPA-Release by the end of January 2025. Please stay tuned for more updates!

Papers

  • 2025-AD-L-JEPA: Self-Supervised Spatial World Models with Joint Embedding Predictive Architecture for Autonomous Driving with LiDAR Data. arxiv; Pre-training; Self-supervised representation learning; Paper, Code to be released

  • 2025-Cosmos World Foundation Model Platform for Physical AI Paper, Code

  • 2024-DrivingWorld: Constructing World Model for Autonomous Driving via Video GPT. Paper Project Page Code

  • 2024-DOME: Taming Diffusion Model into High-Fidelity Controllable Occupancy World Model. Paper Project Page

  • 2024-DriveDreamer4D: World Models Are Effective Data Machines for 4D Driving Scene Representation Paper

  • 2024-DrivingDojo Dataset: Advancing Interactive and Knowledge-Enriched Driving World Model Paper Dataset

  • 2024-Mitigating Covariate Shift in Imitation Learning for Autonomous Vehicles Using Latent Space Generative World Models Paper Planning

  • 2024-OccLLaMA: An Occupancy-Language-Action Generative World Model for Autonomous Driving Paper

  • 2024-Drive-OccWorld: Driving in the Occupancy World: Vision-Centric 4D Occupancy Forecasting and Planning via World Models for Autonomous Driving Paper

  • 2024-CarFormer: Self-Driving with Learned Object-Centric Representations ECCV 2024 Paper

  • 2024-BEVWorld: A Multimodal World Model for Autonomous Driving via Unified BEV Latent Space arxiv Paper

  • 2024-Planning with Adaptive World Models for Autonomous Driving arxiv; Planning; Paper

  • 2024-UnO: Unsupervised Occupancy Fields for Perception and Forecasting Paper

  • 2024-LAW: Enhancing End-to-End Autonomous Driving with Latent World Model Paper

  • 2024-OccSora: 4D Occupancy Generation Models as World Simulators for Autonomous Driving Paper, Code

  • 2024-Delphi: Unleashing Generalization of End-to-End Autonomous Driving with Controllable Long Video Generation Paper

  • 2024-Vista: A Generalizable Driving World Model with High Fidelity and Versatile Controllability from Shanghai AI Lab Paper

  • 2024-DriveWorld: 4D Pre-trained Scene Understanding via World Models for Autonomous Driving CVPR 2024; Paper

  • 2024-UniPAD: A Universal Pre-training Paradigm for Autonomous Driving CVPR 2024; from Shanghai AI Lab Paper, Code

  • 2024-GenAD: Generalized Predictive Model for Autonomous Driving CVPR 2024; from Shanghai AI Lab Paper

  • 2024-Think2Drive: Efficient Reinforcement Learning by Thinking in Latent World Model for Quasi-Realistic Autonomous Driving arxiv Paper

  • 2024-ViDAR: Visual Point Cloud Forecasting enables Scalable Autonomous Driving CVPR 2024; Pre-training; from Shanghai AI Lab; NuScenes dataset Paper, Code

  • 2024-Copilot4D: Learning Unsupervised World Models for Autonomous Driving via Discrete Diffusion ICLR 2024; Future Prediction; from Waabi; NuScenes, KITTI Odometry, Argoverse2 Lidar datasets Paper

  • 2023-DrivingDiffusion: Layout-Guided multi-view driving scene video generation with latent diffusion model arxiv; Generative AI Paper, Code

  • 2023-MUVO: A Multimodal Generative World Model for Autonomous Driving with Geometric Representations arxiv; Pre-training; CARLA dataset Paper

  • 2023-Driving into the Future: Multiview Visual Forecasting and Planning with World Model for Autonomous Driving arxiv; Generative AI, Planning; NuScenes and Waymo datasets Paper

  • 2023-ADriver-I: A General World Model for Autonomous Driving arxiv; Generative AI; NuScenes & one private dataset Paper

  • 2023-OccWorld: Learning a 3D Occupancy World Model for Autonomous Driving arxiv; Occupancy Future Prediction, Planning; Occ3D dataset for Occupancy Future Prediction, NuScenes for motion planning Paper, Code

  • 2023-GAIA-1: A Generative World Model for Autonomous Driving arxiv; Generative AI; Wayve's private data Paper

    Related papers & tutorials to understand this paper:

    FDM for video diffusion decoder: Paper, Code

    Denoising diffusion tutorials: CVPR 2022 tutorial, class from UC Berkeley, Video

  • 2023-DriveDreamer: Towards Real-world-driven World Models for Autonomous Driving arxiv; Generative AI; NuScenes dataset Paper, Code (To be released soon)

  • 2023-Neural World Models for Computer Vision 'PhD Thesis'; from Wayve Paper

  • 2023-UniWorld: Autonomous Driving Pre-training via World Models arxiv; Pre-training; NuScenes dataset Paper

  • 2022-Separating the World and Ego Models for Self-Driving ICLR 2022 workshop on Generalizable Policy Learning in the Physical World; from Yann LeCun's Group Paper, Code

  • 2022-SEM2: Enhance Sample Efficiency and Robustness of End-to-end Urban Autonomous Driving via Semantic Masked World Model NeurIPS 2022 Deep Reinforcement Learning Workshop; RL; CARLA dataset Paper

  • 2022-MILE: Model-Based Imitation Learning for Urban Driving NeurIPS 2022; RL; from Wayve Paper, Code

  • 2022-Iso-Dream: Isolating and Leveraging Noncontrollable Visual Dynamics in World Models NeurIPS 2022 Paper, Code

  • 2021-FIERY: Future Instance Prediction in Bird's-Eye View from Surround Monocular Cameras ICCV 2021; Future Prediction; from Wayve; NuScenes, Lyft datasets Paper, Code

  • 2021-Learning to drive from a world on rails ICCV 2021 Oral; RL Paper, Project Page, Code

  • 2019-Model-Predictive Policy Learning with Uncertainty Regularization for Driving in Dense Traffic ICLR 2019; Future Prediction; from Yann LeCun's Group Paper, Code

Workshops/Challenges

  • 2024-1X World Model Challenge Challenges Link
  • 2024-CVPR Workshop, Foundation Models for Autonomous Systems, Challenges, Track 4: Predictive World Model Challenges Link

Tutorials/Talks

  • 2023 from Wayve; Video
  • 2022-Neural World Models for Autonomous Driving Video

Surveys that Contain World Models for AD

  • 2025-A Survey of World Models for Autonomous Driving arxiv Paper
  • 2024-World Models for Autonomous Driving: An Initial Survey arxiv Paper
  • 2024-Data-Centric Evolution in Autonomous Driving: A Comprehensive Survey of Big Data System, Data Mining, and Closed-Loop Technologies arxiv Paper
  • 2024-Forging Vision Foundation Models for Autonomous Driving: Challenges, Methodologies, and Opportunities arxiv Paper

Other General World Model Papers

  • 2025-Do generative video models learn physical principles from watching videos? Paper, Code, Website
  • 2024-Genie2: Website
  • 2024-WHALE: Towards Generalizable and Scalable World Models for Embodied Decision-making Paper
  • 2024-How Far is Video Generation from World Model: A Physical Law Perspective Paper
  • 2024-PIVOT-R: Primitive-Driven Waypoint-Aware World Model for Robotic Manipulation NeurIPS 2024 Paper
  • 2024-RoboDreamer: Learning Compositional World Models for Robot Imagination Paper
  • 2024-TD-MPC2: Scalable, Robust World Models for Continuous Control ICLR 2024 Paper
  • 2024-Hierarchical World Models as Visual Whole-Body Humanoid Controllers Paper
  • 2024-Pandora: Towards General World Model with Natural Language Actions and Video States Paper
  • 2024-Efficient World Models with Time-Aware and Context-Augmented Tokenization ICML 2024
  • 2024-3D-VLA: A 3D Vision-Language-Action Generative World Model ICML 2024 Paper
  • 2024-Newton from Archetype AI website Link
  • 2024-MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators arxiv Paper, Code
  • 2024-IWM: Learning and Leveraging World Models in Visual Representation Learning arxiv, from Yann LeCun's Group Paper
  • 2024-Video as the New Language for Real-World Decision Making arxiv, DeepMind Paper
  • 2024-Genie: Generative Interactive Environments DeepMind Paper, Website
  • 2024-Sora OpenAI, Generative AI Link, Technical Report
  • 2024-LWM: World Model on Million-Length Video And Language With RingAttention arxiv; Generative AI Paper, Code
  • 2024-WorldDreamer: Towards General World Models for Video Generation via Predicting Masked Tokens arxiv; Generative AI Paper
  • 2024-Video prediction models as rewards for reinforcement learning NeurIPS 2024 Paper, Code
  • 2024-V-JEPA: Revisiting Feature Prediction for Learning Visual Representations from Video from Yann LeCun's Group Paper, Code
  • 2023-STORM: Efficient Stochastic Transformer based World Models for Reinforcement Learning NeurIPS 2023 Paper, Code
  • 2023-Facing Off World Model Backbones: RNNs, Transformers, and S4 NeurIPS 2023 Paper
  • 2023-I-JEPA: Self-Supervised Learning from Images with a Joint-Embedding Predictive Architecture CVPR 2023; from Yann LeCun's Group Paper, Code
  • 2023-Temporally Consistent Transformers for Video Generation ICML 2023 Paper, Code
  • 2023-Learning to Model the World with Language arxiv Paper, Code
  • 2023-Transformers are sample-efficient world models ICLR 2023; RL Paper, Code
  • 2023-Gradient-based Planning with World Models arxiv; from Yann LeCun's Group; Planning; Paper
  • 2023-World Models via Policy-Guided Trajectory Diffusion arxiv; RL; Paper
  • 2023-DreamerV3: Mastering diverse domains through world models arxiv; RL; Paper, Code
  • 2022-Daydreamer: World models for physical robot learning CoRL 2022; Robotics Paper, Code
  • 2022-Masked World Models for Visual Control CoRL 2022; Robotics Paper, Code
  • 2022-A Path Towards Autonomous Machine Intelligence openreview; from Yann LeCun's Group; General Roadmap for World Models; Paper; Slides1, Slides2, Slides3; Videos
  • 2021-LEXA: Discovering and Achieving Goals via World Models NeurIPS 2021; Paper, Website & Code
  • 2021-DreamerV2: Mastering Atari with Discrete World Models ICLR 2021; RL; from Google & DeepMind Paper, Code
  • 2020-Dreamer: Dream to Control: Learning Behaviors by Latent Imagination ICLR 2020 Paper, Code
  • 2019-Learning Latent Dynamics for Planning from Pixels ICML 2019 Paper, Code
  • 2018-Model-Based Planning with Discrete and Continuous Actions arxiv; RL, Planning; from Yann LeCun's Group; Paper
  • 2018-Recurrent world models facilitate policy evolution NeurIPS 2018; Paper, Code

Other Related Papers

  • 2023-Occupancy Prediction-Guided Neural Planner for Autonomous Driving ITSC 2023; Planning, Neural Predicted-Guided Planning; Waymo Open Motion dataset Paper

Other Related Repos

Awesome-World-Model, Awesome-World-Models-for-AD, World models paper list from Shanghai AI Lab, Awesome-Papers-World-Models-Autonomous-Driving.
