World-Models-Autonomous-Driving-Latest-Survey

A curated list of world models for autonomous driving, kept up to date.

Announcement

In addition to the papers listed below, we are happy to announce that our group, the NYU Learning Systems Laboratory, recently released a preprint, AD-L-JEPA: Self-Supervised Spatial World Models with Joint Embedding Predictive Architecture for Autonomous Driving with LiDAR Data. It is the first joint-embedding predictive architecture (JEPA) based spatial world model for self-supervised representation learning of autonomous driving scenarios with LiDAR data. We will release the source code, pretrained models (both self-supervised and supervised, except for models trained on the Waymo dataset, due to Waymo's licensing constraints), and training logs at AD-L-JEPA-Release by the end of January 2025. Please stay tuned for more updates!

Papers

2025-AD-L-JEPA: Self-Supervised Spatial World Models with Joint Embedding Predictive Architecture for Autonomous Driving with LiDAR Data. arxiv; Pre-training; Self-supervised representation learning; Paper, Code to be released
2025-Cosmos World Foundation Model Platform for Physical AI Paper, Code
2024-DrivingWorld: Constructing World Model for Autonomous Driving via Video GPT. Paper Project Page Code
2024-DOME: Taming Diffusion Model into High-Fidelity Controllable Occupancy World Model. Paper Project Page
2024-DriveDreamer4D: World Models Are Effective Data Machines for 4D Driving Scene Representation Paper
2024-DrivingDojo Dataset: Advancing Interactive and Knowledge-Enriched Driving World Model Paper Dataset
2024-Mitigating Covariate Shift in Imitation Learning for Autonomous Vehicles Using Latent Space Generative World Models Paper Planning
2024-OccLLaMA: An Occupancy-Language-Action Generative World Model for Autonomous Driving Paper
2024-Drive-OccWorld: Driving in the Occupancy World: Vision-Centric 4D Occupancy Forecasting and Planning via World Models for Autonomous Driving Paper
2024-CarFormer: Self-Driving with Learned Object-Centric Representations ECCV 2024 Paper
2024-BEVWorld: A Multimodal World Model for Autonomous Driving via Unified BEV Latent Space arxiv Paper
2024-Planning with Adaptive World Models for Autonomous Driving arxiv; Planning; Paper
2024-UnO: Unsupervised Occupancy Fields for Perception and Forecasting Paper
2024-LAW: Enhancing End-to-End Autonomous Driving with Latent World Model Paper
2024-OccSora: 4D Occupancy Generation Models as World Simulators for Autonomous Driving Paper, Code
2024-Delphi: Unleashing Generalization of End-to-End Autonomous Driving with Controllable Long Video Generation Paper
2024-Vista: A Generalizable Driving World Model with High Fidelity and Versatile Controllability from Shanghai AI Lab Paper
2024-DriveWorld: 4D Pre-trained Scene Understanding via World Models for Autonomous Driving CVPR 2024; Paper
2024-UniPAD: A Universal Pre-training Paradigm for Autonomous Driving CVPR 2024; from Shanghai AI Lab Paper, Code
2024-GenAD: Generalized Predictive Model for Autonomous Driving CVPR 2024; from Shanghai AI Lab Paper
2024-Think2Drive: Efficient Reinforcement Learning by Thinking in Latent World Model for Quasi-Realistic Autonomous Driving arxiv Paper
2024-ViDAR: Visual Point Cloud Forecasting enables Scalable Autonomous Driving CVPR 2024; Pre-training; from Shanghai AI Lab; NuScenes dataset Paper, Code
2024-Copilot4D: Learning Unsupervised World Models for Autonomous Driving via Discrete Diffusion ICLR 2024; Future Prediction; from Waabi; NuScenes, KITTI Odometry, Argoverse2 Lidar datasets Paper
2023-DrivingDiffusion: Layout-Guided multi-view driving scene video generation with latent diffusion model arxiv; Generative AI Paper, Code
2023-MUVO: A Multimodal Generative World Model for Autonomous Driving with Geometric Representations arxiv; Pre-training; CARLA dataset Paper
2023-Driving into the Future: Multiview Visual Forecasting and Planning with World Model for Autonomous Driving arxiv; Generative AI, Planning; NuScenes and Waymo datasets Paper
2023-ADriver-I: A General World Model for Autonomous Driving arxiv; Generative AI; NuScenes & one private dataset Paper
2023-OccWorld: Learning a 3D Occupancy World Model for Autonomous Driving arxiv; Occupancy Future Prediction, Planning; Occ3D dataset for Occupancy Future Prediction, NuScenes for motion planning Paper, Code
2023-GAIA-1: A Generative World Model for Autonomous Driving arxiv; Generative AI; Wayve's private data Paper

Related papers and tutorials for understanding GAIA-1:
FDM for video diffusion decoder: Paper, Code

Denoising diffusion tutorials: CVPR 2022 tutorial, class from UC Berkeley, Video
2023-DriveDreamer: Towards Real-world-driven World Models for Autonomous Driving arxiv; Generative AI; NuScenes dataset Paper, Code (To be released soon)
2023-Neural World Models for Computer Vision 'PhD Thesis'; from Wayve Paper
2023-UniWorld: Autonomous Driving Pre-training via World Models arxiv; Pre-training; NuScenes dataset Paper
2022-Separating the World and Ego Models for Self-Driving ICLR 2022 workshop on Generalizable Policy Learning in the Physical World; from Yann LeCun's Group Paper, Code
2022-SEM2: Enhance Sample Efficiency and Robustness of End-to-end Urban Autonomous Driving via Semantic Masked World Model NeurIPS 2022 Deep Reinforcement Learning Workshop; RL; CARLA dataset Paper
2022-MILE: Model-Based Imitation Learning for Urban Driving NeurIPS 2022; RL; from Wayve Paper, Code
2022-Iso-Dream: Isolating and Leveraging Noncontrollable Visual Dynamics in World Models NeurIPS 2022 Paper, Code
2021-FIERY: Future Instance Prediction in Bird's-Eye View from Surround Monocular Cameras ICCV 2021; Future Prediction; from Wayve; NuScenes, Lyft datasets Paper, Code
2021-Learning to drive from a world on rails CVPR 2021 Oral; RL Paper, Project Page, Code
2019-Model-Predictive Policy Learning with Uncertainty Regularization for Driving in Dense Traffic ICLR 2019; Future Prediction; from Yann LeCun's Group Paper, Code

Workshops/Challenges

2024-1X World Model Challenge Challenges Link
2024-CVPR Workshop, Foundation Models for Autonomous Systems, Challenges, Track 4: Predictive World Model Challenges Link

Tutorials/Talks

2023 from Wayve; Video
2022-Neural World Models for Autonomous Driving Video

Surveys that Contain World Models for AD

2025-A Survey of World Models for Autonomous Driving arxiv Paper
2024-World Models for Autonomous Driving: An Initial Survey arxiv Paper
2024-Data-Centric Evolution in Autonomous Driving: A Comprehensive Survey of Big Data System, Data Mining, and Closed-Loop Technologies arxiv Paper
2024-Forging Vision Foundation Models for Autonomous Driving: Challenges, Methodologies, and Opportunities arxiv Paper

Other General World Model Papers

2025-Do generative video models learn physical principles from watching videos? Paper, Code, Website
2024-Genie2: Website
2024-WHALE: Towards Generalizable and Scalable World Models for Embodied Decision-making Paper
2024-How Far is Video Generation from World Model: A Physical Law Perspective Paper
2024-PIVOT-R: Primitive-Driven Waypoint-Aware World Model for Robotic Manipulation NeurIPS 2024 Paper
2024-RoboDreamer: Learning Compositional World Models for Robot Imagination Paper
2024-TD-MPC2: Scalable, Robust World Models for Continuous Control ICLR 2024 Paper
2024-Hierarchical World Models as Visual Whole-Body Humanoid Controllers Paper
2024-Pandora: Towards General World Model with Natural Language Actions and Video States Paper
2024-Efficient World Models with Time-Aware and Context-Augmented Tokenization ICML 2024
2024-3D-VLA: A 3D Vision-Language-Action Generative World Model ICML 2024 Paper
2024-Newton from Archetype AI website Link
2024-MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators arxiv Paper, Code
2024-IWM: Learning and Leveraging World Models in Visual Representation Learning arxiv; from Yann LeCun's Group Paper
2024-Video as the New Language for Real-World Decision Making arxiv; from DeepMind Paper
2024-Genie: Generative Interactive Environments from DeepMind Paper, Website
2024-Sora OpenAI, Generative AI Link, Technical Report
2024-LWM: World Model on Million-Length Video And Language With RingAttention arxiv; Generative AI Paper, Code
2024-WorldDreamer: Towards General World Models for Video Generation via Predicting Masked Tokens arxiv; Generative AI Paper
2024-Video prediction models as rewards for reinforcement learning NeurIPS 2024 Paper, Code
2024-V-JEPA: Revisiting Feature Prediction for Learning Visual Representations from Video from Yann LeCun's Group Paper, Code
2023-STORM: Efficient Stochastic Transformer based World Models for Reinforcement Learning NeurIPS 2023 Paper, Code
2023-Facing Off World Model Backbones: RNNs, Transformers, and S4 NeurIPS 2023 Paper
2023-I-JEPA: Self-Supervised Learning from Images with a Joint-Embedding Predictive Architecture CVPR 2023; from Yann LeCun's Group Paper, Code
2023-Temporally Consistent Transformers for Video Generation ICML 2023 Paper, Code
2023-Learning to Model the World with Language arxiv Paper, Code
2023-Transformers are sample-efficient world models ICLR 2023; RL Paper, Code
2023-Gradient-based Planning with World Models arxiv; from Yann LeCun's Group; Planning; Paper
2023-World Models via Policy-Guided Trajectory Diffusion arxiv; RL; Paper
2023-DreamerV3: Mastering diverse domains through world models arxiv; RL; Paper, Code
2022-Daydreamer: World models for physical robot learning CoRL 2022; Robotics Paper, Code
2022-Masked World Models for Visual Control CoRL 2022; Robotics Paper, Code
2022-A Path Towards Autonomous Machine Intelligence openreview; from Yann LeCun's Group; General Roadmap for World Models; Paper; Slides1, Slides2, Slides3; Videos
2021-LEXA: Discovering and Achieving Goals via World Models NeurIPS 2021; Paper, Website & Code
2021-DreamerV2: Mastering Atari with Discrete World Models ICLR 2021; RL; from Google & DeepMind Paper, Code
2020-Dreamer: Dream to Control: Learning Behaviors by Latent Imagination ICLR 2020 Paper, Code
2019-Learning Latent Dynamics for Planning from Pixels ICML 2019 Paper, Code
2018-Model-Based Planning with Discrete and Continuous Actions arxiv; RL, Planning; from Yann LeCun's Group; Paper
2018-Recurrent world models facilitate policy evolution NeurIPS 2018; Paper, Code

Other Related Papers

2023-Occupancy Prediction-Guided Neural Planner for Autonomous Driving ITSC 2023; Planning, Neural Predicted-Guided Planning; Waymo Open Motion dataset Paper

Other Related Repos

Awesome-World-Model, Awesome-World-Models-for-AD, World models paper list from Shanghai AI Lab, Awesome-Papers-World-Models-Autonomous-Driving.
