A curated list of world models for autonomous driving, kept up to date.
In addition to the papers listed below, our group, the NYU Learning Systems Laboratory, recently released a preprint: AD-L-JEPA: Self-Supervised Spatial World Models with Joint Embedding Predictive Architecture for Autonomous Driving with LiDAR Data. It is the first spatial world model based on a joint-embedding predictive architecture (JEPA) for self-supervised representation learning of autonomous driving scenarios with LiDAR data. We will release the source code, pretrained models (both self-supervised and supervised, except for models trained on the Waymo dataset, due to Waymo's licensing constraints), and training logs at AD-L-JEPA-Release by the end of January 2025. Please stay tuned for more updates!
- 2025-AD-L-JEPA: Self-Supervised Spatial World Models with Joint Embedding Predictive Architecture for Autonomous Driving with LiDAR Data; arXiv; Pre-training; Self-supervised representation learning; Paper, Code to be released
- 2025-Cosmos World Foundation Model Platform for Physical AI; Paper, Code
- 2024-DrivingWorld: Constructing World Model for Autonomous Driving via Video GPT; Paper, Project Page, Code
- 2024-DOME: Taming Diffusion Model into High-Fidelity Controllable Occupancy World Model; Paper, Project Page
- 2024-DriveDreamer4D: World Models Are Effective Data Machines for 4D Driving Scene Representation; Paper
- 2024-DrivingDojo Dataset: Advancing Interactive and Knowledge-Enriched Driving World Model; Paper; Dataset
- 2024-Mitigating Covariate Shift in Imitation Learning for Autonomous Vehicles Using Latent Space Generative World Models; Paper; Planning
- 2024-OccLLaMA: An Occupancy-Language-Action Generative World Model for Autonomous Driving; Paper
- 2024-Drive-OccWorld: Driving in the Occupancy World: Vision-Centric 4D Occupancy Forecasting and Planning via World Models for Autonomous Driving; Paper
- 2024-CarFormer: Self-Driving with Learned Object-Centric Representations; ECCV 2024; Paper
- 2024-BEVWorld: A Multimodal World Model for Autonomous Driving via Unified BEV Latent Space; arXiv; Paper
- 2024-Planning with Adaptive World Models for Autonomous Driving; arXiv; Planning; Paper
- 2024-UnO: Unsupervised Occupancy Fields for Perception and Forecasting; Paper
- 2024-LAW: Enhancing End-to-End Autonomous Driving with Latent World Model; Paper
- 2024-OccSora: 4D Occupancy Generation Models as World Simulators for Autonomous Driving; Paper, Code
- 2024-Delphi: Unleashing Generalization of End-to-End Autonomous Driving with Controllable Long Video Generation; Paper
- 2024-Vista: A Generalizable Driving World Model with High Fidelity and Versatile Controllability; from Shanghai AI Lab; Paper
- 2024-DriveWorld: 4D Pre-trained Scene Understanding via World Models for Autonomous Driving; CVPR 2024; Paper
- 2024-UniPAD: A Universal Pre-training Paradigm for Autonomous Driving; CVPR 2024; from Shanghai AI Lab; Paper, Code
- 2024-GenAD: Generalized Predictive Model for Autonomous Driving; CVPR 2024; from Shanghai AI Lab; Paper
- 2024-Think2Drive: Efficient Reinforcement Learning by Thinking in Latent World Model for Quasi-Realistic Autonomous Driving; arXiv; Paper
- 2024-ViDAR: Visual Point Cloud Forecasting enables Scalable Autonomous Driving; CVPR 2024; Pre-training; from Shanghai AI Lab; nuScenes dataset; Paper, Code
- 2024-Copilot4D: Learning Unsupervised World Models for Autonomous Driving via Discrete Diffusion; ICLR 2024; Future Prediction; from Waabi; nuScenes, KITTI Odometry, Argoverse2 Lidar datasets; Paper
- 2023-DrivingDiffusion: Layout-Guided multi-view driving scene video generation with latent diffusion model; arXiv; Generative AI; Paper, Code
- 2023-MUVO: A Multimodal Generative World Model for Autonomous Driving with Geometric Representations; arXiv; Pre-training; CARLA dataset; Paper
- 2023-Driving into the Future: Multiview Visual Forecasting and Planning with World Model for Autonomous Driving; arXiv; Generative AI, Planning; nuScenes and Waymo datasets; Paper
- 2023-ADriver-I: A General World Model for Autonomous Driving; arXiv; Generative AI; nuScenes & one private dataset; Paper
- 2023-OccWorld: Learning a 3D Occupancy World Model for Autonomous Driving; arXiv; Occupancy Future Prediction, Planning; Occ3D dataset for Occupancy Future Prediction, nuScenes for motion planning; Paper, Code
- 2023-GAIA-1: A Generative World Model for Autonomous Driving; arXiv; Generative AI; Wayve's private data; Paper
  - Related papers & tutorials to understand this paper:
    - FDM for video diffusion decoder: Paper, Code
    - Denoising diffusion tutorials: CVPR 2022 tutorial, class from UC Berkeley, Video
- 2023-DriveDreamer: Towards Real-world-driven World Models for Autonomous Driving; arXiv; Generative AI; nuScenes dataset; Paper, Code (To be released soon)
- 2023-Neural World Models for Computer Vision; PhD Thesis; from Wayve; Paper
- 2023-UniWorld: Autonomous Driving Pre-training via World Models; arXiv; Pre-training; nuScenes dataset; Paper
- 2022-Separating the World and Ego Models for Self-Driving; ICLR 2022 workshop on Generalizable Policy Learning in the Physical World; from Yann LeCun's Group; Paper, Code
- 2022-SEM2: Enhance Sample Efficiency and Robustness of End-to-end Urban Autonomous Driving via Semantic Masked World Model; NeurIPS 2022 Deep Reinforcement Learning Workshop; RL; CARLA dataset; Paper
- 2022-MILE: Model-Based Imitation Learning for Urban Driving; NeurIPS 2022; RL; from Wayve; Paper, Code
- 2022-Iso-Dream: Isolating and Leveraging Noncontrollable Visual Dynamics in World Models; NeurIPS 2022; Paper, Code
- 2021-FIERY: Future Instance Prediction in Bird's-Eye View from Surround Monocular Cameras; ICCV 2021; Future Prediction; from Wayve; nuScenes, Lyft datasets; Paper, Code
- 2021-Learning to drive from a world on rails; ICCV 2021 Oral; RL; Paper, Project Page, Code
- 2019-Model-Predictive Policy Learning with Uncertainty Regularization for Driving in Dense Traffic; ICLR 2019; Future Prediction; from Yann LeCun's Group; Paper, Code
- 2024-1X World Model Challenge; Challenges; Link
- 2024-CVPR Workshop, Foundation Models for Autonomous Systems, Challenges, Track 4: Predictive World Model; Challenges; Link
- 2025-A Survey of World Models for Autonomous Driving; arXiv; Paper
- 2024-World Models for Autonomous Driving: An Initial Survey; arXiv; Paper
- 2024-Data-Centric Evolution in Autonomous Driving: A Comprehensive Survey of Big Data System, Data Mining, and Closed-Loop Technologies; arXiv; Paper
- 2024-Forging Vision Foundation Models for Autonomous Driving: Challenges, Methodologies, and Opportunities; arXiv; Paper
- 2025-Do generative video models learn physical principles from watching videos? Paper, Code, Website
- 2024-Genie2: Website
- 2024-WHALE: Towards Generalizable and Scalable World Models for Embodied Decision-making; Paper
- 2024-How Far is Video Generation from World Model: A Physical Law Perspective; Paper
- 2024-PIVOT-R: Primitive-Driven Waypoint-Aware World Model for Robotic Manipulation; NeurIPS 2024; Paper
- 2024-RoboDreamer: Learning Compositional World Models for Robot Imagination; Paper
- 2024-TD-MPC2: Scalable, Robust World Models for Continuous Control; ICLR 2024; Paper
- 2024-Hierarchical World Models as Visual Whole-Body Humanoid Controllers; Paper
- 2024-Pandora: Towards General World Model with Natural Language Actions and Video States; Paper
- 2024-Efficient World Models with Time-Aware and Context-Augmented Tokenization; ICML 2024
- 2024-3D-VLA: A 3D Vision-Language-Action Generative World Model; ICML 2024; Paper
- 2024-Newton from Archetype AI; website; Link
- 2024-MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators; arXiv; Paper, Code
- 2024-IWM: Learning and Leveraging World Models in Visual Representation Learning; arXiv; from Yann LeCun's Group; Paper
- 2024-Video as the New Language for Real-World Decision Making; arXiv; DeepMind; Paper
- 2024-Genie: Generative Interactive Environments; DeepMind; Paper, Website
- 2024-Sora; OpenAI; Generative AI; Link, Technical Report
- 2024-LWM: World Model on Million-Length Video And Language With RingAttention; arXiv; Generative AI; Paper, Code
- 2024-WorldDreamer: Towards General World Models for Video Generation via Predicting Masked Tokens; arXiv; Generative AI; Paper
- 2023-Video prediction models as rewards for reinforcement learning; NeurIPS 2023; Paper, Code
- 2024-V-JEPA: Revisiting Feature Prediction for Learning Visual Representations from Video; from Yann LeCun's Group; Paper, Code
- 2023-STORM: Efficient Stochastic Transformer based World Models for Reinforcement Learning; NeurIPS 2023; Paper, Code
- 2023-Facing Off World Model Backbones: RNNs, Transformers, and S4; NeurIPS 2023; Paper
- 2023-I-JEPA: Self-Supervised Learning from Images with a Joint-Embedding Predictive Architecture; CVPR 2023; from Yann LeCun's Group; Paper, Code
- 2023-Temporally Consistent Transformers for Video Generation; ICML 2023; Paper, Code
- 2023-Learning to Model the World with Language; arXiv; Paper, Code
- 2023-Transformers are sample-efficient world models; ICLR 2023; RL; Paper, Code
- 2023-Gradient-based Planning with World Models; arXiv; from Yann LeCun's Group; Planning; Paper
- 2023-World Models via Policy-Guided Trajectory Diffusion; arXiv; RL; Paper
- 2023-DreamerV3: Mastering diverse domains through world models; arXiv; RL; Paper, Code
- 2022-Daydreamer: World models for physical robot learning; CoRL 2022; Robotics; Paper, Code
- 2022-Masked World Models for Visual Control; CoRL 2022; Robotics; Paper, Code
- 2022-A Path Towards Autonomous Machine Intelligence; openreview; from Yann LeCun's Group; General Roadmap for World Models; Paper; Slides1, Slides2, Slides3; Videos
- 2021-LEXA: Discovering and Achieving Goals via World Models; NeurIPS 2021; Paper, Website & Code
- 2021-DreamerV2: Mastering Atari with Discrete World Models; ICLR 2021; RL; from Google & DeepMind; Paper, Code
- 2020-Dreamer: Dream to Control: Learning Behaviors by Latent Imagination; ICLR 2020; Paper, Code
- 2019-Learning Latent Dynamics for Planning from Pixels; ICML 2019; Paper, Code
- 2018-Model-Based Planning with Discrete and Continuous Actions; arXiv; RL, Planning; from Yann LeCun's Group; Paper
- 2018-Recurrent world models facilitate policy evolution; NeurIPS 2018; Paper, Code
- 2023-Occupancy Prediction-Guided Neural Planner for Autonomous Driving; ITSC 2023; Planning, Neural Prediction-Guided Planning; Waymo Open Motion dataset; Paper
Awesome-World-Model, Awesome-World-Models-for-AD, World models paper list from Shanghai AI Lab, Awesome-Papers-World-Models-Autonomous-Driving.