A premier forum uniting academic, industry, and standards communities to explore advances in Foundation Models and 3D Perception in Cooperative Autonomous Driving (CAD).
The 7th edition of the full-day DriveX workshop brings together leading researchers and practitioners to discuss cutting-edge developments in large language models (LLMs), vision-language models (VLMs), and vision-language-action models (VLAs), and their applications to autonomous driving systems. Topics include 3D object detection, semantic segmentation, sensor fusion, V2X communication, and cooperative perception.
We explore methods to enhance scene understanding, perception accuracy, dataset curation, and novelty detection. By uniting experts across the perception, V2X, and foundation model domains, the workshop aims to foster innovation in cooperative autonomous driving and intelligent transportation systems, addressing critical challenges in multi-modal sensor fusion and vehicle-infrastructure coordination that leverage both onboard and roadside sensing capabilities.
This year, we expand our focus with the addition of V2X applications, exploring real-world vehicle-to-infrastructure connectivity that extends beyond collaborative perception. The workshop provides a platform for discussing V2X for localization, tolling, road safety, monitoring, and data analytics, bridging the gap between theoretical advances and practical deployment in intelligent transportation systems. Through keynote presentations, panel discussions, paper presentations, and challenge tracks, DriveX 2026 creates a comprehensive forum for advancing the state of the art in foundation model-driven cooperative autonomous driving.
| Start | End | Program | Speaker | Affiliation |
|---|---|---|---|---|
| 09:00 | 09:10 | Introduction | | |
| 09:10 | 09:30 | Keynote Presentation 1 | | |
| 09:30 | 09:50 | Keynote Presentation 2 | | |
| 09:50 | 10:10 | Keynote Presentation 3 | | |
| 10:10 | 10:25 | Coffee Break | | |
| 10:25 | 10:45 | Keynote Presentation 4 | | |
| 10:45 | 11:05 | Keynote Presentation 5 | | |
| 11:05 | 11:25 | Keynote Presentation 6 | | |
| 11:25 | 12:00 | Academic Panel Discussion | | |
| 12:00 | 13:00 | Lunch | | |
| 13:00 | 13:20 | Keynote Presentation 7 | | |
| 13:20 | 13:40 | Keynote Presentation 8 | | |
| 13:40 | 14:00 | Keynote Presentation 9 | | |
| 14:00 | 14:20 | Keynote Presentation 10 | | |
| 14:20 | 15:00 | Industry Panel Discussion | | |
| 15:00 | 15:15 | Coffee Break & Poster Session | | |
| 15:15 | 17:15 | Paper Presentations (Oral) | | |
| 17:15 | 17:30 | Competition Winner Presentation & Awards Ceremony | | |
| 17:35 | 17:45 | Best Paper Presentation & Awards Ceremony | | |
| 17:45 | 17:55 | Final Remarks & Summary | | |
| 17:55 | 18:00 | Group Photo | | |
| 19:00 | 21:00 | Social Mixer (Networking & Dinner) | | |
Final schedule, room allocation, and speaker order will be announced closer to the workshop date.
V2I-Based Cooperative Perception
Infrastructure–vehicle fusion using TUMTraf-V2X. This track focuses on cooperative 3D detection and tracking with infrastructure-mounted LiDAR, radar, and cameras, emphasizing occlusion handling, long-range awareness, and reliability under real-world conditions.
Natural Language Instruction for Human Interaction and Vision-Language Navigation
Built upon doScenes, participants design models for natural language instruction following and vision-language navigation, facilitating research on human-vehicle instruction interactions.
Multi-Agent Reasoning
Using MDrive, teams explore a cooperative driving benchmark for end-to-end, closed-loop multi-agent systems.
Competition Timeline
Top-performing teams will be invited to present at the workshop and will receive cash prizes ($100) and award certificates. Detailed rules, baselines, and submission instructions are available on the official challenge page.
University of California, Riverside
Technical University of Munich
University of California, Los Angeles
The University of Hong Kong
The University of Sydney
The University of Sydney
The University of Sydney
Waymo
University of California, Los Angeles
DriveX 2026 welcomes sponsorship from industry, startups, and institutions interested in foundation models, cooperative perception, simulation, and large-scale autonomous driving systems.
For sponsorship opportunities, please contact: wz@ucla.edu.