DriveX 🚗

Workshop on Foundation Models for

V2X-Based Cooperative Autonomous Driving

In conjunction with CVPR 2025, June 11-15 in Nashville, USA

Introduction

The DriveX Workshop explores the integration of foundation models and V2X-based cooperative systems to improve perception, planning, and decision-making in autonomous vehicles. While traditional single-vehicle systems have advanced tasks such as 3D object detection, emerging challenges such as holistic scene understanding and 3D occupancy prediction require more comprehensive solutions. Collaborative driving systems, which use V2X communication and roadside infrastructure, extend the sensory range, provide early hazard warnings, and improve decision-making through shared data. At the same time, foundation models such as Vision-Language Models (VLMs) offer strong generalization abilities, enabling zero-shot learning, open-vocabulary recognition, and scene explanation in novel scenarios. Recent advances in end-to-end systems and driving-oriented foundation models such as DriveLLM further enhance autonomous systems. The workshop aims to bring together experts to explore these technologies, address open challenges, and advance road safety.

Topics

  • Foundation Models for Cooperative Autonomous Driving and Intelligent Transportation Systems
  • Vision-Language Models (VLMs) for Traffic Scene Understanding
  • Large Language Model (LLM)-assisted Cooperative Systems
  • Communication-Efficient Cooperative Perception for Autonomous Vehicles
  • Efficient and Intelligent Vehicle-to-everything (V2X) Communication
  • Dataset Curation and Data Labeling for Autonomous Driving
  • Datasets and Benchmarks for Foundation Models and Cooperative Perception
  • 3D Object Detection and Semantic Segmentation of Vulnerable Road Users (VRUs)
  • 3D Occupancy Prediction and Scene Understanding
  • End-to-end Perception and Real-time Decision-Making Systems
  • Vehicle-to-Infrastructure (V2I) Interaction
  • Safety and Standards in Autonomous Driving

Schedule

08:20 - 08:30 Opening Remarks (Welcome & Introduction)
08:30 - 08:50 Keynote 1: Prof. Dr. Wolfram Burgard (Uni. of Technology Nuremberg, Germany)
08:50 - 09:10 Keynote 2: Prof. Dr. Daniel Cremers (Technical Uni. of Munich, Germany)
09:10 - 09:30 Keynote 3: Dr. Mingxing Tan (Waymo, USA)
09:30 - 09:50 Keynote 4: Prof. Dr. Philipp Krähenbühl (University of Texas at Austin)
09:50 - 10:10 Keynote 5: (remote) Prof. Dr. Cyrill Stachniss (University of Bonn, Germany)
10:10 - 10:20 Poster Session & Coffee Break
10:20 - 10:40 Keynote 6: Prof. Dr. Trevor Darrell (University of California Berkeley, USA)
10:40 - 11:00 Keynote 7: Prof. Dr. Marco Pavone (NVIDIA & Stanford University, USA)
11:00 - 11:20 Keynote 8: Prof. Dr. Laura Leal-Taixé (NVIDIA & Technical University of Munich, Italy/Germany)
11:20 - 12:00 Panel Discussion I
12:00 - 13:00 Lunch Break
13:00 - 13:20 Keynote 9: Prof. Dr. Angela Dai (Technical University of Munich, Germany)
13:20 - 13:40 Keynote 10: Prof. Dr. Cheng Feng (New York University, USA)
13:40 - 14:00 Keynote 11: Prof. Dr. Jiaqi Ma (University of California Los Angeles, USA)
14:00 - 14:20 Keynote 12: Prof. Dr. Long Cheng (Wayve, UK)
14:20 - 14:40 Keynote 13: Prof. Dr. Manabu Tsukada (University of Tokyo, Japan)
14:40 - 15:00 Keynote 14: Katie Luo (Cornell University, USA)
15:00 - 15:20 Keynote 15: Maria Lyssenko (BOSCH & Technical University of Munich, Germany)
15:20 - 16:00 Panel Discussion II
16:00 - 16:10 Poster Session II & Coffee Break
16:10 - 16:20 Oral Paper Presentation 1
16:20 - 16:30 Oral Paper Presentation 2
16:30 - 16:40 Oral Paper Presentation 3
16:40 - 16:50 Oral Paper Presentation 4
16:50 - 17:00 Oral Paper Presentation 5
17:00 - 17:15 Poster Session III & Coffee Break
17:15 - 17:30 Best Paper Presentations & Best Paper Awards
17:30 - 17:45 Competition Winner Presentation & Competition Awards
17:45 - 18:00 Closing Remarks, Group Picture & Networking

Keynote Speakers


Paper Track

We accept novel full 8-page papers for publication in the proceedings, as well as shorter 4-page extended abstracts or 8-page papers of novel or previously published work that will not be included in the proceedings. Full papers must use the official LaTeX or Typst CVPR 2025 template. Extended abstracts are not subject to the CVPR formatting rules and may use any template; however, to avoid being considered a publication under double-submission policies, they should not exceed 4 pages in the CVPR template format.

Paper Awards

Challenge

We host a challenge based on the TUMTraf V2X Cooperative Perception Dataset (CVPR'24), which provides high-quality, real-world V2X perception data for the cooperative 3D object detection and tracking task in autonomous driving. The dataset is publicly available, and we provide a dataset development kit for working with it.

Competition Timeline:
The best-performing teams will be invited to present their solutions during the workshop, and the winners will receive prizes and recognition for their contributions to the field.

Challenge Awards

Organizers

Sponsors

We sincerely thank Qualcomm for their generous sponsorship of our workshop. Their support enables us to recognize outstanding research through prestigious awards.
We are currently seeking further sponsorship opportunities and would be delighted to discuss potential collaborations. Interested parties are kindly requested to contact us via email at walter.zimmer@cs.tum.edu for further details.