DriveX 🚗

Workshop on Foundation Models for

V2X-Based Cooperative Autonomous Driving

In conjunction with CVPR 2025, June 11-15 in Nashville, USA

Introduction

The DriveX Workshop explores how foundation models and V2X-based cooperative systems can be combined to improve perception, planning, and decision-making in autonomous vehicles. While traditional single-vehicle systems have advanced tasks such as 3D object detection, emerging challenges such as holistic scene understanding and 3D occupancy prediction demand more comprehensive solutions. Collaborative driving systems, built on V2X communication and roadside infrastructure, extend the sensory range of individual vehicles, provide early hazard warnings, and improve decision-making through shared data. At the same time, foundation models such as Vision-Language Models (VLMs) offer strong generalization, enabling zero-shot learning, open-vocabulary recognition, and scene explanation in novel scenarios. Recent advances in end-to-end systems and driving-specific foundation models such as DriveLLM further strengthen autonomous systems. The workshop brings together experts to explore these technologies, address open challenges, and advance road safety.

Topics

  • Foundation Models for Cooperative Autonomous Driving and Intelligent Transportation Systems
  • Vision-Language Models (VLMs) for Traffic Scene Understanding
  • Large Language Model (LLM)-assisted Cooperative Systems
  • Communication-Efficient Cooperative Perception for Autonomous Vehicles
  • Efficient and Intelligent Vehicle-to-everything (V2X) Communication
  • Dataset Curation and Data Labeling for Autonomous Driving
  • Datasets and Benchmarks for Foundation Models and Cooperative Perception
  • 3D Object Detection and Semantic Segmentation of Vulnerable Road Users (VRUs)
  • 3D Occupancy Prediction and Scene Understanding
  • End-to-end Perception and Real-time Decision-Making Systems
  • Vehicle-to-Infrastructure (V2I) Interaction
  • Safety and Standards in Autonomous Driving

Schedule

08:20 - 08:30 Opening Remarks (Welcome & Introduction)
08:30 - 08:50 Keynote 1: Maria Lyssenko (Technical Uni. of Munich, TUM & BOSCH, Germany)
08:50 - 09:10 Keynote 2: Prof. Dr. Jiaqi Ma (Uni. of California, Los Angeles, UCLA, USA)
09:10 - 09:30 Keynote 3: Dr. Mingxing Tan (Waymo, USA)
09:30 - 09:50 Keynote 4: Prof. Dr. Philipp Krähenbühl (University of Texas at Austin)
09:50 - 10:10 Keynote 5: (remote) Prof. Dr. Cyrill Stachniss (University of Bonn, Germany)
10:10 - 10:20 Poster Session & Coffee Break
10:20 - 10:40 Keynote 6: Prof. Dr. Trevor Darrell (University of California Berkeley, USA)
10:40 - 11:00 Keynote 7: Prof. Dr. Marco Pavone (NVIDIA & Stanford University, USA)
11:00 - 11:20 Keynote 8: Prof. Dr. Laura Leal-Taixé (NVIDIA & Technical University of Munich, Italy/Germany)
11:20 - 12:00 Panel Discussion I
12:00 - 13:00 Lunch Break
13:00 - 13:20 Keynote 9: Prof. Dr. Angela Dai (Technical University of Munich, Germany)
13:20 - 13:40 Keynote 10: Prof. Dr. Chen Feng (New York University, USA)
13:40 - 14:00 Keynote 11: Prof. Dr. Wolfram Burgard (Uni. of Technology Nuremberg, UTN, Germany)
14:00 - 14:20 Keynote 12: Dr. Long Chen (Wayve, UK)
14:20 - 14:40 Keynote 13: Prof. Dr. Manabu Tsukada (University of Tokyo, Japan)
14:40 - 15:00 Keynote 14: Katie Luo (Cornell University, USA)
15:00 - 15:20 Keynote 15: Prof. Dr. Daniel Cremers (Technical Uni. of Munich, TUM, Germany)
15:20 - 15:40 Keynote 16: Prof. Dr. Manmohan Chandraker (University of California San Diego, UCSD, USA)
15:40 - 16:20 Panel Discussion II
16:20 - 16:30 Oral Paper Presentation 1
16:30 - 16:40 Oral Paper Presentation 2
16:40 - 16:50 Oral Paper Presentation 3
16:50 - 17:00 Oral Paper Presentation 4
17:00 - 17:10 Best Paper Presentation & Best Paper Awards
17:10 - 17:20 Competition Winner Presentation & Competition Awards
17:20 - 17:30 Group Picture (all Organizers and Speakers)
17:30 - 18:00 Poster Session
18:00 - 20:00 Social Mixer, Networking, Dinner (Location will be announced during the workshop)

Keynote Speakers

Paper Track

We accept novel papers (8 pages excluding references) for publication in the proceedings, as well as shorter extended abstracts (4 pages excluding references). Full papers should use the official CVPR 2025 LaTeX or Typst template; we recommend using the same template for extended abstracts. Submission Portal: OpenReview
  • Submission Opens: February 26, 2025
  • Paper Abstract Submission Deadline: March 26, 2025 (23:59 GMT) (extended from March 25, 2025, 23:59 PST)
  • Paper Submission Deadline: March 26, 2025 (23:59 GMT) (extended from March 25, 2025, 23:59 PST)
  • Notification to Authors: March 31, 2025
  • Camera-ready Submission: April 7, 2025 (23:59 GMT)

Paper Awards

Challenge

We host a 10-week challenge based on the TUMTraf V2X Cooperative Perception Dataset (CVPR'24), which provides high-quality, real-world V2X perception data for the cooperative 3D object detection and tracking task in autonomous driving. The dataset is available here, and we provide a dataset development kit for working with it. The dataset combines recordings from both onboard vehicle sensors and roadside infrastructure sensors.
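For orientation, labels and detections in the challenge use the OpenLABEL (.json) format, whose annotations live under an "openlabel" root key with per-frame entries. The following is a minimal sketch (not part of the official dev-kit) for counting annotated objects per frame, assuming the standard openlabel → frames → objects layout:

```python
import json

def count_objects(label_path):
    """Count annotated objects per frame in an OpenLABEL (.json) file.

    Assumes the common layout: {"openlabel": {"frames": {"<idx>": {"objects": {...}}}}}.
    """
    with open(label_path) as f:
        data = json.load(f)
    frames = data.get("openlabel", {}).get("frames", {})
    # Map each frame index to the number of annotated objects it contains.
    return {idx: len(frame.get("objects", {})) for idx, frame in frames.items()}
```

The official dev-kit offers richer tooling; this snippet is only meant for a quick sanity check of downloaded label files.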

Competition Timeline:
  • Dataset Final Release: December 31, 2024
  • Competition Announcement: January 8, 2025
  • Competition Starts (Leaderboard opens on EvalAI): March 17, 2025
  • Competition Ends (Leaderboard closes on EvalAI): May 26, 2025 (23:59 GMT)
  • Abstract Submission Deadline (via email): May 26, 2025 (23:59 GMT)
  • Challenge Report (4-8 pages) Submission Deadline (via email): June 1, 2025 (23:59 GMT)
  • Notification to Participants (1st, 2nd, 3rd Prize Winner Announcement): June 3, 2025
  • Video Presentation (10 min) Submission (via email): June 10, 2025 (23:59 GMT)

Evaluation Metrics:
Challenge Submission:
CoopDet3D generates detection (inference) files in the OpenLABEL (.json) format (see example). If you develop your own model, please output the detection results in the OpenLABEL format following this schema. After generating the detection files for the test set (100 frames), combine them into a single OpenLABEL (.json) file using this script. Before submitting to the leaderboard, test the evaluation locally on your machine with our evaluation script in the dev-kit. Finally, submit the accumulated OpenLABEL prediction file (submission.json) to the EvalAI leaderboard platform. The best-performing teams will be invited to present their solutions during the workshop, and the winners will receive prizes and recognition for their contributions to the field.
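The merging step above can be sketched as follows. This is an illustrative stand-in, not the official merge script: it assumes each per-frame file stores its detections under openlabel → frames, keyed by the frame index, and that those keys are unique across files.

```python
import json
from pathlib import Path

def merge_openlabel(frame_dir, out_path):
    """Merge per-frame OpenLABEL (.json) files into one submission file.

    Assumes each input file holds one frame under "openlabel" -> "frames",
    keyed by its frame index; keys are copied as-is into the merged output.
    """
    merged = {"openlabel": {"metadata": {"schema_version": "1.0.0"}, "frames": {}}}
    for path in sorted(Path(frame_dir).glob("*.json")):
        with open(path) as f:
            frames = json.load(f)["openlabel"]["frames"]
        # Collect this file's frame entries into the combined submission.
        merged["openlabel"]["frames"].update(frames)
    with open(out_path, "w") as f:
        json.dump(merged, f)
    return out_path
```

For the actual submission, use the script provided with the dev-kit, since it matches the schema the EvalAI evaluation expects; the sketch only illustrates the accumulation idea.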

Challenge Awards

Organizers

Invited Program Committee

Sponsors

We sincerely thank Qualcomm Inc., Virtual Vehicle Research GmbH, Abaka AI and Autodriving-Heart for their generous sponsorship of our workshop. Their support enables us to recognize outstanding research through prestigious awards.
We are currently seeking further sponsorship opportunities and would be delighted to discuss potential collaborations. Interested parties are kindly requested to contact us via email at walter.zimmer@cs.tum.edu for further details.