FAQ

I re-ran the evaluation with the provided checkpoints but got different results

CARLA evaluation can be volatile, and results may differ between runs. In our experience, typical variations are around 1-2 DS on Bench2Drive, 5-7 DS on Longest6 v2, and 1.0 DS on Town13. These are rough estimates, not strict guarantees; the actual variation depends on many factors, including randomness in the simulator.
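
If you need a tighter estimate, a common practice is to repeat the evaluation several times and report the mean and standard deviation of the Driving Score (DS). Below is a minimal sketch of that aggregation; the file names and the driving_score key are placeholders, not the actual output format of any of the benchmarks:

```python
import json
import statistics

# Hypothetical paths: one results JSON per repeated evaluation run.
result_files = ["results_run0.json", "results_run1.json", "results_run2.json"]

scores = []
for path in result_files:
    with open(path) as f:
        data = json.load(f)
    # Assumed key; each benchmark stores DS under its own field name.
    scores.append(data["driving_score"])

mean_ds = statistics.mean(scores)
std_ds = statistics.stdev(scores) if len(scores) > 1 else 0.0
print(f"DS: {mean_ds:.1f} +/- {std_ds:.1f} over {len(scores)} runs")
```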

Why do we have so many versions of leaderboard and scenario_runner?

Each benchmark has its own evaluation protocol and therefore needs its own fork of each of the two repositories. The expert data collector also needs its own fork.

How do I create more routes?

See carla_route_generator. Also, see Section 5 of LEAD’s supplemental.

Can I see a list of modifications applied to leaderboard and scenario_runner?

We maintain custom forks of the CARLA evaluation tools with our modifications:

Which TransFuser versions are there?

See this list.

How often does CARLA crash or fail to start?

In our experience, roughly 10% of CARLA launch attempts may fail, though this varies by system. Common issues include startup hangs, port conflicts, or GPU initialization problems. This is normal behavior with CARLA.

What to do (a retry sketch follows this list):

  • Use bash scripts/clean_carla.sh to clean up zombie processes

  • Restart CARLA with bash scripts/start_carla.sh

  • Check that ports 2000-2002 aren’t in use

  • For Docker: docker compose restart carla
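
Because launches fail occasionally, it can help to wrap start-up in a small retry loop that cleans up leftover processes, relaunches the simulator, and probes the RPC port before the evaluation begins. The sketch below assumes scripts/start_carla.sh backgrounds CARLA and returns; the 60-second wait and the three attempts are arbitrary choices, not values from this repository:

```python
import socket
import subprocess
import time

CARLA_PORT = 2000      # default RPC port; the evaluation also uses 2001-2002
MAX_ATTEMPTS = 3

def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if something accepts connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def start_carla_with_retries() -> bool:
    for attempt in range(1, MAX_ATTEMPTS + 1):
        # Kill zombie CARLA processes left over from a previous run.
        subprocess.run(["bash", "scripts/clean_carla.sh"], check=False)
        # Launch the simulator; assumed to background CARLA and return.
        subprocess.Popen(["bash", "scripts/start_carla.sh"])
        # Give the simulator time to initialize, then probe the RPC port.
        time.sleep(60)
        if port_is_open("localhost", CARLA_PORT):
            print(f"CARLA is up (attempt {attempt}/{MAX_ATTEMPTS})")
            return True
        print(f"CARLA did not come up, retrying ({attempt}/{MAX_ATTEMPTS})")
    return False

if __name__ == "__main__":
    if not start_carla_with_retries():
        raise SystemExit("CARLA failed to start after retries")
```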

How do I add custom scenarios to CARLA?

See this.

How does the expert access scenario-specific data?

See this.