Quantitative Evaluation of Autonomous Driving in CARLA
Published in: | 2021 IEEE Intelligent Vehicles Symposium Workshops (IV Workshops) pp. 257 - 263 |
---|---|
Main Authors: | , , , |
Format: | Conference Proceeding |
Language: | English |
Published: | IEEE 11-07-2021 |
Summary: | There have been many recent advancements in imitation and reinforcement learning for autonomous driving, but existing metrics generally lack the means to capture a wide range of driving behaviors and compare the severity of different failure cases. To address this shortcoming, we introduce Quantitative Evaluation for Driving (QED), which assesses different aspects of driving behavior including the ability to stay in the center of the lane, avoid weaving and erratic behavior, follow the speed limit, and avoid collisions. We compare scores generated by QED against scores assigned by human evaluators on 30 different drivers and 6 different towns in the CARLA driving simulator. In "easy" evaluation scenarios where better drivers are easily distinguished from worse drivers, QED attains 0.96 Pearson correlation and 0.97 Spearman correlation with human evaluators, similar to the baseline inter-human-evaluator 0.96 Pearson correlation and 0.95 Spearman correlation. In "hard" evaluation scenarios where ranking drivers is more ambiguous, QED attains 0.84 Pearson correlation and 0.74 Spearman correlation with human evaluators, slightly higher than the baseline inter-human-evaluator 0.78 Pearson correlation and 0.7 Spearman correlation. While QED may not capture every characteristic that defines good driving, we consider it an important foundation for reproducibility and standardization in the community. |
---|---|
DOI: | 10.1109/IVWorkshops54471.2021.9669240 |
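
The summary reports agreement between metric scores and human evaluators as Pearson and Spearman correlations. As a minimal sketch of how those two statistics can be computed for any pair of score lists, the following uses only the Python standard library. The function names and the example scores are hypothetical illustrations, not taken from the paper, and this is not the authors' evaluation code.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation: linear agreement between two score lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(xs), average_ranks(ys))


if __name__ == "__main__":
    # Hypothetical scores for five drivers (illustrative values only).
    metric_scores = [0.91, 0.72, 0.55, 0.80, 0.30]
    human_scores = [9.0, 7.5, 5.0, 7.0, 2.0]
    print("Pearson: ", round(pearson(metric_scores, human_scores), 3))
    print("Spearman:", round(spearman(metric_scores, human_scores), 3))
```

Pearson correlation is sensitive to how linearly the two sets of scores relate, while Spearman correlation depends only on whether the two sets rank the drivers in the same order, which is why a record like this one reports both.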