Exploring Adversarial Robustness of Multi-Sensor Perception Systems in Self Driving
Main Authors: | , , , , , , , , |
---|---|
Format: | Journal Article |
Language: | English |
Published: | 17-01-2021 |
Subjects: | |
Summary: | Modern self-driving perception systems have been shown to improve when
processing complementary inputs such as LiDAR and images. In isolation, 2D
images have been found to be extremely vulnerable to adversarial attacks. Yet,
there have been limited studies on the adversarial robustness of multi-modal
models that fuse LiDAR features with image features. Furthermore, existing
works do not consider physically realizable perturbations that are consistent
across the input modalities. In this paper, we showcase practical
susceptibilities of multi-sensor detection by placing an adversarial object on
top of a host vehicle. We focus on physically realizable and input-agnostic
attacks as they are feasible to execute in practice, and show that a single
universal adversary can hide different host vehicles from state-of-the-art
multi-modal detectors. Our experiments demonstrate that successful attacks are
primarily caused by easily corrupted image features. Furthermore, we find that
in modern sensor fusion methods which project image features into 3D,
adversarial attacks can exploit the projection process to generate false
positives across distant regions in 3D. Towards more robust multi-modal
perception systems, we show that adversarial training with feature denoising
can boost robustness to such attacks significantly. However, we find that
standard adversarial defenses still struggle to prevent false positives, which
are also caused by inaccurate associations between 3D LiDAR points and 2D
pixels. |
DOI: | 10.48550/arxiv.2101.06784 |