Improving Object Detector Training on Synthetic Data by Starting With a Strong Baseline Methodology
Main Authors: | , , , , , , , , |
---|---|
Format: | Journal Article |
Language: | English |
Published: | 30-05-2024 |
Summary: | Collecting and annotating real-world data for the development of object
detection models is a time-consuming and expensive process. In the military
domain in particular, data collection can also be dangerous or infeasible.
Training models on synthetic data may provide a solution for cases where access
to real-world training data is restricted. However, bridging the reality gap
between synthetic and real data remains a challenge. Existing methods usually
build on top of baseline Convolutional Neural Network (CNN) models that have
been shown to perform well when trained on real data, but have limited ability
to perform well when trained on synthetic data. For example, some architectures
allow for fine-tuning with the expectation of large quantities of training data
and are prone to overfitting on synthetic data. Related work usually ignores
various best practices from object detection on real data, e.g. by training on
synthetic data from a single environment with relatively little variation. In
this paper we propose a methodology for improving the performance of a
pre-trained object detector when training on synthetic data. Our approach
focuses on extracting the salient information from synthetic data without
forgetting useful features learned from pre-training on real images. Based on
the state of the art, we incorporate data augmentation methods and a
Transformer backbone. Besides reaching relatively strong performance without
any specialized synthetic data transfer methods, we show that our methods
improve the state of the art on synthetic-data-trained object detection for the
RarePlanes and DGTA-VisDrone datasets, and reach near-perfect performance on an
in-house vehicle detection dataset. |
DOI: | 10.48550/arxiv.2405.19822 |