Danish Fungi 2020 -- Not Just Another Image Recognition Dataset
Format: | Journal Article |
---|---|
Language: | English |
Published: | 20-08-2021 |
Summary: | We introduce a novel fine-grained dataset and benchmark, the Danish Fungi
2020 (DF20). The dataset, constructed from observations submitted to the Atlas
of Danish Fungi, is unique in its taxonomy-accurate class labels, small number
of errors, highly unbalanced long-tailed class distribution, rich observation
metadata, and well-defined class hierarchy. DF20 has zero overlap with
ImageNet, allowing unbiased comparison of models fine-tuned from publicly
available ImageNet checkpoints. The proposed evaluation protocol enables
testing the ability to improve classification using metadata (e.g. precise
geographic location, habitat, and substrate), facilitates classifier calibration
testing, and allows studying the impact of device settings on
classification performance. Experiments using Convolutional Neural Networks
(CNN) and the recent Vision Transformers (ViT) show that DF20 presents a
challenging task. Interestingly, ViT achieves results superior to CNN baselines
with 80.45% accuracy and 0.743 macro F1 score, reducing the CNN error by 9% and
12% respectively. A simple procedure for including metadata into the decision
process improves the classification accuracy by more than 2.95 percentage
points, reducing the error rate by 15%. The source code for all methods and
experiments is available at https://sites.google.com/view/danish-fungi-dataset. |
---|---|
DOI: | 10.48550/arxiv.2103.10107 |
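
The summary reports gains both in percentage points and as relative error reductions, and the two are easy to confuse. The sketch below shows how they relate. It assumes, as an illustration only, that the 2.95-point metadata gain is measured against the 80.45% ViT accuracy; the summary does not state which baseline was used, and the function name is hypothetical.

```python
def relative_error_reduction(baseline_acc: float, improved_acc: float) -> float:
    """Return the fraction of the baseline error that the improvement removes.

    Both accuracies are given in percent (0-100).
    """
    baseline_err = 100.0 - baseline_acc
    improved_err = 100.0 - improved_acc
    return (baseline_err - improved_err) / baseline_err


# Figures quoted in the summary.
vit_acc = 80.45       # ViT accuracy, in percent
metadata_gain = 2.95  # accuracy gain from metadata, in percentage points

# Assumption: the metadata gain is applied on top of the ViT accuracy.
with_metadata_acc = vit_acc + metadata_gain

reduction = relative_error_reduction(vit_acc, with_metadata_acc)
print(f"relative error reduction: {reduction:.1%}")
```

Under that assumption the result is close to 15%, consistent with the figure in the summary: a gain of about 3 percentage points removes roughly one seventh of an error rate of about 20%.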