occAssess: An R package for assessing potential biases in species occurrence data

Bibliographic Details
Published in: Ecology and Evolution, Vol. 11, no. 22, pp. 16177-16187
Main Authors: Boyd, Robin J., Powney, Gary D., Carvell, Claire, Pescott, Oliver L.
Format: Journal Article
Language: English
Published: England John Wiley & Sons, Inc 01-11-2021
Description
Summary: Species occurrence records from a variety of sources are increasingly aggregated into heterogeneous databases and made available to ecologists for immediate analytical use. However, these data are typically biased, i.e. they are not a probability sample of the target population of interest, meaning that the information they provide may not be an accurate reflection of reality. It is therefore crucial that species occurrence data are properly scrutinised before they are used for research. In this article, we introduce occAssess, an R package that enables straightforward screening of species occurrence data for potential biases. The package contains a number of discrete functions, each of which returns a measure of the potential for bias in one or more of the taxonomic, temporal, spatial, and environmental dimensions. Users can opt to provide a set of time periods into which the data will be split; in this case separate outputs will be provided for each period, making the package particularly useful for assessing the suitability of a dataset for estimating temporal trends in species' distributions. The outputs are provided visually (as ggplot2 objects) and do not include a formal recommendation as to whether data are of sufficient quality for any given inferential use. Instead, they should be used as ancillary information and viewed in the context of the question that is being asked, and the methods that are being used to answer it. We demonstrate the utility of occAssess by applying it to data on two key pollinator taxa in South America: leaf‐nosed bats (Phyllostomidae) and hoverflies (Syrphidae). In this worked example, we briefly assess the degree to which various aspects of data coverage appear to have changed over time. We then discuss additional applications of the package, highlight its limitations, and point to future development opportunities.
With the advent of online data aggregators and the digitization of historic records, ecologists now have access to huge quantities of species occurrence records. However, these data are typically biased – that is, they are not representative of the target populations of interest – which can lead to spurious inferences about species' distributions and how they have changed over time. In this paper, we present occAssess, an R package that enables straightforward screening of species occurrence data for biases, thereby helping researchers to avoid reaching biased conclusions.
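The period-splitting idea described in the summary can be illustrated with a minimal, language-agnostic sketch. The Python below is not occAssess's R API: the function name `summarise_by_period`, the record layout, and all values are hypothetical and chosen only for illustration. It assigns each record to a user-supplied, non-overlapping period and reports the number of records and distinct species per period, which is the kind of per-period coverage summary the summary describes.

```python
from collections import defaultdict

# Hypothetical occurrence records as (species, year); illustrative values only.
records = [
    ("sp_a", 1955), ("sp_a", 1972), ("sp_b", 1978),
    ("sp_b", 1991), ("sp_c", 1995), ("sp_a", 2004),
    ("sp_c", 2010), ("sp_c", 2012), ("sp_b", 2015),
]

# User-supplied periods as inclusive (start, end) year ranges.
periods = [(1950, 1979), (1980, 1999), (2000, 2019)]


def summarise_by_period(records, periods):
    """Return, for each period, the record count and distinct-species count."""
    record_counts = defaultdict(int)
    species_seen = defaultdict(set)
    for species, year in records:
        for start, end in periods:
            if start <= year <= end:
                record_counts[(start, end)] += 1
                species_seen[(start, end)].add(species)
                break  # periods are assumed not to overlap
    return {
        p: {"n_records": record_counts[p], "n_species": len(species_seen[p])}
        for p in periods
    }


summary = summarise_by_period(records, periods)
for period, stats in summary.items():
    print(period, stats)
```

A large change in records or species between periods would not by itself show bias, but it flags that apparent temporal trends may partly reflect changing recording effort.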
ISSN: 2045-7758
DOI: 10.1002/ece3.8299