Statistical solutions for error and bias in global citizen science datasets

Bibliographic Details
Published in:	Biological conservation Vol. 173; pp. 144 - 154
Main Authors:	Bird, Tomas J.; Bates, Amanda E.; Lefcheck, Jonathan S.; Hill, Nicole A.; Thomson, Russell J.; Edgar, Graham J.; Stuart-Smith, Rick D.; Wotherspoon, Simon; Krkosek, Martin; Stuart-Smith, Jemina F.; Pecl, Gretta T.; Barrett, Neville; Frusher, Stewart
Format:	Journal Article
Language:	English
Published:	Kidlington: Elsevier Ltd, 01-05-2014
Subjects:	Additive models; Animal and plant ecology; Animal, plant and microbial ecology; Applied ecology; Bias; Biodiversity; Biological and medical sciences; Conservation; Conservation, protection and management of environment and wildlife; Error analysis; Errors; Experimental design; Fundamental and applied biological sciences. Psychology; General aspects; Linear models; Mathematical models; Parks, reserves, wildlife conservation. Endangered species: population survey and restocking; Perception; Reef life survey; Sampling; Scientists; Species distribution models; Statistical analysis; Synecology; Volunteer data; Volunteer; Error; Additive model; Linear model; Statistical method; Reef; Spatial distribution; Geographic distribution; Sciences; Species; Distribution range; Environmental protection

Description
Summary:
• Citizen-scientist (CS) datasets offer unique opportunities and challenges to the study of global conservation priorities.
• Fortunately, issues of error and bias found in CS data are similar to those found in other large-scale databases.
• As a consequence, statistical tools exist to handle many kinds of error and bias common to CS data.
• We highlight some statistical approaches that are used in ecological contexts and are available in free software packages.

Networks of citizen scientists (CS) have the potential to observe biodiversity and species distributions at global scales. Yet the adoption of such datasets in conservation science may be hindered by a perception that the data are of low quality. This perception likely stems from the propensity of data generated by CS to contain greater levels of variability (e.g., measurement error) or bias (e.g., spatio-temporal clustering) in comparison to data collected by scientists or instruments. Modern analytical approaches can account for many types of error and bias typical of CS datasets. It is possible to (1) describe how pseudo-replication in sampling influences the overall variability in response data using mixed-effects modeling, (2) integrate data to explicitly model the sampling process and account for bias using a hierarchical modeling framework, and (3) examine the relative influence of many different or related explanatory factors using machine learning tools. Information from these modeling approaches can be used to predict species distributions and to estimate biodiversity. Even so, achieving the full potential from CS projects requires meta-data describing the sampling process, reference data to allow for standardization, and insightful modeling suitable to the question of interest.
ISSN:	0006-3207; 1873-2917
DOI:	10.1016/j.biocon.2013.07.037
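
Illustrative note: the summary's first approach, using mixed-effects modeling to describe how pseudo-replication contributes to variability, can be sketched in a much simplified form. The Python below is not the authors' method and does not fit a full mixed-effects model. It assumes a balanced design (every observer contributes the same number of counts) and uses the classical method-of-moments (one-way ANOVA) estimator for a random-intercept model. The function name, the simulated observers, and all parameter values are hypothetical.

```python
import random
from statistics import mean


def variance_components(groups):
    """Method-of-moments estimates for a balanced random-intercept model.

    Model (an assumption for this sketch): y_ij = mu + a_i + e_ij, where
    a_i is the effect of observer i and e_ij is residual noise.
    `groups` is a list of equal-length lists, one list per observer.
    """
    k = len(groups)
    n = len(groups[0]) if k else 0
    if k < 2 or n < 2 or any(len(g) != n for g in groups):
        raise ValueError("need at least 2 groups of equal size >= 2")

    group_means = [mean(g) for g in groups]
    grand_mean = mean(group_means)  # equals the overall mean when balanced

    # Mean squares from a one-way ANOVA table.
    ms_between = n * sum((m - grand_mean) ** 2 for m in group_means) / (k - 1)
    ms_within = sum(
        (y - m) ** 2 for g, m in zip(groups, group_means) for y in g
    ) / (k * (n - 1))

    var_within = ms_within
    # Truncate at zero: the moment estimator can go negative by chance.
    var_between = max(0.0, (ms_between - ms_within) / n)
    total = var_between + var_within
    icc = var_between / total if total > 0 else 0.0
    return {"between": var_between, "within": var_within, "icc": icc}


# Hypothetical simulation: 30 observers, 20 repeat counts each.
rng = random.Random(42)
observers = []
for _ in range(30):
    observer_effect = rng.gauss(0.0, 2.0)  # between-observer sd = 2
    observers.append(
        [10.0 + observer_effect + rng.gauss(0.0, 1.0) for _ in range(20)]
    )

result = variance_components(observers)
print(result)
```

The intraclass correlation (`icc`) is the share of total variance attributable to differences among observers. A high value indicates that repeat counts from one observer are far from independent, which is the pseudo-replication problem that a mixed-effects model is meant to absorb. Unbalanced data or additional covariates would call for a dedicated mixed-model package rather than this estimator.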