Anacapa Toolkit: An environmental DNA toolkit for processing multilocus metabarcode datasets

Bibliographic Details
Published in:	Methods in ecology and evolution Vol. 10; no. 9; pp. 1469 - 1475
Main Authors:	Curd, Emily E.; Gold, Zack; Kandlikar, Gaurav S.; Gomer, Jesse; Ogden, Max; O'Connell, Taylor; Pipes, Lenore; Schweizer, Teia M.; Rabichow, Laura; Lin, Meixi; Shi, Baochen; Barber, Paul H.; Kraft, Nathan; Wayne, Robert; Meyer, Rachel S.; Yu, Douglas
Format:	Journal Article
Language:	English
Published:	London: John Wiley & Sons, Inc., 01-09-2019
Subjects:	Bayesian analysis; Bayesian methods; Biodiversity; Bioinformatics; Chemical analysis; Communities; community ecology; Confidence; Deoxyribonucleic acid; DNA; Environmental DNA; Loci; metabarcoding; Modules; molecular methods; multilocus metabarcoding processing; Quality control; Seawater; sequence data; Species diversity; Statistical analysis; Statistical methods; Taxonomy; Water analysis

Description
Summary:	Environmental DNA (eDNA) metabarcoding is a promising method to monitor species and community diversity that is rapid, affordable and non‐invasive. The longstanding needs of the eDNA community are modular informatics tools, comprehensive and customizable reference databases, flexibility across high‐throughput sequencing platforms, fast multilocus metabarcode processing and accurate taxonomic assignment. Improvements in bioinformatics tools make addressing each of these demands within a single toolkit a reality. The new modular metabarcode sequence toolkit Anacapa (https://github.com/limey-bean/Anacapa/) addresses the above needs, allowing users to build comprehensive reference databases and assign taxonomy to raw multilocus metabarcode sequence data. A novel aspect of Anacapa is its database building module, “Creating Reference libraries Using eXisting tools” (CRUX), which generates comprehensive reference databases for specific user‐defined metabarcoding loci. The Quality Control and ASV Parsing module sorts and processes multiple metabarcoding loci and processes merged, unmerged and unpaired reads, maximizing recovered diversity. DADA2 then detects amplicon sequence variants (ASVs) and the Anacapa Classifier module aligns these ASVs to CRUX‐generated reference databases using Bowtie2. Lastly, taxonomy is assigned to ASVs with confidence scores using a Bayesian Lowest Common Ancestor (BLCA) method. The Anacapa Toolkit also includes an R package, ranacapa, for automated results exploration through standard biodiversity statistical analysis. Benchmarking tests verify that the Anacapa Toolkit effectively and efficiently generates comprehensive reference databases that capture taxonomic diversity, and can assign taxonomy to both MiSeq and HiSeq‐length sequence data. We demonstrate the value of the Anacapa Toolkit in assigning taxonomy to seawater eDNA samples collected in southern California.
The Anacapa Toolkit improves the functionality of eDNA and streamlines biodiversity assessment and management by generating metabarcode specific databases, processing multilocus data, retaining a larger proportion of sequencing reads and expanding non‐traditional eDNA targets. All the components of the Anacapa Toolkit are open and available in a virtual container to ease installation.
ISSN:	2041-210X
DOI:	10.1111/2041-210X.13214