MEDUSA: A Pipeline for Sensitive Taxonomic Classification and Flexible Functional Annotation of Metagenomic Shotgun Sequences

Bibliographic Details
Published in: Frontiers in Genetics Vol. 13; p. 814437
Main Authors: Morais, Diego A A; Cavalcante, João V F; Monteiro, Shênia S; Pasquali, Matheus A B; Dalmolin, Rodrigo J S
Format: Journal Article
Language: English
Published: Switzerland, Frontiers Media S.A., 07-03-2022
Description
Summary: Metagenomic studies unravel details about the taxonomic composition of microbial communities and the functions they perform. Because a complete metagenomic analysis requires different tools for different purposes, selecting and setting up these tools remains challenging. Furthermore, the chosen toolset affects the accuracy, the formatting, and the functional identifiers reported in the results, and therefore the interpretation of the results and the biological answer obtained. Thus, we surveyed state-of-the-art tools available in the literature, created simulated datasets, and performed benchmarks to design a sensitive and flexible metagenomic analysis pipeline. Here we present MEDUSA, an efficient pipeline for conducting comprehensive metagenomic analyses. It performs preprocessing, assembly, alignment, taxonomic classification, and functional annotation on shotgun data, and supports user-built dictionaries to transfer annotations to any functional identifier. MEDUSA includes several tools, such as fastp, Bowtie2, DIAMOND, Kaiju, and MEGAHIT, as well as a novel tool implemented in Python to transfer annotations to BLAST/DIAMOND alignment results. These tools are installed via Conda, and the workflow is managed by Snakemake, easing setup and execution. Compared with MEGAN 6 Community Edition, MEDUSA correctly identifies more species, especially the less abundant ones, and is better suited for functional analysis using Gene Ontology identifiers.
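The dictionary-based annotation transfer mentioned in the summary can be illustrated with a minimal Python sketch. This is not MEDUSA's own tool, whose interface is not described in this record. It assumes the alignments use the default 12-column BLAST/DIAMOND tabular layout and that the dictionary is a two-column, tab-separated file mapping subject IDs to functional identifiers. The function names `load_dictionary` and `transfer_annotations`, the best-bitscore rule, and the sample reads and accessions are all hypothetical choices made for illustration.

```python
import csv
import io
from collections import defaultdict

# Column names of the default BLAST/DIAMOND tabular output (12 columns).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def load_dictionary(handle):
    """Read a two-column TSV (subject ID, functional ID) into a dict of sets."""
    mapping = defaultdict(set)
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) >= 2:
            mapping[row[0]].add(row[1])
    return mapping


def transfer_annotations(alignment_handle, mapping):
    """Give each query the identifiers of its highest-bitscore annotated hit."""
    best = {}  # query ID -> (bitscore, subject ID)
    for row in csv.reader(alignment_handle, delimiter="\t"):
        if len(row) < len(COLUMNS):
            continue  # skip blank or malformed lines
        hit = dict(zip(COLUMNS, row))
        subject = hit["sseqid"]
        if subject not in mapping:
            continue  # hit has no entry in the user-built dictionary
        score = float(hit["bitscore"])
        query = hit["qseqid"]
        if query not in best or score > best[query][0]:
            best[query] = (score, subject)
    return {query: sorted(mapping[subject])
            for query, (_, subject) in best.items()}


# Made-up example data: two reads, one annotated subject, one unannotated.
alignments = io.StringIO(
    "read1\tP12345\t90.0\t50\t5\t0\t1\t150\t10\t59\t1e-20\t95.1\n"
    "read1\tQ99999\t98.0\t50\t1\t0\t1\t150\t10\t59\t1e-25\t110.0\n"
    "read2\tP12345\t85.0\t48\t7\t0\t4\t147\t12\t59\t1e-15\t80.3\n"
)
dictionary = io.StringIO("P12345\tGO:0008150\nP12345\tGO:0003674\n")

result = transfer_annotations(alignments, load_dictionary(dictionary))
print(result)
```

In this sketch, hits whose subject is missing from the dictionary are skipped, so `read1` receives the identifiers of `P12345` even though the unannotated `Q99999` scores higher. Swapping the dictionary file is enough to report a different identifier system, which is the flexibility the summary attributes to user-built dictionaries.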
Reviewed by: Hassan Ghazal, National Center for Scientific and Technical Research (CNRST), Morocco
This article was submitted to Computational Genomics, a section of the journal Frontiers in Genetics
Edited by: Guanglin Li, Shaanxi Normal University, China
Fotis Psomopoulos, Centre for Research and Technology Hellas (CERTH), Greece
ISSN: 1664-8021
DOI: 10.3389/fgene.2022.814437