MultiGWAS: An integrative tool for Genome Wide Association Studies in tetraploid organisms

Bibliographic Details
Published in: Ecology and Evolution, Vol. 11, no. 12, pp. 7411-7426
Main Authors: Garreta, Luis, Cerón‐Souza, Ivania, Palacio, Manfred Ricardo, Reyes‐Herrera, Paula H.
Format: Journal Article
Language: English
Published: England John Wiley & Sons, Inc 01-06-2021
Description
Summary: Genome‐wide association studies (GWASs) are essential to determine the genetic bases of either ecological or economic phenotypic variation across individuals within populations of model and nonmodel organisms. For this research question, it is common to replicate a GWAS with different parameters and models to validate the reproducibility of the results. However, straightforward methodologies that manage both replication and tetraploid data are still missing. To solve this problem, we designed MultiGWAS, a tool that performs GWAS for diploid and tetraploid organisms by executing four software packages in parallel, two designed for polyploid data (GWASpoly and SHEsis) and two designed for diploid data (GAPIT and TASSEL). MultiGWAS has several advantages. It runs either from the command line or in a graphical interface, and it manages different genotype formats, including VCF. Moreover, it allows control for population structure and relatedness and applies several quality control checks to genotype data. In addition, MultiGWAS can test additive and dominant gene action models and, through a proprietary scoring function, select the best model to report its associations. Finally, it generates several reports that facilitate identifying false associations from both the significant and the best‐ranked associated single nucleotide polymorphisms (SNPs) among the four software packages. We tested MultiGWAS with public tetraploid potato data for tuber shape and with several simulated datasets under both additive and dominant models. These tests demonstrated that MultiGWAS is better at detecting reliable associations than using each of the four software packages individually. Moreover, the parallel analysis with polyploid and diploid software, which only MultiGWAS offers, demonstrates its utility in understanding the best genetic model behind the SNP association in tetraploid organisms. Therefore, MultiGWAS proved to be an excellent alternative for wrapping GWAS replication in diploid and tetraploid organisms in a single analysis environment.

Genome‐wide association studies (GWASs) are essential to determine the genetic bases of either ecological or economic phenotypic variation across individuals within populations of model and nonmodel organisms. Replication is a good practice to assess results, but straightforward methodologies that manage both replication and tetraploid data are still missing. To solve this problem, we designed MultiGWAS, a tool that performs GWAS for diploid and tetraploid organisms by executing four software packages in parallel, two for polyploid data (GWASpoly and SHEsis) and two for diploid data (GAPIT and TASSEL). MultiGWAS includes (1) the input and preprocessing of genomic data in different formats (including VCF files), (2) association analysis by running the GWAS software in parallel, (3) postprocessing and summarizing of their results, and (4) reporting using graphical and tabular views. MultiGWAS identifies both the highest‐scoring associations and those shared among the four software packages, which helps users decide more intuitively on possible true or false associations. MultiGWAS can test additive and dominant gene action models and, through a proprietary scoring function, select the best model to report its associations. We tested MultiGWAS with public tetraploid potato data for tuber shape and with several simulated datasets under both additive and dominant models. These tests demonstrated that MultiGWAS is better at detecting reliable associations than using each of the four software packages individually.
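The idea of associations shared among the four software packages can be illustrated with a minimal Python sketch. This is not MultiGWAS code and it does not reproduce the tool's scoring function; the function name `shared_associations`, the `min_tools` threshold, and the SNP identifiers are all hypothetical and chosen only for illustration. It assumes each package has already produced a list of SNPs it considers significant.

```python
def shared_associations(significant_by_tool, min_tools=2):
    """Return SNPs reported as significant by at least `min_tools` tools.

    `significant_by_tool` maps a tool name to the SNP ids it flagged.
    The result maps each shared SNP id to the sorted list of tools
    that reported it. Hypothetical helper, not part of MultiGWAS.
    """
    tools_for_snp = {}
    for tool, snps in significant_by_tool.items():
        # set() guards against a SNP listed twice by the same tool
        for snp in set(snps):
            tools_for_snp.setdefault(snp, []).append(tool)
    return {
        snp: sorted(tools)
        for snp, tools in tools_for_snp.items()
        if len(tools) >= min_tools
    }


# Made-up SNP ids; real input would come from each package's output.
example = {
    "GWASpoly": ["snp_A", "snp_B", "snp_C"],
    "SHEsis": ["snp_A", "snp_C"],
    "GAPIT": ["snp_A", "snp_D"],
    "TASSEL": ["snp_C"],
}
shared = shared_associations(example)
for snp in sorted(shared):
    print(snp, shared[snp])
```

In this sketch a SNP flagged by only one package is dropped, which mirrors the intuition in the summary that agreement across packages makes an association more credible.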
ISSN: 2045-7758
DOI: 10.1002/ece3.7572