False discovery rate control in genome-wide association studies with population structure


Bibliographic Details
Published in: Proceedings of the National Academy of Sciences - PNAS, Vol. 118, no. 40, pp. 1-12
Main Authors: Sesia, Matteo; Bates, Stephen; Candès, Emmanuel; Marchini, Jonathan; Sabatti, Chiara
Format: Journal Article
Language: English
Published: United States: National Academy of Sciences, 05-10-2021
Description
Summary:We present a comprehensive statistical framework to analyze data from genome-wide association studies of polygenic traits, producing interpretable findings while controlling the false discovery rate. In contrast with standard approaches, our method can leverage sophisticated multivariate algorithms but makes no parametric assumptions about the unknown relation between genotypes and phenotype. Instead, we recognize that genotypes can be considered as a random sample from an appropriate model, encapsulating our knowledge of genetic inheritance and human populations. This allows the generation of imperfect copies (knockoffs) of these variables that serve as ideal negative controls, correcting for linkage disequilibrium and accounting for unknown population structure, which may be due to diverse ancestries or familial relatedness. The validity and effectiveness of our method are demonstrated by extensive simulations and by applications to the UK Biobank data. These analyses confirm our method is powerful relative to state-of-the-art alternatives, while comparisons with other studies validate most of our discoveries. Finally, fast software is made available for researchers to analyze Biobank-scale datasets.
Bibliography:Contributed by Emmanuel Candès, July 20, 2021 (sent for review March 27, 2021; reviewed by Dan Nicolae and Saharon Rosset)
Reviewers: D.N., University of Chicago; and S.R., Tel Aviv University.
Author contributions: M.S., S.B., E.C., J.M., and C.S. designed research; M.S., S.B., E.C., J.M., and C.S. performed research; M.S. and S.B. contributed new reagents/analytic tools; M.S. analyzed data; and M.S., S.B., E.C., J.M., and C.S. wrote the paper.
ISSN:0027-8424
1091-6490
DOI:10.1073/pnas.2105841118
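
The summary above describes knockoffs as negative controls used to control the false discovery rate, but this record does not spell out the selection rule. As a rough illustration only, the sketch below assumes the standard knockoff+ threshold of the general knockoff filter literature: each variable j gets a statistic W_j that is large and positive when the real variable looks more important than its knockoff, and the threshold is the smallest t with (1 + #{W_j <= -t}) / max(1, #{W_j >= t}) <= q. The function names and the example statistics are hypothetical, and nothing here is taken from the authors' software.

```python
def knockoff_plus_threshold(w, q):
    """Smallest t > 0 whose estimated false discovery proportion is <= q.

    w: list of knockoff statistics W_j (sign convention assumed: positive
       values favor the real variable over its knockoff).
    q: target false discovery rate.
    Returns float('inf') when no threshold satisfies the bound.
    """
    # Candidate thresholds are the distinct nonzero magnitudes of W.
    candidates = sorted({abs(x) for x in w if x != 0})
    for t in candidates:
        negatives = sum(1 for x in w if x <= -t)  # knockoff "wins"
        positives = sum(1 for x in w if x >= t)   # real variable "wins"
        if (1 + negatives) / max(1, positives) <= q:
            return t
    return float("inf")


def knockoff_select(w, q):
    """Indices of variables whose statistic reaches the knockoff+ threshold."""
    t = knockoff_plus_threshold(w, q)
    return [j for j, x in enumerate(w) if x >= t]


# Hypothetical statistics for ten variables (not data from the paper).
example_w = [5.0, 4.0, 3.5, 3.0, 2.5, 2.0, 1.5, -1.0, 0.5, -0.2]
selected = knockoff_select(example_w, q=0.2)
```

With these made-up statistics and q = 0.2, the threshold is 1.5 and the first seven variables are selected; with q = 0.1 no threshold qualifies and nothing is selected, which reflects how the "+1" in the numerator makes the rule conservative when few variables are available.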