A flexible copula‐based approach for the analysis of secondary phenotypes in ascertained samples
Published in: | Statistics in medicine Vol. 39; no. 5; pp. 517 - 543 |
---|---|
Main Authors: | , , , |
Format: | Journal Article |
Language: | English |
Published: | England: Wiley Subscription Services, Inc, 28-02-2020 |
Subjects: | |
Summary: | Data collected for a genome‐wide association study of a primary phenotype are often used for additional genome‐wide association analyses of secondary phenotypes. However, when the primary and secondary traits are dependent, naïve analyses of secondary phenotypes may induce spurious associations in non‐randomly ascertained samples.
Previously, retrospective likelihood‐based methods have been proposed to correct for sampling biases arising in secondary trait association analyses. However, most methods have been introduced to handle studies featuring a case‐control design based on a binary primary phenotype. As such, these methods are not directly applicable to more complicated study designs such as multiple‐trait studies, where the sampling mechanism also depends on the secondary phenotype, or extreme‐trait studies, where individuals with extreme primary phenotype values are selected. To accommodate these more complicated sampling mechanisms, only a few prospective likelihood approaches have been proposed. These approaches assume a normal distribution for the secondary phenotype (or the latent secondary phenotype) and a bivariate normal distribution for the primary‐secondary phenotype dependence.
In this paper, we propose a unified copula‐based approach to appropriately detect genetic variant/secondary phenotype association in the presence of selected samples. The primary phenotype is either binary or continuous, and the secondary phenotype is continuous although not necessarily normal. We use both prospective and retrospective likelihoods to account for the sampling mechanism and use a copula model to allow for potentially different dependence structures between the primary and secondary phenotypes. We demonstrate the effectiveness of our approach through simulation studies and by analyzing data from the Avon Longitudinal Study of Parents and Children cohort. |
---|---|
ISSN: | 0277-6715, 1097-0258 |
DOI: | 10.1002/sim.8416 |