T59. MARKER MATCH: A PROXIMITY BASED PROBE-MATCHING ALGORITHM FOR JOINT ANALYSIS OF CNVS FROM DIFFERENT GENOTYPING ARRAYS AND SUBSEQUENT CNV ASSOCIATION STUDY OF TOURETTE SYNDROME

Bibliographic Details
Published in: European neuropsychopharmacology Vol. 75; p. S193
Main Authors: Ivankovic, Franjo; Yu, Dongmei; Domenech, Laura; Zhan, Lingyu; Ophoff, Roel; Scharf, Jeremiah; Mathews, Carol
Format: Journal Article
Language: English
Published: Elsevier B.V 01-10-2023
Description
Summary: Copy-number variants (CNVs) are structural mutations in the genome resulting from deletions or duplications of large segments of DNA and can affect a wide range of functional units, from parts of a gene to numerous genes in their entirety. Like SNPs, certain CNVs have been associated with susceptibility to neuropsychiatric diseases, including schizophrenia, autism, and Tourette syndrome (TS). There are several means of detecting CNVs; the most common high-throughput genome-wide approach uses hidden Markov models (HMMs) to analyze intensity measurements from genotyping arrays and identify clusters of relatively stronger or weaker signals, corresponding to duplications or deletions, respectively. As with GWASs, the power of CNV analyses (CNVAs) depends on sample size, given the small effect sizes of individual CNVs. Unlike GWASs, CNVAs cannot exploit linkage disequilibrium to impute unmeasured or poorly measured markers, which limits the ability to pool samples genotyped on different arrays for joint analysis. Previous studies have analyzed only the intersection of markers from two arrays, which usually results in substantial information loss and a further drop in already suboptimal sensitivity, given the sparsity of genotyping arrays. We developed MarkerMatch, a positional probe-matching algorithm that reduces information loss and allows joint analysis of samples genotyped on distinct arrays, and used it to analyze CNVs associated with TS. We analyzed 1,421 TS trios from TAAICG and 1,069 ASD quads from SSC. MarkerMatch takes the smaller array as the reference and matches markers from the larger array to it in three rounds: (1) markers with the same identifiers, (2) markers with the same positions, and (3) the nearest marker within a predetermined maximum allowable distance (here 10,000 bp). CNVs are then called using PennCNV and QuantiSNP.
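The three matching rounds described above can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Marker` type, the `marker_match` function, the one-to-one use of target markers, and the greedy left-to-right handling of round 3 are all assumptions made for illustration. Only the order of the rounds and the 10,000 bp limit come from the summary.

```python
from bisect import bisect_left
from typing import NamedTuple


class Marker(NamedTuple):
    name: str   # probe identifier
    chrom: str
    pos: int    # base-pair position


def marker_match(reference, target, max_dist=10_000):
    """Return {reference name: (target name, round)} for matched markers."""
    matches = {}
    used = set()  # target names already taken (one-to-one use is an assumption)

    # Round 1: same identifier.
    target_names = {t.name for t in target}
    for r in reference:
        if r.name in target_names:
            matches[r.name] = (r.name, 1)
            used.add(r.name)

    # Round 2: same chromosome and position.
    by_position = {}
    for t in target:
        if t.name not in used:
            by_position.setdefault((t.chrom, t.pos), t)
    for r in reference:
        t = by_position.get((r.chrom, r.pos))
        if r.name not in matches and t is not None and t.name not in used:
            matches[r.name] = (t.name, 2)
            used.add(t.name)

    # Round 3: nearest unused target marker within max_dist base pairs.
    pools = {}
    for t in target:
        if t.name not in used:
            pools.setdefault(t.chrom, []).append((t.pos, t.name))
    for pool in pools.values():
        pool.sort()
    for r in reference:
        pool = pools.get(r.chrom, [])
        if r.name in matches or not pool:
            continue
        i = bisect_left(pool, (r.pos,))
        # Only the neighbours on either side of the insertion point can be nearest.
        j = min((k for k in (i - 1, i) if 0 <= k < len(pool)),
                key=lambda k: abs(pool[k][0] - r.pos))
        if abs(pool[j][0] - r.pos) <= max_dist:
            matches[r.name] = (pool.pop(j)[1], 3)
    return matches
```

In this sketch the smaller array is passed as `reference` and the larger array as `target`. A reference marker with no target marker within `max_dist` on the same chromosome is simply left out of the result.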
We compared the reduced SSC Omni2.5 set (the markers kept after MarkerMatch) to the full Omni2.5 array to assess the quality of CNV calls, and compared the reduced Omni2.5 set to the GSA set of TS samples. The GSA set (618,406 nuclear probes) was roughly one quarter the size of the Omni2.5 set (2,435,200 nuclear probes). Identifier-based intersection retained only 21% of GSA probes, and position-based intersection only 46%, whereas MarkerMatch retained over 99%. Comparison of the full Omni2.5 set to the reduced Omni2.5 set yielded a sensitivity of 0.279 (miss rate 0.721) and a precision of 0.939 (false discovery rate 0.061). Analysis of the reduced Omni2.5 set against the GSA set validated the previously discovered NRXN1 deletion and CNTN6 duplication associated with TS risk. MarkerMatch thus improves information retention by raising the proportion of retained probes from the smaller array to over 99%, a substantial improvement over previous array-harmonization methods based on strict intersection of probe identifiers or positions. The low sensitivity and high miss rate of the reduced Omni2.5 set can be attributed to the reduction in the number of analyzed probes. Preliminary association tests show associations consistent with previous CNVAs and replicate the NRXN1 and CNTN6 findings. Additional testing of MarkerMatch, exploration of its performance, and global burden testing in TS are currently underway.
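The four reported call-quality figures form two complementary pairs, which a short Python sketch makes explicit. It assumes the standard definitions, with calls from the full Omni2.5 array treated as the reference set; the summary does not state how calls were matched, and the `call_metrics` helper and its count arguments are hypothetical. No underlying counts are given in the summary, so only the reported rates are checked.

```python
def call_metrics(tp, fp, fn):
    """Standard definitions from counts of true positives, false positives, false negatives."""
    sensitivity = tp / (tp + fn)
    precision = tp / (tp + fp)
    return {
        "sensitivity": sensitivity,
        "miss_rate": 1 - sensitivity,          # false negative rate
        "precision": precision,
        "false_discovery_rate": 1 - precision,
    }


# Reported values from the summary: each pair sums to 1.
reported_miss_rate = round(1 - 0.279, 3)   # complement of sensitivity
reported_fdr = round(1 - 0.939, 3)         # complement of precision

# Array size ratio behind the "one quarter" comparison.
size_ratio = 2_435_200 / 618_406
```

Under these definitions the reported miss rate and false discovery rate follow directly from the sensitivity and precision, and the probe counts give a size ratio just under 4.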
ISSN: 0924-977X, 1873-7862
DOI: 10.1016/j.euroneuro.2023.08.344