Assessing transmission attribution risk from simulated sequencing data in HIV molecular epidemiology
Published in: | AIDS (London) Vol. 38; no. 6; pp. 865 - 873 |
---|---|
Format: | Journal Article |
Language: | English |
Published: | England: Lippincott Williams & Wilkins, 01-05-2024 |
Summary: | HIV molecular epidemiology (ME) is the analysis of sequence data together with individual-level clinical, demographic, and behavioral data to understand HIV epidemiology. The use of ME has raised concerns regarding identification of the putative source in direct transmission events. This could result in harm ranging from stigma to criminal prosecution in some jurisdictions. Here we assessed the risks of ME using simulated HIV genetic sequencing data.
We simulated social networks of men who have sex with men, calibrating the simulations to data from San Diego. We used these networks to simulate consensus and next-generation sequence (NGS) data to evaluate the risks of identifying direct transmissions using different HIV sequence lengths and population sampling depths. To identify the source of transmissions, we calculated infector probability and used phyloscanner software for the analysis of consensus and NGS data, respectively.
Consensus sequence analyses showed that the risk of correctly inferring the source (direct transmission) within identified transmission pairs was very small and independent of sampling depth. In contrast, NGS analyses showed that identification of the source of a transmission was very accurate, but only for 6.5% of inferred pairs. False-positive transmissions were also observed: inferred direct pairs that, in the true network, were separated by one or more unobserved intermediaries.
Source attribution using consensus sequences rarely infers direct transmission pairs with high confidence but is still useful for population studies. In contrast, source attribution using NGS data was much more accurate in identifying direct transmission pairs, but for only a small percentage of transmission pairs analyzed. |
---|---|
ISSN: | 0269-9370 1473-5571 |
DOI: | 10.1097/QAD.0000000000003820 |