Methods for accurate quantification of LTR-retrotransposon copy number using short-read sequence data: a case study in Sorghum

Bibliographic Details
Published in: Molecular genetics and genomics : MGG, Vol. 291, no. 5, pp. 1871-1883
Main Authors: Ramachandran, Dhanushya, Hawkins, Jennifer S.
Format: Journal Article
Language: English
Published: Berlin/Heidelberg: Springer Berlin Heidelberg, 01-10-2016
Springer Nature B.V.
Description
Summary: Transposable elements (TEs) are ubiquitous in eukaryotic genomes and their mobility impacts genome structure and function in myriad ways. Because of their abundance, activity, and repetitive nature, the characterization and analysis of TEs remain challenging, particularly from short-read sequencing projects. To overcome this difficulty, we have developed a method that estimates TE copy number from short-read sequences. To test the accuracy of our method, we first performed an in silico analysis of the reference Sorghum bicolor genome, using both reference-based and de novo approaches. The resulting TE copy number estimates were strikingly similar to the annotated numbers. We then tested our method on real short-read data by estimating TE copy numbers in several accessions of S. bicolor and its close relative S. propinquum. Both methods effectively identify and rank similar TE families from highest to lowest abundance. We found that de novo characterization was effective at capturing qualitative variation, but underestimated the abundance of some TE families, specifically families of more ancient origin. Also, interspecific reference-based mapping of S. propinquum reads to the S. bicolor database failed to fully describe TE content in S. propinquum, indicative of recent TE activity leading to changes in the respective repetitive landscapes over very short evolutionary timescales. We conclude that reference-based analyses are best suited for within-species comparisons, while de novo approaches are more reliable for evolutionarily distant comparisons.
ISSN: 1617-4615, 1617-4623
DOI: 10.1007/s00438-016-1225-9