Worst-Case and Practical Speedups of RNA Secondary Structure Prediction Algorithms Using the Four-Russians Method
Main Author: | |
---|---|
Format: | Dissertation |
Language: | English |
Subjects: | |
Summary: | Non-coding RNAs (ncRNAs) affect many aspects of gene expression: regulation of epigenetic processes, transcription, splicing, and translation. In eukaryotic genomes, ncRNA function has been observed to be more evident from structure. While methods that determine structure experimentally have advanced, the need for computational prediction grows as the gap between the number of available sequences and the number of known structures widens. Since RNA folding is a hierarchical process in which tertiary structure forms on top of thermodynamically optimal (or close to optimal) secondary structure, secondary structure is a key component of structure prediction. The problem of computationally predicting the secondary structure (or folding) of RNA molecules was first introduced more than thirty years ago. Since then, much work has been published on improving the computation of the set of dynamic programs that predict RNA structure. Yet the analysis presented in this thesis is the first to achieve both a practical and an asymptotic speedup in computation. The application of the Four-Russians speedup in this thesis reduces the computation time by an O(log n) factor, for a sequence of length n, while retaining the original algorithm's dynamic programming solution matrix. As a result, the contribution of the speedup lies not only in its stand-alone practicality but also in its ability to be implemented alongside heuristic speedups, leading to even greater reductions in time. By encoding a subset of all possible optimal structures and interleaving preprocessing with computation, the analysis presented in this thesis makes an even greater speedup of O(log² n) possible. This thesis also presents the combination of the O(log² n) speedup method with the heuristic speedup provided by Sparsification. The combination of the Sparsification and Four-Russians methods achieves a greater speedup than either method alone. 
While this thesis focuses on pseudoknot-free RNA secondary structure prediction, comparative secondary structure prediction, and structure prediction that allows for pseudoknots, the framework presented here could be applied to other secondary structure prediction formulations. |
---|---|
Bibliography: | Adviser: Dan Gusfield. Source: Dissertation Abstracts International, Volume: 76-07(E), Section: B. Computer Science. |
ISBN: | 9781321608540 1321608543 |
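
For readers unfamiliar with the "set of dynamic programs" the summary refers to, the following is a minimal sketch of one classic formulation, the Nussinov base-pair maximization recurrence, which runs in O(n^3) time. It is offered as an illustration only: the record does not state which recurrences the thesis uses, and the names `PAIRS`, `can_pair`, `nussinov`, and the `min_loop` parameter are assumptions introduced here. The sketch shows only the unaccelerated baseline; it does not implement the Four-Russians speedup or Sparsification.

```python
# Illustrative baseline only: classic O(n^3) base-pair maximization
# for pseudoknot-free secondary structure (Nussinov-style recurrence).
# Names and the pairing rule set are assumptions made for this sketch.

PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def can_pair(a, b):
    """Return True if bases a and b may form a pair under the assumed rules."""
    return (a, b) in PAIRS


def nussinov(seq, min_loop=0):
    """Maximum number of non-crossing base pairs in seq.

    min_loop is the minimum number of unpaired bases required between
    the two ends of a pair (a hypothetical parameter for this sketch).
    """
    n = len(seq)
    if n == 0:
        return 0
    # D[i][j] holds the best score for the subsequence seq[i..j].
    D = [[0] * n for _ in range(n)]
    for span in range(1, n):
        for i in range(n - span):
            j = i + span
            best = D[i][j - 1]  # case 1: position j is unpaired
            # case 2: position j pairs with some k, splitting the interval
            for k in range(i, j - min_loop):
                if can_pair(seq[k], seq[j]):
                    left = D[i][k - 1] if k > i else 0
                    inner = D[k + 1][j - 1] if k + 1 <= j - 1 else 0
                    best = max(best, left + inner + 1)
            D[i][j] = best
    return D[0][n - 1]


if __name__ == "__main__":
    print(nussinov("GGGAAACCC"))  # 3
```

The innermost loop over `k` is what makes the baseline cubic. Because the function fills the full matrix `D`, it also illustrates the property the summary emphasizes: a speedup that retains the dynamic programming solution matrix can be layered with other techniques that read from it.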