Parallelisation of maximal patterns finding algorithm in biological sequences

Bibliographic Details
Published in:	2016 3rd International Conference on Computer and Information Sciences (ICCOINS) pp. 227 - 232
Main Authors:	Hussein, Ahmad MohdAziz; Rashid, Nuraini Abdul; Abdulah, Rosni
Format:	Conference Proceeding
Language:	English
Published:	IEEE 01-08-2016
Subjects:	Algorithm design and analysis; Computers; discovery motifs; finding patterns; finding sequence similarities; Instruction sets; Partitioning algorithms; pattern matching; Proteins; Teiresias

Description
Summary:	The rapid growth of biological data challenges scientists to find new methods to manage, analyse and understand it effectively. One way to analyse these data is to look at the maximal patterns that exist in them. Discovering the relationships among biological sequences is based on the importance of the maximal patterns of these sequences. These maximal patterns can also be used to build indexes for faster search. In this research, we used parallel methods to improve the speed of an existing maximal pattern finding algorithm, TEIRESIAS. The algorithm has two phases: scanning and convolution. The first phase detects short patterns in the biological data, and the second phase combines the short patterns into longer patterns without sacrificing their meaning. The output is the set of maximal patterns. The first phase of the algorithm is very compute-intensive. We improve the overall process of finding maximal patterns by decomposing the biological database and distributing the parts as input to the TEIRESIAS algorithm. We applied the master-slave model and used OpenMP to implement it. Our results show that performance decreased when we used 8 threads. They also show 1.6-times and 2.0-times improvements in the overall speed of the algorithm when we used two and four threads, respectively.
DOI:	10.1109/ICCOINS.2016.7783219
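
Illustration:	The decomposition described in the summary can be sketched roughly as follows. This is not the authors' implementation. It assumes a Python thread pool in place of OpenMP threads, and it counts only fixed-length substrings without wildcards, which is a much simpler stand-in for the TEIRESIAS scanning phase. The convolution phase is omitted. The names partition, scan_chunk and parallel_scan are hypothetical.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def partition(sequences, n_chunks):
    """Split the sequence database into n_chunks round-robin chunks.

    Each sequence keeps its original index so support can be merged later.
    """
    indexed = list(enumerate(sequences))
    return [indexed[i::n_chunks] for i in range(n_chunks)]


def scan_chunk(chunk, length):
    """Worker: map each substring of the given length to the ids of the
    sequences in this chunk that contain it (simplified scanning phase)."""
    support = defaultdict(set)
    for seq_id, seq in chunk:
        for start in range(len(seq) - length + 1):
            support[seq[start:start + length]].add(seq_id)
    return support


def parallel_scan(sequences, length, min_support, n_workers):
    """Master: distribute chunks to workers, merge their partial results,
    and keep patterns that occur in at least min_support sequences."""
    chunks = partition(sequences, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial_results = list(pool.map(lambda c: scan_chunk(c, length), chunks))

    merged = defaultdict(set)
    for partial in partial_results:
        for pattern, seq_ids in partial.items():
            merged[pattern] |= seq_ids
    return {p: len(ids) for p, ids in merged.items() if len(ids) >= min_support}


if __name__ == "__main__":
    database = ["ACGTAC", "TTACGT", "GGGACG"]
    print(parallel_scan(database, length=3, min_support=2, n_workers=2))
```

Because support is merged by sequence id, the result does not depend on the number of workers. For reference, dividing the reported speedups by the thread count gives a parallel efficiency of 0.8 on two threads (1.6 / 2) and 0.5 on four threads (2.0 / 4).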