A clustering model for identification of time course gene expression patterns
Published in: | 2016 1st International Conference on Biomedical Engineering (IBIOMED) pp. 1 - 6 |
---|---|
Main Authors: | Ochieng, Peter Juma; Tarigan, Sri Ita; Didik, Hendrik |
Format: | Conference Proceeding |
Language: | English |
Published: | IEEE, 01-10-2016 |
Abstract | Identifying gene expression patterns is critical when studying complex and dynamic biological processes such as gene regulatory functions. Gene expression is a continuous biological phenomenon and can be represented by a continuous function (curve). Genes that follow such continuous functions often share similar functional forms. However, the number and shape of these forms, and the identities of the genes sharing them, remain unknown. To identify such functional forms, we introduce a clustering model for identification of time course gene expression patterns. The method uses an S-spline approach to model the functional curves and a penalized log-likelihood approach to fit the model. In addition, a rejection-controlled EM algorithm is designed to minimize the error and computational cost during mean curve estimation. Furthermore, the method uses generalized cross-validation to select smoothing parameters and measures the clustering uncertainty using the Bayesian information criterion. The method is illustrated by its application to D. melanogaster life cycle datasets. Simulation results indicate that our method accurately estimates the mean expression curve relative to the true functional forms by assigning each gene to a cluster, predicting the mean curve, and providing associated 95% confidence bands for each cluster. Based on Gene Ontology term descriptions, the estimated mean curve in each cluster reflects true gene functional annotations with biologically meaningful gene expression patterns. Finally, a comparison of clustering performance indicates that our method outperforms Fuzzy c-Means and K-Means, with a misclassification rate of 0.1289 and an overall success rate of 98.71%. |
---|---|
Author | Ochieng, Peter Juma; Tarigan, Sri Ita; Didik, Hendrik |
Author_xml | – sequence: 1 givenname: Peter Juma surname: Ochieng fullname: Ochieng, Peter Juma email: peter26juma@gmail.com organization: Dept. of Comput. Sci., IPB Univ., Bogor, Indonesia – sequence: 2 givenname: Sri Ita surname: Tarigan fullname: Tarigan, Sri Ita email: pendasri@gmail.com organization: Dept. of Entomology, IPB Univ., Bogor, Indonesia – sequence: 3 givenname: Hendrik surname: Didik fullname: Didik, Hendrik email: drikdoank@gmail.com organization: Dept. of Comput. Sci., IPB Univ., Bogor, Indonesia |
ContentType | Conference Proceeding |
DBID | 6IE 6IL CBEJK RIE RIL |
DOI | 10.1109/IBIOMED.2016.7869819 |
DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings; IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume; IEEE Xplore All Conference Proceedings; IEEE Electronic Library Online; IEEE Proceedings Order Plans (POP All) 1998-Present |
DatabaseTitleList | |
Database_xml | – sequence: 1 dbid: RIE name: IEEE Electronic Library Online url: http://ieeexplore.ieee.org/Xplore/DynWel.jsp sourceTypes: Publisher |
DeliveryMethod | fulltext_linktorsrc |
Discipline | Engineering |
EISBN | 1509041427 9781509041428 |
EndPage | 6 |
ExternalDocumentID | 7869819 |
Genre | orig-research |
GroupedDBID | 6IE 6IF 6IK 6IL 6IN AAJGR ALMA_UNASSIGNED_HOLDINGS BEFXN BFFAM BGNUA BKEBE BPEOZ CBEJK IEGSK OCL RIE RIL |
ID | FETCH-LOGICAL-i90t-75e419ca9bc094a29b69c89a5a41ebc6a0b3e0c4c20c475016b40d5fcb8d9dcc3 |
IEDL.DBID | RIE |
IngestDate | Thu Jun 29 18:37:48 EDT 2023 |
IsPeerReviewed | false |
IsScholarly | false |
Language | English |
LinkModel | DirectLink |
MergedId | FETCHMERGED-LOGICAL-i90t-75e419ca9bc094a29b69c89a5a41ebc6a0b3e0c4c20c475016b40d5fcb8d9dcc3 |
PageCount | 6 |
ParticipantIDs | ieee_primary_7869819 |
PublicationCentury | 2000 |
PublicationDate | 2016-Oct. |
PublicationDateYYYYMMDD | 2016-10-01 |
PublicationDate_xml | – month: 10 year: 2016 text: 2016-Oct. |
PublicationDecade | 2010 |
PublicationTitle | 2016 1st International Conference on Biomedical Engineering (IBIOMED) |
PublicationTitleAbbrev | IBIOMED |
PublicationYear | 2016 |
Publisher | IEEE |
Publisher_xml | – name: IEEE |
SSID | ssj0001968138 |
Score | 1.6600026 |
SourceID | ieee |
SourceType | Publisher |
StartPage | 1 |
SubjectTerms | Algorithm design and analysis; Biological system modeling; Biomedical engineering; Clustering algorithms; Computational modeling; Gene expression; RCEM; S-Spline |
Title | A clustering model for identification of time course gene expression patterns |
URI | https://ieeexplore.ieee.org/document/7869819 |
hasFullText | 1 |
inHoldings | 1 |
isFullTextHit | |
isPrint | |
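
The abstract names a rejection-controlled EM algorithm but this record gives no details of it. As a rough illustration only, the sketch below assumes one common form of rejection control: within an E-step, a cluster membership probability below a threshold is either raised to the threshold or dropped to zero at random, so that its expected value is unchanged and small memberships need not be carried into the M-step. The function name `rejection_control` and the parameters `threshold` and `rng` are hypothetical and are not taken from the paper.

```python
import random


def rejection_control(posteriors, threshold, rng):
    """Apply one rejection-control step to a gene's membership probabilities.

    Assumed scheme (not confirmed by the record): a probability p at or above
    `threshold` is kept; a smaller p becomes `threshold` with probability
    p / threshold and 0 otherwise. The result is renormalized to sum to 1.
    """
    adjusted = []
    for p in posteriors:
        if p >= threshold:
            adjusted.append(p)
        elif rng.random() < p / threshold:
            adjusted.append(threshold)
        else:
            adjusted.append(0.0)
    total = sum(adjusted)
    if total == 0.0:
        # Every membership was rejected; fall back to the original values.
        return list(posteriors)
    return [p / total for p in adjusted]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs match
    print(rejection_control([0.90, 0.07, 0.03], threshold=0.1, rng=rng))
```

Memberships set to zero can be skipped when the cluster mean curves are re-estimated, which is where a reduction in computational cost would come from under this assumption.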