A clustering model for identification of time course gene expression patterns

Bibliographic Details
Published in: 2016 1st International Conference on Biomedical Engineering (IBIOMED), pp. 1-6
Main Authors: Ochieng, Peter Juma, Tarigan, Sri Ita, Didik, Hendrik
Format: Conference Proceeding
Language: English
Published: IEEE 01-10-2016
Abstract Identifying gene expression patterns is critical when studying complex and dynamic biological processes such as gene regulatory functions. Gene expression is a continuous biological phenomenon and can be represented by a continuous function (curve). Genes that follow such continuous functions often share similar functional forms. However, the number and shape of these patterns, and the identities of the genes sharing similar functional forms, remain unknown. To identify such functional forms, we introduce a clustering model for identification of time course gene expression patterns. The method uses an S-spline approach to model the functional curves and a penalized log-likelihood approach to fit the model. In addition, a rejection-controlled EM algorithm is designed to minimize the error and computational cost during mean curve estimation. Furthermore, the method uses generalized cross-validation to select smoothing parameters and the Bayesian information criterion to measure clustering uncertainty. The method is illustrated by its application to D. melanogaster life cycle datasets. Simulation results indicate that our method accurately recovers the true functional forms: it assigns each gene to a cluster, predicts the mean expression curve, and provides associated 95% confidence bands for each cluster. Based on Gene Ontology term descriptions, the estimated mean curve in each cluster reflects true gene functional annotations with biologically meaningful gene expression patterns. Finally, a comparison of clustering performance indicates that our method outperforms fuzzy c-means and k-means, with a misclassification rate of 0.1289 and an overall success rate of 98.71%.
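The abstract does not spell out how the rejection-controlled EM step works, so the following is only a minimal sketch of the generic rejection-control idea applied to E-step cluster membership probabilities, not the authors' implementation. The function name `rejection_control`, the `threshold` parameter, and the fallback for fully rejected rows are all assumptions made for illustration. Small membership probabilities are either dropped to zero or raised to the threshold, which sparsifies the weights and can reduce the cost of the subsequent mean curve update.

```python
import numpy as np


def rejection_control(posteriors, threshold, rng):
    """Sparsify E-step membership probabilities (hypothetical helper).

    posteriors: (n_genes, n_clusters) array whose rows sum to 1.
    threshold:  assumed cutoff c in (0, 1]; not taken from the paper.
    rng:        a numpy random Generator.
    """
    p = np.asarray(posteriors, dtype=float)
    # Entries at or above the threshold are kept with probability 1;
    # smaller entries survive with probability p / threshold.
    keep_prob = np.minimum(1.0, p / threshold)
    accepted = rng.random(p.shape) < keep_prob
    # Surviving small entries are raised to the threshold; rejected ones become 0.
    adjusted = np.where(accepted, np.maximum(p, threshold), 0.0)
    # Assumed fallback: if every entry in a row was rejected, keep the original row.
    empty = adjusted.sum(axis=1) == 0.0
    adjusted[empty] = p[empty]
    # Renormalize so each gene's weights sum to 1 again.
    return adjusted / adjusted.sum(axis=1, keepdims=True)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy membership probabilities for two genes and three clusters.
    toy = np.array([[0.97, 0.02, 0.01],
                    [0.50, 0.49, 0.01]])
    print(rejection_control(toy, threshold=0.05, rng=rng))
```

In a full EM loop, the sparsified weights would replace the raw posteriors in the M-step, so that clusters with zero weight for a gene can be skipped when refitting the mean curves.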
Author Ochieng, Peter Juma
Tarigan, Sri Ita
Didik, Hendrik
Author_xml – sequence: 1
  givenname: Peter Juma
  surname: Ochieng
  fullname: Ochieng, Peter Juma
  email: peter26juma@gmail.com
  organization: Dept. of Comput. Sci., IPB Univ., Bogor, Indonesia
– sequence: 2
  givenname: Sri Ita
  surname: Tarigan
  fullname: Tarigan, Sri Ita
  email: pendasri@gmail.com
  organization: Dept. of Entomology, IPB Univ., Bogor, Indonesia
– sequence: 3
  givenname: Hendrik
  surname: Didik
  fullname: Didik, Hendrik
  email: drikdoank@gmail.com
  organization: Dept. of Comput. Sci., IPB Univ., Bogor, Indonesia
ContentType Conference Proceeding
DBID 6IE
6IL
CBEJK
RIE
RIL
DOI 10.1109/IBIOMED.2016.7869819
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume
IEEE Xplore All Conference Proceedings
IEEE Electronic Library Online
IEEE Proceedings Order Plans (POP All) 1998-Present
DatabaseTitleList
Database_xml – sequence: 1
  dbid: RIE
  name: IEEE Electronic Library Online
  url: http://ieeexplore.ieee.org/Xplore/DynWel.jsp
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
Discipline Engineering
EISBN 1509041427
9781509041428
EndPage 6
ExternalDocumentID 7869819
Genre orig-research
GroupedDBID 6IE
6IF
6IK
6IL
6IN
AAJGR
ALMA_UNASSIGNED_HOLDINGS
BEFXN
BFFAM
BGNUA
BKEBE
BPEOZ
CBEJK
IEGSK
OCL
RIE
RIL
ID FETCH-LOGICAL-i90t-75e419ca9bc094a29b69c89a5a41ebc6a0b3e0c4c20c475016b40d5fcb8d9dcc3
IEDL.DBID RIE
IngestDate Thu Jun 29 18:37:48 EDT 2023
IsPeerReviewed false
IsScholarly false
Language English
LinkModel DirectLink
MergedId FETCHMERGED-LOGICAL-i90t-75e419ca9bc094a29b69c89a5a41ebc6a0b3e0c4c20c475016b40d5fcb8d9dcc3
PageCount 6
ParticipantIDs ieee_primary_7869819
PublicationCentury 2000
PublicationDate 2016-Oct.
PublicationDateYYYYMMDD 2016-10-01
PublicationDate_xml – month: 10
  year: 2016
  text: 2016-Oct.
PublicationDecade 2010
PublicationTitle 2016 1st International Conference on Biomedical Engineering (IBIOMED)
PublicationTitleAbbrev IBIOMED
PublicationYear 2016
Publisher IEEE
Publisher_xml – name: IEEE
SSID ssj0001968138
Score 1.6600026
SourceID ieee
SourceType Publisher
StartPage 1
SubjectTerms Algorithm design and analysis
Biological system modeling
Biomedical engineering
Clustering algorithms
Computational modeling
Gene expression
RCEM
S-Spline
Title A clustering model for identification of time course gene expression patterns
URI https://ieeexplore.ieee.org/document/7869819
hasFullText 1
inHoldings 1
isFullTextHit
isPrint