Computational Prediction of c-MYC Binding and Action by Integration of Multiple Data Sources
Published in: | Blood Vol. 108; no. 11; p. 4345 |
---|---|
Main Authors: | , , , , , , |
Format: | Journal Article |
Language: | English |
Published: | Elsevier Inc, 16-11-2006 |
Summary: | c-MYC is an important proto-oncogene. Its actions are mediated by sequence-specific binding of the c-MYC protein to genomic DNA. While many c-MYC recognition sites can be identified in c-MYC-responsive genes, many others are associated with genes showing no c-MYC response. It is not yet known how the cell determines which of the many c-MYC recognition sites are biologically active and directly bind c-MYC protein to regulate gene expression. We have developed a computational model that predicts c-MYC binding and functional activation as distinct processes. Our model integrates four types of evidence to predict functional c-MYC targets: genomic sequence, MYC binding, gene expression, and gene function annotations. First, a Bayesian network classifier uses several types of sequence information, including the likelihood of genomic DNA methylation estimated by a computational model, to predict which c-MYC recognition sites are likely to exhibit high-occupancy binding in chromatin immunoprecipitation studies. Second, the DNA binding probability of MYC is combined with gene expression information from 9 independent microarray datasets covering multiple tissues and with gene function annotations from Gene Ontology to predict c-MYC targets. The predictions were compared with the c-MYC targets in the public MYC target database [www.myccancergene.org], which collects c-MYC targets identified in the biomedical literature. In total, we predicted 599 likely c-MYC target genes in the human genome, of which 73 have been reported to be both bound and regulated by MYC, 83 have been reported only as bound by MYC in vivo, and another 93 only as MYC regulated. The approach thus successfully identified many known c-MYC targets and suggested many novel sites, including many that are remote from the transcription start site.
Our findings suggest that any study based on a single high-throughput dataset is likely to be insufficient for identifying c-MYC genomic targets. Using multiple gene expression datasets helps improve sensitivity, and integrating different data sources helps improve specificity.
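Summary of c-MYC Targets Prediction

Microarray Dataset | Data Source (Citation) | Tissue | Predicted Targets | Binding & Regulation Reported | Only Binding Reported | Only Regulation Reported |
---|---|---|---|---|---|---|
1 | PMID: 15778709 | B Cell | 421 | 61 | 60 | 56 |
2 | PMID: 12086878 | Prostate Cancer | 428 | 56 | 65 | 76 |
3 | PMID: 14722351 | Prostate Cancer | 50 | 4 | 7 | 13 |
4 | PMID: 15254046 | Prostate Cancer | 66 | 19 | 8 | 14 |
5 | PMID: 12747878 | Breast Cancer | 17 | 1 | 3 | 5 |
6 | PMID: 11707567 | Lung Cancer | 295 | 51 | 42 | 59 |
7 | PMID: 15820940 | CML | 8 | 1 | 1 | 2 |
8 | PMID: 12704389 | ALL | 222 | 45 | 32 | 46 |
9 | PMID: 11731795 | ALL / MLL / AML | 22 | 6 | 1 | 6 |
Total | | | 599 | 73 | 83 | 93 |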
---|---|
ISSN: | 0006-4971 1528-0020 |
DOI: | 10.1182/blood.V108.11.4345.4345 |
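
The Summary compares the predicted targets with a literature-derived target database and reports how many fall into each support category. A minimal Python sketch of that bookkeeping follows. It is not the authors' code: the function name `categorize_predictions` and the toy gene identifiers are hypothetical, and it assumes that "only binding" and "only regulation" are mutually exclusive with "binding and regulation", as the table headers suggest.

```python
def categorize_predictions(predicted, known_bound, known_regulated):
    """Split predicted genes by the kind of prior literature support they have.

    All three arguments are iterables of gene identifiers. The category
    names mirror the table headers; the function itself is illustrative.
    """
    predicted = set(predicted)
    bound = predicted & set(known_bound)          # predicted and reported as bound
    regulated = predicted & set(known_regulated)  # predicted and reported as regulated
    return {
        "binding_and_regulation": bound & regulated,
        "only_binding": bound - regulated,
        "only_regulation": regulated - bound,
        "novel": predicted - bound - regulated,
    }


# Toy gene lists (placeholders, not taken from the study).
predicted = {"GENE_A", "GENE_B", "GENE_C", "GENE_D", "GENE_E"}
known_bound = {"GENE_A", "GENE_B", "GENE_X"}
known_regulated = {"GENE_A", "GENE_C", "GENE_Y"}

groups = categorize_predictions(predicted, known_bound, known_regulated)
counts = {name: len(genes) for name, genes in groups.items()}
print(counts)
```

Under the same mutual-exclusivity assumption, the reported totals give 73 + 83 + 93 = 249 of the 599 predicted genes with some prior literature support, which would leave 350 without any.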