Discriminative prediction of mammalian enhancers from DNA sequence
Published in: | Genome research Vol. 21; no. 12; pp. 2167 - 2180 |
---|---|
Main Authors: | Lee, Dongwon; Karchin, Rachel; Beer, Michael A |
Format: | Journal Article |
Language: | English |
Published: | United States: Cold Spring Harbor Laboratory Press, 01-12-2011 |
Abstract | Accurately predicting regulatory sequences and enhancers in entire genomes is an important but difficult problem, especially in large vertebrate genomes. With the advent of ChIP-seq technology, experimental detection of genome-wide EP300/CREBBP bound regions provides a powerful platform to develop predictive tools for regulatory sequences and to study their sequence properties. Here, we develop a support vector machine (SVM) framework which can accurately identify EP300-bound enhancers using only genomic sequence and an unbiased set of general sequence features. Moreover, we find that the predictive sequence features identified by the SVM classifier reveal biologically relevant sequence elements enriched in the enhancers, but we also identify other features that are significantly depleted in enhancers. The predictive sequence features are evolutionarily conserved and spatially clustered, providing further support of their functional significance. Although our SVM is trained on experimental data, we also predict novel enhancers and show that these putative enhancers are significantly enriched in both ChIP-seq signal and DNase I hypersensitivity signal in the mouse brain and are located near relevant genes. Finally, we present results of comparisons between other EP300/CREBBP data sets using our SVM and uncover sequence elements enriched and/or depleted in the different classes of enhancers. Many of these sequence features play a role in specifying tissue-specific or developmental-stage-specific enhancer activity, but our results indicate that some features operate in a general or tissue-independent manner. In addition to providing a high confidence list of enhancer targets for subsequent experimental investigation, these results contribute to our understanding of the general sequence structure of vertebrate enhancers. |
---|---|
Author | Beer, Michael A; Lee, Dongwon; Karchin, Rachel |
AuthorAffiliation | 1 Department of Biomedical Engineering, Johns Hopkins University, Baltimore, Maryland 21205, USA; 2 Institute for Computational Medicine, Johns Hopkins University, Baltimore, Maryland 21218, USA; 3 McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University, Baltimore, Maryland 21205, USA |
AuthorAffiliation_xml | – name: 2 Institute for Computational Medicine, Johns Hopkins University, Baltimore, Maryland 21218, USA – name: 1 Department of Biomedical Engineering, Johns Hopkins University, Baltimore, Maryland 21205, USA – name: 3 McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University, Baltimore, Maryland 21205, USA |
Author_xml | – sequence: 1 givenname: Dongwon surname: Lee fullname: Lee, Dongwon organization: Department of Biomedical Engineering, Johns Hopkins University, Baltimore, Maryland 21205, USA – sequence: 2 givenname: Rachel surname: Karchin fullname: Karchin, Rachel – sequence: 3 givenname: Michael A surname: Beer fullname: Beer, Michael A |
BackLink | https://www.ncbi.nlm.nih.gov/pubmed/21875935$$D View this record in MEDLINE/PubMed |
ContentType | Journal Article |
Copyright | Copyright © 2011 by Cold Spring Harbor Laboratory Press 2011 |
Copyright_xml | – notice: Copyright © 2011 by Cold Spring Harbor Laboratory Press 2011 |
DBID | CGR CUY CVF ECM EIF NPM AAYXX CITATION 7X8 7TM 8FD FR3 P64 RC3 5PM |
DOI | 10.1101/gr.121905.111 |
DatabaseName | Medline MEDLINE MEDLINE (Ovid) MEDLINE MEDLINE PubMed CrossRef MEDLINE - Academic Nucleic Acids Abstracts Technology Research Database Engineering Research Database Biotechnology and BioEngineering Abstracts Genetics Abstracts PubMed Central (Full Participant titles) |
DatabaseTitle | MEDLINE Medline Complete MEDLINE with Full Text PubMed MEDLINE (Ovid) CrossRef MEDLINE - Academic Genetics Abstracts Engineering Research Database Technology Research Database Nucleic Acids Abstracts Biotechnology and BioEngineering Abstracts |
DatabaseTitleList | MEDLINE Genetics Abstracts CrossRef |
Database_xml | – sequence: 1 dbid: ECM name: MEDLINE url: https://search.ebscohost.com/login.aspx?direct=true&db=cmedm&site=ehost-live sourceTypes: Index Database |
DeliveryMethod | fulltext_linktorsrc |
Discipline | Anatomy & Physiology Chemistry Biology |
DocumentTitleAlternate | Lee et al |
EISSN | 1549-5469 |
EndPage | 2180 |
ExternalDocumentID | 10_1101_gr_121905_111 21875935 |
Genre | Research Support, U.S. Gov't, Non-P.H.S.; Research Support, Non-U.S. Gov't; Journal Article; Research Support, N.I.H., Extramural |
GrantInformation_xml | – fundername: NINDS NIH HHS grantid: NS062972 – fundername: NINDS NIH HHS grantid: R01 NS062972 |
IEDL.DBID | RPM |
ISSN | 1088-9051 |
IngestDate | Tue Sep 17 21:21:32 EDT 2024 Fri Oct 25 22:52:59 EDT 2024 Fri Oct 25 11:21:06 EDT 2024 Thu Sep 12 16:29:39 EDT 2024 Tue Oct 15 23:42:28 EDT 2024 |
IsDoiOpenAccess | false |
IsOpenAccess | true |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 12 |
Language | English |
LinkModel | DirectLink |
OpenAccessLink | https://genome.cshlp.org/content/21/12/2167.full.pdf |
PMID | 21875935 |
PQID | 907999466 |
PQPubID | 23479 |
PageCount | 14 |
ParticipantIDs | pubmedcentral_primary_oai_pubmedcentral_nih_gov_3227105 proquest_miscellaneous_915482283 proquest_miscellaneous_907999466 crossref_primary_10_1101_gr_121905_111 pubmed_primary_21875935 |
PublicationCentury | 2000 |
PublicationDate | 2011-12-01 |
PublicationDateYYYYMMDD | 2011-12-01 |
PublicationDate_xml | – month: 12 year: 2011 text: 2011-12-01 day: 01 |
PublicationDecade | 2010 |
PublicationPlace | United States |
PublicationPlace_xml | – name: United States |
PublicationTitle | Genome research |
PublicationTitleAlternate | Genome Res |
PublicationYear | 2011 |
Publisher | Cold Spring Harbor Laboratory Press |
Publisher_xml | – name: Cold Spring Harbor Laboratory Press |
StartPage | 2167 |
SubjectTerms | Animals; Cerebral Cortex - cytology; Cerebral Cortex - metabolism; CREB-Binding Protein; Genome - physiology; Genome-Wide Association Study - methods; Method; Mice; Neurons - cytology; Neurons - metabolism; Oligonucleotide Array Sequence Analysis - methods; Organ Specificity - physiology; Response Elements - physiology; Sequence Analysis, DNA |
Title | Discriminative prediction of mammalian enhancers from DNA sequence |
URI | https://www.ncbi.nlm.nih.gov/pubmed/21875935 https://search.proquest.com/docview/907999466 https://search.proquest.com/docview/915482283 https://pubmed.ncbi.nlm.nih.gov/PMC3227105 |
Volume | 21 |