Discriminative prediction of mammalian enhancers from DNA sequence

Bibliographic Details
Published in: Genome Research, Vol. 21, no. 12, pp. 2167-2180
Main Authors: Lee, Dongwon; Karchin, Rachel; Beer, Michael A
Format: Journal Article
Language: English
Published: United States: Cold Spring Harbor Laboratory Press, 2011-12-01
Abstract Accurately predicting regulatory sequences and enhancers in entire genomes is an important but difficult problem, especially in large vertebrate genomes. With the advent of ChIP-seq technology, experimental detection of genome-wide EP300/CREBBP bound regions provides a powerful platform to develop predictive tools for regulatory sequences and to study their sequence properties. Here, we develop a support vector machine (SVM) framework which can accurately identify EP300-bound enhancers using only genomic sequence and an unbiased set of general sequence features. Moreover, we find that the predictive sequence features identified by the SVM classifier reveal biologically relevant sequence elements enriched in the enhancers, but we also identify other features that are significantly depleted in enhancers. The predictive sequence features are evolutionarily conserved and spatially clustered, providing further support of their functional significance. Although our SVM is trained on experimental data, we also predict novel enhancers and show that these putative enhancers are significantly enriched in both ChIP-seq signal and DNase I hypersensitivity signal in the mouse brain and are located near relevant genes. Finally, we present results of comparisons between other EP300/CREBBP data sets using our SVM and uncover sequence elements enriched and/or depleted in the different classes of enhancers. Many of these sequence features play a role in specifying tissue-specific or developmental-stage-specific enhancer activity, but our results indicate that some features operate in a general or tissue-independent manner. In addition to providing a high confidence list of enhancer targets for subsequent experimental investigation, these results contribute to our understanding of the general sequence structure of vertebrate enhancers.
Author Beer, Michael A
Lee, Dongwon
Karchin, Rachel
AuthorAffiliation 1 Department of Biomedical Engineering, Johns Hopkins University, Baltimore, Maryland 21205, USA
2 Institute for Computational Medicine, Johns Hopkins University, Baltimore, Maryland 21218, USA
3 McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University, Baltimore, Maryland 21205, USA
Author_xml – sequence: 1
  givenname: Dongwon
  surname: Lee
  fullname: Lee, Dongwon
  organization: Department of Biomedical Engineering, Johns Hopkins University, Baltimore, Maryland 21205, USA
– sequence: 2
  givenname: Rachel
  surname: Karchin
  fullname: Karchin, Rachel
– sequence: 3
  givenname: Michael A
  surname: Beer
  fullname: Beer, Michael A
BackLink https://www.ncbi.nlm.nih.gov/pubmed/21875935$$D View this record in MEDLINE/PubMed
ContentType Journal Article
Copyright Copyright © 2011 by Cold Spring Harbor Laboratory Press
DOI 10.1101/gr.121905.111
DatabaseName MEDLINE
MEDLINE (Ovid)
PubMed
CrossRef
MEDLINE - Academic
Nucleic Acids Abstracts
Technology Research Database
Engineering Research Database
Biotechnology and BioEngineering Abstracts
Genetics Abstracts
PubMed Central (Full Participant titles)
DatabaseTitle MEDLINE
Medline Complete
MEDLINE with Full Text
PubMed
MEDLINE (Ovid)
CrossRef
MEDLINE - Academic
Genetics Abstracts
Engineering Research Database
Technology Research Database
Nucleic Acids Abstracts
Biotechnology and BioEngineering Abstracts
DatabaseTitleList MEDLINE
Genetics Abstracts
CrossRef

Database_xml – sequence: 1
  dbid: ECM
  name: MEDLINE
  url: https://search.ebscohost.com/login.aspx?direct=true&db=cmedm&site=ehost-live
  sourceTypes: Index Database
DeliveryMethod fulltext_linktorsrc
Discipline Anatomy & Physiology
Chemistry
Biology
DocumentTitleAlternate Lee et al
EISSN 1549-5469
EndPage 2180
ExternalDocumentID 10_1101_gr_121905_111
21875935
Genre Research Support, U.S. Gov't, Non-P.H.S
Research Support, Non-U.S. Gov't
Journal Article
Research Support, N.I.H., Extramural
GrantInformation_xml – fundername: NINDS NIH HHS
  grantid: NS062972
– fundername: NINDS NIH HHS
  grantid: R01 NS062972
ISSN 1088-9051
IsDoiOpenAccess false
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 12
Language English
OpenAccessLink https://genome.cshlp.org/content/21/12/2167.full.pdf
PMID 21875935
PQID 907999466
PQPubID 23479
PageCount 14
ParticipantIDs pubmedcentral_primary_oai_pubmedcentral_nih_gov_3227105
proquest_miscellaneous_915482283
proquest_miscellaneous_907999466
crossref_primary_10_1101_gr_121905_111
pubmed_primary_21875935
PublicationCentury 2000
PublicationDate 2011-12-01
PublicationDateYYYYMMDD 2011-12-01
PublicationDate_xml – month: 12
  year: 2011
  text: 2011-12-01
  day: 01
PublicationDecade 2010
PublicationPlace United States
PublicationPlace_xml – name: United States
PublicationTitle Genome research
PublicationTitleAlternate Genome Res
PublicationYear 2011
Publisher Cold Spring Harbor Laboratory Press
Publisher_xml – name: Cold Spring Harbor Laboratory Press
StartPage 2167
SubjectTerms Animals
Cerebral Cortex - cytology
Cerebral Cortex - metabolism
CREB-Binding Protein
Genome - physiology
Genome-Wide Association Study - methods
Method
Mice
Neurons - cytology
Neurons - metabolism
Oligonucleotide Array Sequence Analysis - methods
Organ Specificity - physiology
Response Elements - physiology
Sequence Analysis, DNA
Title Discriminative prediction of mammalian enhancers from DNA sequence
URI https://www.ncbi.nlm.nih.gov/pubmed/21875935
https://search.proquest.com/docview/907999466
https://search.proquest.com/docview/915482283
https://pubmed.ncbi.nlm.nih.gov/PMC3227105
Volume 21