GenBank and PubMed: How connected are they?
GenBank is a public repository of all publicly available molecular sequence data from a range of sources. In addition to relevant metadata (e.g., sequence description, source organism and taxonomy), publication information is recorded in the GenBank data file. The identification of literature associ...
Saved in:
Published in: | BMC research notes Vol. 2; no. 1; p. 101 |
---|---|
Main Authors: | , , |
Format: | Journal Article |
Language: | English |
Published: |
England
BioMed Central Ltd
09-06-2009
BioMed Central BMC |
Subjects: | |
Online Access: | Get full text |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Summary: | GenBank is a public repository of all publicly available molecular sequence data from a range of sources. In addition to relevant metadata (e.g., sequence description, source organism and taxonomy), publication information is recorded in the GenBank data file. The identification of literature associated with a given molecular sequence may be an essential first step in developing research hypotheses. Although many of the publications associated with GenBank records may not be linked into or part of complementary literature databases (e.g., PubMed), GenBank records associated with literature indexed in Medline are identifiable as they contain PubMed identifiers (PMIDs).
Here we show that an analysis of 87,116,501 GenBank sequence files reveals that 42% are associated with a publication or patent. Of these, 71% are associated with PMIDs, and can therefore be linked to a citation record in the PubMed database. The remaining (29%) of publication-associated GenBank entries either do not have PMIDs or cite a publication that is not currently indexed by PubMed. We also identify the journal titles that are linked through citations in the GenBank files to the largest number of sequences.
Our analysis suggests that GenBank contains molecular sequences from a range of disciplines beyond biomedicine, the initial scope of PubMed. The findings thus suggest opportunities to develop mechanisms for integrating biological knowledge beyond the biomedical field. |
---|---|
Bibliography: | ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 23 |
ISSN: | 1756-0500 1756-0500 |
DOI: | 10.1186/1756-0500-2-101 |