Generation and Initial Analysis of More Than 15,000 Full-Length Human and Mouse cDNA Sequences

Bibliographic Details
Published in: Proceedings of the National Academy of Sciences - PNAS, Vol. 99, no. 26, pp. 16899-16903
Main Authors: Strausberg, Robert L, Feingold, Elise A, Grouse, Lynette H, Derge, Jeffery G, Klausner, Richard D, Collins, Francis S, Wagner, Lukas, Shenmen, Carolyn M, Schuler, Gregory D, Altschul, Stephen F, Zeeberg, Barry, Buetow, Kenneth H, Schaefer, Carl F, Bhat, Narayan K, Hopkins, Ralph F, Jordan, Heather, Moore, Troy, Max, Steve I, Wang, Jun, Hsieh, Florence, Diatchenko, Luda, Marusina, Kate, Farmer, Andrew A, Rubin, Gerald M, Hong, Ling, Stapleton, Mark, Soares, M Bento, Bonaldo, Maria F, Casavant, Tom L, Scheetz, Todd E, Brownstein, Michael J, Usdin, Ted B, Toshiyuki, Shiraki, Carninci, Piero, Prange, Christa, Raha, Sam S, Loquellano, Naomi A, Peters, Garrick J, Abramson, Rick D, Mullahy, Sara J, Bosak, Stephanie A, McEwan, Paul J, McKernan, Kevin J, Malek, Joel A, Gunaratne, Preethi H, Richards, Stephen, Worley, Kim C, Hale, Sarah, Garcia, Angela M, Gay, Laura J, Hulyk, Stephen W, Villalon, Debbie K, Muzny, Donna M, Sodergren, Erica J, Lu, Xiuhua, Gibbs, Richard A, Fahey, Jessica, Helton, Erin, Ketteman, Mark, Madan, Anuradha, Rodrigues, Stephanie, Sanchez, Amy, Whiting, Michelle, Madan, Anup, Young, Alice C, Shevchenko, Yuriy, Bouffard, Gerard G, Blakesley, Robert W, Touchman, Jeffrey W, Green, Eric D, Dickson, Mark C, Rodriguez, Alex C, Grimwood, Jane, Schmutz, Jeremy, Myers, Richard M, Butterfield, Yaron S N, Krzywinski, Martin I, Skalska, Ursula, Smailus, Duane E, Schnerch, Angelique, Schein, Jacqueline E, Jones, Steven J M, Marra, Marco A
Format: Journal Article
Language: English
Published: United States, National Academy of Sciences, 24-12-2002
Description
Summary: The National Institutes of Health Mammalian Gene Collection (MGC) Program is a multiinstitutional effort to identify and sequence a cDNA clone containing a complete ORF for each human and mouse gene. ESTs were generated from libraries enriched for full-length cDNAs and analyzed to identify candidate full-ORF clones, which then were sequenced to high accuracy. The MGC has currently sequenced and verified the full ORF for a nonredundant set of >9,000 human and >6,000 mouse genes. Candidate full-ORF clones for an additional 7,800 human and 3,500 mouse genes also have been identified. All MGC sequences and clones are available without restriction through public databases and clone distribution networks (see http://mgc.nci.nih.gov).
Bibliography:
Author affiliations:
(a) National Cancer Institute, NIH, 31 Center Drive, Bethesda, MD 20892-2580
(c) National Human Genome Research Institute, NIH, 31 Center Drive, Bethesda, MD 20892-2580
(e) National Center for Biotechnology Information, National Library of Medicine, NIH, Building 38A, Bethesda, MD 20894
(f) National Cancer Institute, Center for Bioinformatics, 6116 Executive Boulevard, Rockville, MD 20892
(d) Science Applications International Corporation (SAIC)–Frederick Inc., National Cancer Institute–Frederick, Frederick, MD 21702-1201
(g) Invitrogen Corporation, 1600 Faraday Avenue, Carlsbad, CA 92008
(h) BD Biosciences CLONTECH, 1020 East Meadow Circle, Palo Alto, CA 94303
(i) Department of Molecular and Cell Biology and the Howard Hughes Medical Institute, University of California, Berkeley, CA 94720-3200
(j) University of Iowa, 451 Eckstein Medical Research Building, Iowa City, IA 52242
(k) Laboratory of Cell Biology, National Institute of Mental Health, NIH, Bethesda, MD 20892
(l) Genome Science Laboratory, RIKEN Genomic Science Laboratory, 2–1 Hirosawa, Wako, Saitama 351-0198, Japan
(m) The I.M.A.G.E. Consortium, Biology and Biotechnology Research Program, Lawrence Livermore National Laboratory, 7000 East Avenue, L448, Livermore, CA 94550
(n) Incyte Genomics, Inc., 3160 Porter Drive, Palo Alto, CA 94304
(o) Agencourt Bioscience Corporation, 100 Cummings Center, Suite 107J, Beverly, MA 01915
(p) Baylor Human Genome Sequencing Center, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030
(q) Institute for Systems Biology, 1441 North 34th Street, Seattle, WA 98103-8904
(r) NIH Intramural Sequencing Center, 8717 Grovemont Circle, Gaithersburg, MD 20877
(s) Stanford Human Genome Center, Department of Genetics, Stanford University School of Medicine, 975 California Ave, Palo Alto, CA 94304
(t) University of British Columbia Genome Sciences Centre, British Columbia Cancer Agency, 600 West 10th Avenue, Vancouver, BC, Canada V5Z 4E6
Data deposition: All MGC sequences have been deposited in the GenBank database (accession nos. can be found in Table 1, which is published as supporting information on the PNAS web site, www.pnas.org) and can be accessed through the MGC web site (http://mgc.nci.nih.gov).
Contributed by Francis S. Collins
(b) To whom correspondence should be addressed at: National Cancer Institute, 31 Center Drive, Room 10A07, Bethesda, MD 20892-2580. E-mail: rls@nih.gov.
MGC Program Team (letters in parentheses refer to author affiliations):
Scientific Leadership and Management: Robert L. Strausberg (a, b), Elise A. Feingold (c), Lynette H. Grouse (a), Jeffery G. Derge (d), Richard D. Klausner (a), and Francis S. Collins (c)
Bioinformatics for Clone Selection and Characterization and the MGC Web Site: Lukas Wagner (e), Carolyn M. Shenmen (e), Gregory D. Schuler (e), Stephen F. Altschul (e), Barry Zeeberg (e), Kenneth H. Buetow (f), and Carl F. Schaefer (f)
mRNA Preparation: Narayan K. Bhat (d) and Ralph F. Hopkins (d)
cDNA Library Preparation: Heather Jordan (g), Troy Moore (g), Steve I. Max (g), Jun Wang (g), Florence Hsieh (h), Luda Diatchenko (h), Kate Marusina (h), Andrew A. Farmer (h), Gerald M. Rubin (i), Ling Hong (i), Mark Stapleton (i), M. Bento Soares (j), Maria F. Bonaldo (j), Tom L. Casavant (j), Todd E. Scheetz (j), Michael J. Brownstein (k), Ted B. Usdin (k), Shiraki Toshiyuki (l), and Piero Carninci (l)
cDNA Clone Management: Christa Prange (m)
EST Sequencing: Sam S. Raha (n), Naomi A. Loquellano (n), Garrick J. Peters (n), Rick D. Abramson (n), Sara J. Mullahy (n), Stephanie A. Bosak (o), Paul J. McEwan (o), Kevin J. McKernan (o), and Joel A. Malek (o)
cDNA Full-Insert Sequencing: Preethi H. Gunaratne (p), Stephen Richards (p), Kim C. Worley (p), Sarah Hale (p), Angela M. Garcia (p), Laura J. Gay (p), Stephen W. Hulyk (p), Debbie K. Villalon (p), Donna M. Muzny (p), Erica J. Sodergren (p), Xiuhua Lu (p), Richard A. Gibbs (p), Jessica Fahey (q), Erin Helton (q), Mark Ketteman (q), Anuradha Madan (q), Stephanie Rodrigues (q), Amy Sanchez (q), Michelle Whiting (q), Anup Madan (q), Alice C. Young (r), Yuriy Shevchenko (r), Gerard G. Bouffard (r), Robert W. Blakesley (r), Jeffrey W. Touchman (r), Eric D. Green (r), Mark C. Dickson (s), Alex C. Rodriguez (s), Jane Grimwood (s), Jeremy Schmutz (s), Richard M. Myers (s), Yaron S. N. Butterfield (t), Martin I. Krzywinski (t), Ursula Skalska (t), Duane E. Smailus (t), Angelique Schnerch (t), Jacqueline E. Schein (t), Steven J. M. Jones (t), and Marco A. Marra (t)
ISSN: 0027-8424, 1091-6490
DOI: 10.1073/pnas.242603899