An alternative view of protein fold space

Bibliographic Details
Published in: Proteins, structure, function, and bioinformatics Vol. 38; no. 3; pp. 247-260
Main Authors: Shindyalov, Ilya N., Bourne, Philip E.
Format: Journal Article
Language:English
Published: New York John Wiley & Sons, Inc 15-02-2000
Description
Summary: Comparing and subsequently classifying protein structure information has received significant attention concurrent with the increase in the number of experimentally derived 3‐dimensional structures. Classification schemes have focused on biological function found within protein domains and on structure classification based on topology. Here an alternative view is presented that groups substructures. Substructures are long (50–150 residue), highly repetitive, near‐contiguous pieces of polypeptide chain that occur frequently in a set of proteins from the PDB defined as structurally non‐redundant over the complete polypeptide chain. The substructure classification is based on a previously reported Combinatorial Extension (CE) algorithm that provides a significantly different set of structure alignments than those previously described, having, for example, only a 40% overlap with FSSP. Qualitatively, the algorithm provides longer contiguous aligned segments at the price of a slightly higher root‐mean‐square deviation (rmsd). Clustering these alignments gives a discrete and highly repetitive set of substructures not detectable by sequence similarity alone. In some cases different substructures represent all or different parts of well‐known folds, indicative of the Russian doll effect, the continuity of protein fold space. In other cases they fall into different structure and functional classifications. It is too early to determine whether these newly classified substructures represent new insights into the evolution of a structural framework important to many proteins. What is apparent from on‐going work is that these substructures have the potential to be useful probes in finding remote sequence homology and in structure prediction studies.
The characteristics of the complete all‐by‐all comparison of the polypeptide chains present in the PDB and details of the filtering procedure by pair‐wise structure alignment that led to the emergent substructure gallery are discussed. Substructure classification, alignments, and tools to analyze them are available at http://cl.sdsc.edu/ce.html. Proteins 2000;38:247–260. © 2000 Wiley‐Liss, Inc.
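The summary weighs longer aligned segments against a slightly higher rmsd. As a reminder of what that quantity measures, the following is a minimal sketch of the rmsd over paired coordinates. It is not the CE algorithm and is not taken from the article. The function name `rmsd` and the input layout (equal-length lists of `(x, y, z)` tuples, one per aligned residue) are assumptions made for illustration, and the two coordinate sets are assumed to be already superimposed.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two paired coordinate lists.

    Hypothetical helper for illustration only: each argument is a list of
    (x, y, z) tuples, one per aligned residue, and the two structures are
    assumed to be superimposed already (no rotation or translation is done).
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")

    # Sum of squared distances between each aligned pair of positions.
    total = 0.0
    for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b):
        total += (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2

    # Mean over the aligned positions, then square root.
    return math.sqrt(total / len(coords_a))
```

Because the mean is taken over all aligned positions, extending an alignment into less similar regions adds larger distance terms and can raise the rmsd, which is the trade-off the summary describes qualitatively.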
Bibliography:ark:/67375/WNG-CNWHN8WJ-T
The substructure gallery specified in this article can be reviewed at http://cl.sdsc.edu/ce.html/.
ArticleID:PROT2
istex:7958B75106EBA78197A9A7E16BEBB94DDC4FBC8D
National Science Foundation - No. DBI 9630339; No. DBI 9808706
ISSN: 0887-3585, 1097-0134
DOI:10.1002/(SICI)1097-0134(20000215)38:3<247::AID-PROT2>3.0.CO;2-T