Using the CATH domain database to assign structures and functions to the genome sequences

Bibliographic Details
Published in: Biochemical Society Transactions, Vol. 28, no. 2, p. 269
Main Authors: Pearl, F, Todd, A E, Bray, J E, Martin, A C, Salamov, A A, Suwa, M, Swindells, M B, Thornton, J M, Orengo, C A
Format: Journal Article
Language: English
Published: England 01-02-2000
Description
Summary: The CATH database of protein structures contains approximately 18000 domains organized according to their (C)lass, (A)rchitecture, (T)opology and (H)omologous superfamily. Relationships between evolutionarily related structures (homologues) within the database have been used to test the sensitivity of various sequence search methods for identifying relatives in GenBank and other sequence databases. Subsequent application of the most sensitive and efficient algorithms, gapped BLAST and the profile-based method Position-Specific Iterated Basic Local Alignment Search Tool (PSI-BLAST), could be used to assign structural data to between 22 and 36% of microbial genomes, in order to improve functional annotation and enhance understanding of biological mechanism. However, on a cautionary note, an analysis of functional conservation within fold groups and homologous superfamilies in the CATH database revealed that, whilst function was conserved in nearly 55% of enzyme families, function had diverged considerably in some highly populated families. In these families, functional properties should be inherited far more cautiously, and the probable effects of substitutions in key functional residues carefully assessed.
ISSN: 0300-5127
DOI: 10.1042/bst0280269