Evolution of protein sequences and structures

J Mol Biol. 1999 Aug 27;291(4):977-95. doi: 10.1006/jmbi.1999.2972.

Abstract

The relationship between sequence similarity and structural similarity has been examined in 36 protein families with five or more diverse members whose structures are known. The structural similarity within a family (as determined with the DALI structure comparison program) is linearly related to sequence similarity (as determined by a Smith-Waterman search of the protein sequences in the structure database). The correlation between structural similarity and sequence similarity is very high; 18 of the 36 families had linear correlation coefficients r ≥ 0.878, and only nine had correlation coefficients r ≤ 0.815. Inclusion of higher-order terms in the structure/sequence relationship improved the fit by less than 7% in 27 of the 36 families. Differences in sequence/structure correlations are distributed evenly among the four protein structural classes, alpha, beta, alpha/beta, and alpha+beta. While most protein families show high correlations between sequence similarity and structural similarity, the amount of structural change per sequence change, i.e. the structural mutation sensitivity, varies almost fourfold. Protein families with high and low structural mutation sensitivity are distributed evenly among protein structure classes. In addition, we did not detect strong correlations between structural mutation sensitivity and either protein family mutation rates or protein size. Our results are more consistent with models of protein structure that encode a protein family's fold throughout the protein sequence, and not just in a few critical residues.
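The per-family analysis described in the abstract can be illustrated with a minimal sketch: an ordinary least-squares line through (sequence similarity, structural similarity) pairs, together with the linear correlation coefficient r. This is not the authors' code. The function name `linear_fit` and the score values are hypothetical, made up for illustration. Reading the fitted slope as the "structural mutation sensitivity" is likewise an assumption based on the abstract's phrase "structural change per sequence change".

```python
from math import sqrt


def linear_fit(x, y):
    """Ordinary least-squares line y = slope * x + intercept, plus Pearson r."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squared deviations and of cross-products about the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("both samples must vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r


# Hypothetical similarity scores for one family (NOT data from the paper).
# x: sequence similarity score, y: structural similarity score.
sequence_similarity = [100.0, 200.0, 300.0, 400.0, 500.0]
structural_similarity = [12.0, 19.0, 31.0, 42.0, 48.0]

slope, intercept, r = linear_fit(sequence_similarity, structural_similarity)
print(f"slope={slope:.4f} intercept={intercept:.2f} r={r:.3f}")
```

Under these assumptions, comparing slopes across families would correspond to the roughly fourfold variation reported above, while r measures how well a straight line describes each family.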

Publication types

  • Comparative Study
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Amino Acid Sequence
  • Animals
  • Evolution, Molecular*
  • Mutation
  • Protein Conformation
  • Protein Folding
  • Proteins / chemistry*
  • Proteins / classification
  • Proteins / genetics*
  • Regression Analysis
  • Sequence Homology, Amino Acid

Substances

  • Proteins