Simplified Normal Mode Analysis of Conformational Transitions in DNA-dependent Polymerases: the Elastic Network Model

https://doi.org/10.1016/S0022-2836(02)00562-4

Abstract

The Elastic Network Model is used to investigate the open/closed transition in all DNA-dependent polymerases whose structure is known in both forms. For each structure the model accounts well for experimental crystallographic B-factors. It is found in all cases that the transition can be well described with just a handful of the normal modes. Usually, only the lowest and/or the second lowest frequency normal modes deduced from the open form give rise to calculated displacement vectors that have a correlation coefficient larger than 0.50 with the observed difference vectors between the two forms. This is true for every structural class of DNA-dependent polymerases where a direct comparison with experimental structural data is available. In cases where only one form has been observed by X-ray crystallography, it is possible to make predictions concerning the possible existence of another form in solution by carefully examining the vector displacements predicted for the lowest frequency normal modes. This simple model, which has the advantage of being computationally inexpensive, could be used to design a novel kind of drug directed against polymerases, namely drugs that prevent the open/closed transition from occurring in bacterial or viral DNA-dependent polymerases.

Introduction

DNA replication, transcription and maintenance are essential functions in all forms of life. All these tasks are performed, at least for the polynucleotide synthesis part, by DNA-dependent polymerases, which have been characterized in all kingdoms of life.1

Following the first structure determination of a DNA polymerase,2 a large body of structural and functional data has been accumulated in the last decade on this otherwise very diverse superfamily of proteins. In particular, an avalanche of new polymerase structures has appeared recently, from which it has been possible to identify a few unifying principles.3

One of them is that the number of different folds adopted by the different subfamilies identified through sequence alignments4 is smaller than previously thought. For instance, the catalytic domains of the pol A (pol I) and pol B (pol α) families of DNA-dependent DNA polymerases were found by X-ray analysis to adopt the same fold,5 as predicted earlier by sequence motif alignment.6 Similarly, single subunit DNA-dependent RNA polymerases were also predicted to adopt the pol I fold6 and this prediction was later confirmed by crystallography.7 However, the catalytic domain of human DNA polymerase β (pol β) adopts a completely different fold.8 Multisubunit RNA polymerases also have a different and more complicated fold and architecture.9

Even though the folds differ, the overall architecture of these polymerases can in every case be described with the hand metaphor.2,3 The structural organization of DNA polymerases is highly modular, made up of several domains described as the palm, fingers and thumb domains. The catalytic site is on the surface of the palm domain.

At the atomic level, their active sites can all be described with the so-called two-metal-ions mechanism.10 They all contain a few strictly conserved and crucial aspartate residues arranged in space in a similar manner to coordinate two metal ions that will (i) activate the 3′OH of the primer strand to be elongated and (ii) assist in the departure of the PPi moiety of the incoming dNTP. This is true for the pol α and pol I structural class, the pol β structural class as well as for multisubunit RNA polymerases, even though their catalytic (palm) domains all have different folds.

Finally, at least one member of each of these classes has been found in two different forms, the open and closed forms.11,12 The existence of the two forms was discovered when ternary complexes could be crystallized, some of them displaying catalytic activity in the crystal state.8,13,14,15,16 The closed form allows for an intimate “grip” of the enzyme on its template substrate, while the open form is necessary as a relaxed form in order for translocation to occur. For processive enzymes, the protein is expected to cycle between these two forms.

All of these points are also valid for RNA-dependent polymerases, as determined by X-ray crystallography. They all adopt the pol I fold, as predicted earlier by sequence analysis.6,17 However, we will leave aside reverse transcriptases in this article, since a number of recent studies have already been devoted to a dynamical transition in reverse transcriptase, either by molecular dynamics18,19 or by normal mode analysis.20 We will touch upon RNA-dependent RNA polymerases only to make predictions, as only one form of these polymerases is known so far, and will concentrate mainly on DNA-dependent polymerases, for which there are several examples of open and closed conformations.

The transition between the open and closed forms of polymerases is usually associated with the limiting step of the polymerization reaction, which is described in terms of an induced fit transition.21 Indeed, this transition is thought to occur only if the incoming dNTP is complementary to the base being copied. This step is therefore directly associated with the fidelity of the copying reaction.22,23

Recently, two new structures of polymerases belonging to the so-called pol Y family24 have been solved. This family is able to bypass lesions on the DNA (hence the name translesion polymerases) and is involved in DNA repair mechanisms.25 By virtue of its function, this family replicates DNA in an error-prone fashion. One of the recently solved structures is the yeast pol η polymerase26 and the other is the archaebacterial analogue of the Escherichia coli dinB protein.27,28 Both were found in the closed conformation, even though they were crystallized in their unliganded form. The general interpretation of this phenomenon is that polymerases with low fidelity do not display the open/closed transition, which is required only for high-fidelity replication.22

An additional line of evidence in favour of this interpretation can be found in the recent structure of murine terminal deoxynucleotidyltransferase (TdT).29 TdT, whose sequence is closely related to that of pol μ, a newly discovered human polymerase thought to be involved in hypermutagenesis,30 turns out to be structurally highly similar to pol β, which is itself involved in DNA repair. TdT can be considered an extreme case of an error-prone polymerase since it is a template-independent polymerase, which adds random nucleotides to the N regions of V(D)J junctions of immunoglobulin genes, thereby contributing to the diversity of the immune response. Again, TdT was found to resemble most closely the closed form of pol β, even in the unliganded state. A detailed analysis of the structure suggests that TdT may be permanently locked in the closed form in solution.29

Pol β actually stands as a representative member of an entire family comprising pol X type polymerases (pol β and pol λ), template independent polymerases (pol μ and TdT, poly (A) polymerase, oligo(A) 2′-5′ polymerase) as well as several other nucleotidyltransferases.31

It seems clear that the open/closed transition is of utmost functional significance in polymerases. Several questions can be raised: is this transition encoded in the structure itself? Is it an intrinsic property, a natural tendency of this mechanical system? Can it be predicted from the structure alone?

We have sought further clarification of this conformational transition by studying the normal modes calculated from both the open and closed forms, in all cases where they are available. The reason for using Normal Mode Analysis (NMA) is that we are looking for large amplitude movements that are unlikely to occur on the time scales accessible to the molecular dynamics method.32,33,34,35 Instead, the lowest frequency modes of NMA should give some clues about the correlated movements likely to occur in these large proteins. Because they are highly collective movements, the lowest frequency modes are the ones with the largest amplitudes at a given temperature (and energy).
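The link between low frequency and large amplitude follows from equipartition in the harmonic approximation. As a brief reminder of this standard result (the notation is introduced here for illustration and is not taken from the paper: $q_k$ is the mass-weighted coordinate along mode $k$, $\omega_k$ its angular frequency, $k_B$ Boltzmann's constant and $T$ the temperature):

```latex
\frac{1}{2}\,\omega_k^{2}\,\langle q_k^{2}\rangle = \frac{1}{2}\,k_B T
\quad\Longrightarrow\quad
\langle q_k^{2}\rangle = \frac{k_B T}{\omega_k^{2}}
```

At fixed temperature the mean-square amplitude of a mode therefore scales as the inverse square of its frequency, so the few lowest frequency modes dominate the overall fluctuations.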

Whenever structural information is available both for the open and the closed forms, it is possible to do a direct comparison between the movements predicted by the theory for the different modes and the difference vectors between the closed and open forms.
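The comparison described above can be sketched numerically. The snippet below is a minimal illustration, not the authors' code: it assumes that the two structures are already superimposed and list the same Cα atoms in the same order, and the function names are hypothetical. Two similarity measures are shown, the normalised projection (overlap) of a mode onto the difference vector and the Pearson correlation between per-residue displacement magnitudes; which exact definition underlies the correlation coefficients quoted in this paper is not stated in this section.

```python
import numpy as np

def difference_vector(open_xyz, closed_xyz):
    """Per-residue displacement (N, 3) from the open to the closed form.
    Assumes both arrays are superimposed and list the same C-alpha atoms."""
    return np.asarray(closed_xyz, float) - np.asarray(open_xyz, float)

def mode_overlap(mode, diff):
    """Absolute cosine between a mode (N, 3) and the difference vector (N, 3)."""
    m, d = np.ravel(mode), np.ravel(diff)
    return abs(m @ d) / (np.linalg.norm(m) * np.linalg.norm(d))

def amplitude_correlation(mode, diff):
    """Pearson correlation between per-residue displacement magnitudes."""
    a = np.linalg.norm(mode, axis=1)
    b = np.linalg.norm(diff, axis=1)
    return np.corrcoef(a, b)[0, 1]
```

The absolute value in `mode_overlap` reflects the fact that the sign of a normal mode is arbitrary: a mode and its negative describe the same oscillation.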

Here, we show that the open/closed transition can be mapped to a handful of the lowest frequency normal modes determined directly from the structure of the open form, using a simple and computationally inexpensive method borrowed from the field of structural mechanics.36 The method, which essentially treats all atoms as linked by springs, i.e. as experiencing a harmonic potential that brings them back to their equilibrium positions, was simplified by Hinsen,37,38 who showed that it was possible to use only Cα coordinates instead of all atoms. Bahar and colleagues39,40 also developed the method along slightly different lines, involving the inversion of a Kirchhoff contact matrix.
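To make the spring picture concrete, the following sketch builds the second-derivative (Hessian) matrix of a Cα-only spring network and diagonalizes it. It is one common variant, given here under stated assumptions and not as the implementation used in this work: identical springs connect all Cα pairs closer than a cutoff, all masses are equal, and the values of `cutoff` and `k` are illustrative only. Hinsen's formulation, for instance, uses distance-dependent force constants instead of a sharp cutoff.

```python
import numpy as np

def enm_hessian(xyz, cutoff=10.0, k=1.0):
    """3N x 3N Hessian of a spring network built on C-alpha coordinates.
    cutoff (in Angstrom) and k are illustrative values, not taken from the paper."""
    xyz = np.asarray(xyz, float)
    n = len(xyz)
    hess = np.zeros((3 * n, 3 * n))
    for i in range(n):
        for j in range(i + 1, n):
            d = xyz[j] - xyz[i]
            r2 = d @ d
            if r2 > cutoff ** 2:
                continue  # no spring between distant residues
            block = -k * np.outer(d, d) / r2  # off-diagonal 3x3 block
            hess[3*i:3*i+3, 3*j:3*j+3] = block
            hess[3*j:3*j+3, 3*i:3*i+3] = block
            hess[3*i:3*i+3, 3*i:3*i+3] -= block
            hess[3*j:3*j+3, 3*j:3*j+3] -= block
    return hess

def normal_modes(hess, n_rigid=6):
    """Eigenvalues (squared frequencies) and modes, skipping rigid-body modes."""
    vals, vecs = np.linalg.eigh(hess)
    return vals[n_rigid:], vecs[:, n_rigid:]
```

Because the input structure is taken as the energy minimum, no minimization step is needed: the six zero eigenvalues correspond to rigid-body translations and rotations, and the remaining eigenvectors, sorted by increasing eigenvalue, are the internal modes.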

The lowest frequency modes calculated by this method (the Elastic Network Model) can be shown to be sufficient to explain the open/closed transition in all cases where both the open and closed forms are known, in the different structural classes of pol I and pol α, pol β and multisubunit RNA polymerases. The latter case, which concerns very large proteins of about 3500 residues,9,41 was made possible by a recent and even more drastic simplification of the method, which was recently shown to perform well in test cases42,43 and which consists of grouping residues into super particles.44,45 Normal mode analysis of systems of such a large size would simply be impossible with currently available computers using standard methods, or would require large amounts of CPU time on supercomputers using iterative methods such as DIMB.46
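One possible reading of the grouping step is sketched below, purely as an illustration: consecutive Cα atoms are merged into a single super particle placed at their centroid. The function name, the centroid choice and the value of `group_size` are assumptions made here; the cited works may define the super particles differently (for example by retaining one Cα atom out of every few residues).

```python
import numpy as np

def group_into_super_particles(xyz, group_size=10):
    """Replace each run of `group_size` consecutive C-alpha atoms by its centroid.
    group_size is an illustrative choice, not a value taken from the paper."""
    xyz = np.asarray(xyz, float)
    starts = range(0, len(xyz), group_size)
    return np.array([xyz[s:s + group_size].mean(axis=0) for s in starts])
```

The resulting coordinates can be fed to the same spring-network construction as the full Cα trace, which shrinks the matrix to be diagonalized by a factor of `group_size` in each dimension.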

Although this method uses very grossly approximated energy functions (for instance, there is no explicit account of the solvent), it has the advantage of computational speed, allowing different hypotheses to be tested quickly. Also, it does not require energy minimization of the X-ray structure prior to normal mode calculations since it assumes that the X-ray structure is the minimum of the energy function. The method can also be used to make predictions in cases where only one form is known, as in the case of the RNA-dependent RNA polymerase of hepatitis C virus.

Correlation between the calculated and the observed B-factors.

To test the validity of the Elastic Network Model, the experimental crystallographic B-factors were compared to the mean-square displacement values of each of the Cα atoms, based on a subset of the 100 lowest frequency normal modes. This is illustrated in Figure 1 for pol α, as a function of residue number, and in Table 1 for all known structures of DNA-dependent polymerases. In most cases, the correlation coefficient between these two quantities is around 0.5 (see Table 1). This compares
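The comparison between calculated and observed B-factors can be sketched as follows. This is a minimal illustration under explicit assumptions, not the authors' procedure: all masses are equal, `kT` only sets an overall scale (which does not affect a correlation coefficient), and the function names are hypothetical. It uses the standard relation B = (8π²/3)⟨Δr²⟩, with each mode contributing to ⟨Δr²⟩ in inverse proportion to its eigenvalue.

```python
import numpy as np

def bfactors_from_modes(eigvals, eigvecs, kT=1.0):
    """Per-residue B-factors (arbitrary units) from non-rigid-body modes.
    eigvals: (M,) squared frequencies; eigvecs: (3N, M) mode vectors."""
    eigvals = np.asarray(eigvals, float)
    eigvecs = np.asarray(eigvecs, float)
    n_res = eigvecs.shape[0] // 3
    # Sum over modes of squared x, y, z components, each divided by its eigenvalue
    sq = (eigvecs ** 2 / eigvals).reshape(n_res, 3, -1).sum(axis=(1, 2))
    return (8.0 * np.pi ** 2 / 3.0) * kT * sq

def bfactor_correlation(calc, obs):
    """Pearson correlation coefficient between calculated and observed B-factors."""
    return np.corrcoef(calc, obs)[0, 1]
```

Restricting `eigvals` and `eigvecs` to the lowest frequency modes (the text uses the 100 lowest) changes the result little, since the contribution of each mode falls off as the inverse of its eigenvalue.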

Why does the simplified NMA method work so well?

It is surprising that such a crude and low-resolution model gives good results for the prediction of large amplitude movements in polymerases.

Neither stereochemistry, nor the excluded volume effect, nor electrostatics is taken into account in this model. Rather, the protein is modelled as a solid at zero temperature, where solvent effects are effectively ignored. The range of frequencies predicted by the model spans the range of acoustic modes in a solid and agrees with experimental values

Materials and Methods

The PDB coordinates of the closed and open forms of DNA-dependent polymerases are the following: 1BPX and 1BPY for pol β16 (pol X family); 2KTQ and 3KTQ for Taq pol I15 (pol A family); 1IH7 and 1IG9 for phage RB69 polymerase, a representative member of the pol α family5,48 (pol B family); 1I50 and 1I6H for yeast multisubunit RNA polymerase II9,41; 1CEZ and 1ARO for the monosubunit T7 RNA polymerase7,50 and 1HQM51 for T. thermophilus multisubunit RNA polymerase. The E. coli model fitted

Acknowledgements

We thank S. Bressanelli and F. A. Rey for communicating results prior to publication on HCV RNA polymerase complexed with various dNTPs.

This study was initiated in the framework of the G.D.R. “Simulation et Modelisation des molecules biologiques” (CNRS), coordinated by T. Simonson (IGBMC, Illkirch, France). Financial support from Association de la Recherche contre le Cancer (grant no. 5470 to M.D.) is gratefully acknowledged.

References (63)

  • I. Bahar et al. Direct evaluation of thermal fluctuations in proteins using a single-parameter harmonic potential. Fold. Design (1997)
  • D. Perahia et al. Computation of low-frequency normal modes in macromolecules: improvements of the method of diagonalization in a mixed basis and application to hemoglobin. Comput. Chem. (1995)
  • M.C. Franklin et al. Structure of the replicating complex of a Pol alpha family DNA polymerase. Cell (2001)
  • Y. Shamoo et al. Building a replisome from interacting pieces: sliding clamp complexed to a peptide from DNA polymerase and a polymerase editing complex. Cell (1999)
  • G. Zhang et al. Crystal structure of T. aquaticus RNA polymerase at 3.3 Å resolution. Cell (1999)
  • H. Ling et al. Crystal structure of a Y-family DNA polymerase in action: a mechanism for error-prone and lesion-bypass replication. Cell (2001)
  • S. Cusack et al. Temperature dependence of the low frequency dynamics of myoglobin. Biophys. J. (1990)
  • A. Kitao et al. Investigating protein dynamics in collective coordinate space. Curr. Opin. Struct. Biol. (1999)
  • D. Ollis et al. Structure of the large fragment of E. coli DNA polymerase I complexed with TMP. Nature (1985)
  • J. Ito et al. Compilation and alignment of DNA polymerase sequences. Nucl. Acids Res. (1991)
  • J. Wang et al. Crystal structure of a pol alpha family replication DNA polymerase from bacteriophage RB69. Cell (1999)
  • M. Delarue et al. An attempt to unify the structure of polymerases. Protein Eng. (1990)
  • R. Sousa et al. The crystal structure of bacteriophage T7 RNA polymerase at 3.3 Å resolution. Nature (1993)
  • H. Pelletier et al. Structures of ternary complexes of rat DNA polymerase β, a DNA template primer and ddCTP. Science (1994)
  • P. Cramer et al. Structural basis of transcription: RNA polymerase II at 2.8 Å resolution. Science (2001)
  • T.A. Steitz et al. A general two-metal ions mechanism for catalytic RNA. Proc. Natl Acad. Sci. USA (1993)
  • P.H. Patel et al. Getting a grip on how polymerases function. Nature Struct. Biol. (2001)
  • S. Doublié et al. Crystal structure of a bacteriophage T7 DNA replication complex at 2.2 Å resolution. Nature (1998)
  • J.R. Kiefer et al. Visualizing DNA replication in a catalytically active Bacillus DNA polymerase crystal. Nature (1998)
  • Y. Li et al. Crystal structures of open and closed forms of binary and ternary complexes of the large fragment of T. aquaticus DNA polymerase I: structural basis for nucleotide incorporation. EMBO J. (1998)
  • M.R. Sawaya et al. Crystal structures of DNA polymerase beta complexed with gapped and nicked DNA: evidence for an induced fit mechanism. Biochemistry (1997)