Annals of Combinatorics 3 (1999) 431-450

Strategies for Protein Folding and Design

Cristian Micheletti1, Flavio Seno2, Amos Maritan1, and Jayanth R. Banavar3

1The Abdus Salam Centre for Theoretical Physics, INFM-International School for Advanced Studies (S.I.S.S.A.), Via Beirut 2-4, 34014 Trieste, Italy
{michelet, maritan}

2INFM-Dipartimento di Fisica, Università di Padova, Via Marzolo 8, 35131 Padova, Italy

3Department of Physics and Center for Materials Physics, 104 Davey Laboratory, The Pennsylvania State University, University Park, Pennsylvania 16802, USA

Received November 10, 1998

AMS Subject Classification: 82D60, 82B20, 82B30, 82B80

Abstract. Fundamental challenges in molecular biology can be addressed by using simple models on a lattice, where statistical mechanics and combinatorial techniques can be employed. The basic premise is that it is sensible to test any proposed method on the simplest of models, in order to assess its validity, before launching a full-scale attack on realistic problems. In this paper we follow this strategy and present several efficient schemes to perform protein design and to extract effective amino acid interaction potentials.

Keywords: protein folding, protein design, lattice models, exact enumerations


