Algorithm discovery by protein folding game players

Khatib, F., Cooper S., Tyka M. D., Xu K., Makedon I., Popovic Z., et al. Proc Natl Acad Sci USA (2011)

To determine whether high-performing Foldit player strategies could be collectively codified, we augmented the Foldit gameplay mechanics with tools for players to encode their folding strategies as “recipes” and to share their recipes with other players, who are able to further modify and redistribute them. Players developed over 5,400 different recipes, and two of the recipes became particularly dominant. Examination of the algorithms encoded in these two recipes revealed a striking similarity to an unpublished algorithm developed by scientists over the same period. Benchmark calculations show that the new algorithm independently discovered by scientists and by Foldit players outperforms previously published methods. Thus, online scientific game frameworks have the potential not only to solve hard scientific problems, but also to discover and formalize effective new strategies and algorithms.

Crystal structure of a monomeric retroviral protease solved by protein folding game players

Khatib, F., DiMaio F., Foldit Contenders Group, Foldit Void Crushers Group, et al. Nature Structural & Molecular Biology (2011)

Following the failure of a wide range of attempts to solve the crystal structure of the Mason-Pfizer monkey virus (M-PMV) retroviral protease by molecular replacement, we challenged players of the protein folding game Foldit to produce accurate models of the protein. Remarkably, Foldit players were able to generate models of sufficient quality for successful molecular replacement and subsequent structure determination. This is the first example we are aware of in which non-scientists have solved a long-standing scientific problem.




Computational design of protein inhibitors of Spanish and avian flu hemagglutinin

Fleishman, S. J., Whitehead T. A., et al. Science 332, 816-821. (2011)

We developed a new computational method for designing protein interactions and applied it to the design of anti-flu hemagglutinin inhibitors. The proteins were designed using computational resources generously provided by Rosetta@home participants and inhibited the function of the hemagglutinin flu coat protein that is crucial for viral infectivity. An experimentally determined molecular structure of the designed protein interacting with Spanish flu hemagglutinin (Figure) shows an unprecedented level of agreement between model and experiment for a designed interface.

Improved molecular replacement by density- and energy-guided protein structure optimization

DiMaio, F., Terwilliger T. C et al. Nature (2011)

We show that the crystallographic phase problem can be solved using distant evolutionary relationships by combining algorithms for protein structure modelling with those developed for crystallographic structure determination. Integrating Rosetta structure modelling with Autobuild chain tracing yielded high-resolution structures for 8 of 13 X-ray diffraction data sets that could not be solved in the laboratories of expert crystallographers and that remained unsolved after application of an extensive array of alternative approaches. We estimate that the new method should allow rapid structure determination without experimental phase information for over half the cases where current methods fail, given diffraction data sets of better than 3.2 Å resolution, four or fewer copies in the asymmetric unit, and the availability of structures of homologous proteins with >20% sequence identity.

Determining large protein structures (>200 amino acids) from limited NMR data using Rosetta

Raman, Lange et al, Science, 327, 1014-8. (2010) 

Large protein structures can now be determined by incorporating backbone-only NMR data into Rosetta. Shown here is the structural comparison of ALG13 (201 amino acids) determined (A) computationally using RDCs and backbone NH-NH NOEs and (B) experimentally by conventional NMR methods (PDB ID: 2jzc).

Nuclear Magnetic Resonance (NMR) is a powerful method for determining protein structures in the physiologically relevant solution state. Chemical shift, which is a unique signature of a protein atom’s microenvironment, is required for all backbone and sidechain atoms to determine the NMR structure by conventional methods. While backbone chemical shifts can be assigned relatively quickly and in an automated fashion, assigning sidechain chemical shifts can be significantly more complex, time-consuming and expensive. As proteins get larger, the NMR spectrum becomes extremely crowded, making it virtually impossible to assign.

Computational design and optimization of Kemp eliminases

Rothlisberger D., Khersonsky O. et al, Nature, 453, 164-6. (2008)

Khersonsky O., Rothlisberger D. et al, J Mol Biol, 396(4), 1025-42. (2010)

We designed several enzymes that catalyze Kemp elimination, a model reaction for proton transfer from carbon. These enzymes utilize two catalytic motifs and enhance the reaction rate up to 10⁵-fold with multiple turnovers. Application of in vitro evolution to enhance the computational designs KE07, KE70, and KE59 produced up to 2000-fold increases in catalytic activity, with kcat/Km values reaching 10⁵-10⁶ M⁻¹ s⁻¹.

Computational design of an enzyme catalyst for a stereoselective bimolecular Diels-Alder reaction

Siegel and Zanghellini et al, Science, 329, 309-313. (2010)

Structure of a designed Diels-Alderase. (A) Surface view of the design model (DA_20_00, green) bound to the substrates (diene and dienophile, purple). The catalytic residues making the designed hydrogen bonds are depicted as sticks. (B) Overlay of the design model (DA_20_00, brown) and crystal structure of DA_20_00_A74I (green). The crystal structure shows that the design was accurate to the atomic level (0.5 Å all-atom RMSD).
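
For readers unfamiliar with the all-atom RMSD quoted above, the following is a minimal sketch of how such a value is computed from two sets of atomic coordinates. It assumes the atoms are matched by index and that the two structures have already been superimposed; the function name and the toy coordinates are hypothetical and are not taken from the paper.

```python
import math


def all_atom_rmsd(model, crystal):
    """Root-mean-square deviation (in Å) between two lists of (x, y, z) atoms.

    Assumption: atom i in `model` corresponds to atom i in `crystal`, and the
    structures are already superimposed (no fitting is done here).
    """
    if len(model) != len(crystal):
        raise ValueError("coordinate lists must have the same length")
    if not model:
        raise ValueError("need at least one atom")
    total = 0.0
    for (x1, y1, z1), (x2, y2, z2) in zip(model, crystal):
        # Squared distance between the matched pair of atoms.
        total += (x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
    return math.sqrt(total / len(model))


# Hypothetical toy coordinates, in Å, for three matched atoms.
model = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
crystal = [(0.3, 0.0, 0.0), (1.5, 0.4, 0.0), (1.5, 1.5, 0.5)]
print(all_atom_rmsd(model, crystal))
```

In practice the two structures are first optimally superimposed, so the reported RMSD reflects differences in shape rather than differences in position or orientation.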

Structure prediction problems solved by Foldit players

Cooper, Khatib et al, Nature, 466, 756-60. (2010)

Examples of blind structure prediction problems in which players were successfully able to improve structures. Native structures are shown in blue, starting puzzles in red, and top scoring Foldit predictions in green.

(a) The red starting puzzle had a register shift and the top scoring green Foldit prediction correctly flips and slides the beta strand.

(b) On the same structure as above, Foldit players correctly buried an exposed Isoleucine in the loop on the bottom right by remodeling the loop backbone.

Exploitation of binding energy for catalysis and design

Thyme, Summer B., et al. Nature 461, 1300-4. (2009)

The monomeric homing endonuclease I-AniI cleaves with high sequence specificity in the center of a 20 base-pair DNA target site, with the N-terminal domain of the enzyme making extensive binding interactions with the left (-) side of the target site and the similarly structured C-terminal domain interacting with the right (+) side. Despite the approximate two-fold symmetry of the enzyme-DNA complex, we find that there is almost complete segregation of interactions responsible for substrate binding to the (-) side of the interface and interactions responsible for transition state stabilization to the (+) side.

De novo computational design of retro-aldol enzymes

Jiang, Lin, Althoff Eric A. et al, Science, 319, 1387-91. (2008)

Using new algorithms that employ hashing techniques to construct active sites for multi-step reactions, we designed retro-aldolases that employ four different catalytic motifs to catalyze the breaking of a carbon-carbon bond in a non-natural substrate. Designs that utilized an explicit water molecule to mediate proton shuffling were significantly more successful, with rate accelerations of up to four orders of magnitude and multiple turnovers, than those involving charged sidechain networks. The atomic accuracy of the design process was confirmed by the X-ray crystal structure of active designs embedded in two protein scaffolds, both of which were nearly superimposable on the design model.