Macromolecular Dynamics and Interactions Overview
Project Directors
Robert Jernigan (ISU) - Director, Laurence H. Baker Center for Bioinformatics and Biological Statistics
Jing He (NMSU) - Computer Science
Summary. Virtually all cellular processes depend on precisely orchestrated interactions mediated by proteins. Understanding a protein’s function requires not only detailed structural information, but also knowledge of the proteins, nucleic acids and/or other ligands with which it interacts in the context of the cell. Despite progress in structural genomics, high-resolution structural information is available for only a small fraction of the proteins known to play critical roles in signal transduction, genetic, and metabolic networks. Even fewer molecular structures are available for macromolecular complexes in which the atomic details of protein-protein, protein-nucleic acid or protein-small ligand interfaces have been elucidated. CMB faculty are integrating computational prediction and experimental structure determination to gain new insights into how proteins and macromolecular complexes function. At ISU, interdisciplinary teams are developing and evaluating novel algorithms for structural threading applicable to genome-scale protein structure prediction, algorithms for knowledge-based prediction of interface residues in protein-protein and protein-nucleic acid complexes, and algorithms for identifying sequence correlates of structural/functional features in proteins. These computational approaches both inform the design of, and benefit from the results of, experimental approaches for direct determination of macromolecular structures and interrogation of complex interfaces using NMR, X-ray crystallography and mass spectrometry. Scientists at NMSU are bridging the gap between the computational and experimental methods for structure prediction taking place at ISU. They are developing methods to predict structures of protein complexes by combining primary sequence with intermediate-resolution structural data obtained through electron cryo-microscopy. The research strengths of scientists from both institutions are therefore complementary and synergistic.
Computational modeling to gain insights into protein structure and function. The lentivirus subfamily of retroviruses includes several important pathogens of humans and domestic animals, including HIV-1, the causative agent of AIDS. All lentiviruses encode a regulatory protein, Rev, that facilitates export of incompletely spliced viral RNAs from the nucleus to the cytoplasm. ISU CMB faculty members are using a comparative approach, integrating computational and experimental tools, to investigate the structure and function of Rev (Drena Dobbs, Kai-Ming Ho, Amy Andreotti and Edward Yu - ISU).
Ho's group has developed a novel protein threading algorithm that can detect structural similarities in proteins, even when sequence identity is less than 10%. Application of the algorithm to predict Rev structures from several different lentiviruses has, for the first time, revealed overall structural similarities in Rev proteins that were not anticipated based on comparison of their primary amino acid sequences. Specifically, the models predict that the Rev response element-interacting domain of all Rev proteins lies within a helix-loop-helix motif and suggest that this domain is stabilized as a four-helix bundle. Dobbs' group is introducing mutations in predicted structural features of Rev and testing their biological effects. In addition, efforts are underway to generate stable domains of Rev that are soluble at the concentrations required for NMR or X-ray crystallography. This integrated approach is anchored in computational, functional and structural analyses and has high potential for yielding significant insights into the molecular determinants of RNA recognition by Rev.
Prediction and analysis of protein-protein and protein-RNA interactions. The ability to identify protein-protein interaction sites and to detect specific amino acid residues that contribute to the specificity and affinity of protein interactions has important implications for problems ranging from rational drug design to analysis of metabolic and signal transduction networks. ISU CMB members Vasant Honavar, Dobbs and Jernigan are developing knowledge-based approaches for predicting functionally important residues in proteins. They have focused on prediction of amino acid residues that participate in protein-protein interactions, and more recently in protein-RNA interactions, using a variety of data-driven approaches and algorithms, such as naive Bayes and support vector machines. Currently, when trained and tested on disjoint data sets of known protein complexes, these approaches can classify interface versus non-interface residues in protein-protein complexes with 75% accuracy and a correlation coefficient of 0.38. Evaluation of similar approaches for predicting amino acid residues involved in specific RNA binding has recently shown even better performance, with 85% accuracy. Work in the laboratories of Dobbs and Andreotti is directed at evaluating these algorithms using "blind test" predictions. For example, RNA binding experiments have already confirmed that several predicted interface residues are indeed involved in the interaction of Rev proteins with their cognate RNA recognition sequences in lentiviral genomes. In addition, in collaboration with the Gloria Culver and Jernigan groups, manipulation of residues expected to alter protein-RNA interactions in the 30S ribosomal subunit (described below) will provide the opportunity to evaluate the utility of these approaches for identifying interface residues and for predicting the effects of mutations on the affinity and avidity of these interactions.
Bridging the resolution gap: automatic sequence mapping for intermediate resolution of macromolecular structures from electron cryo-microscopy. With recent advances in electron cryo-microscopy, protein density maps for large protein complexes, such as viruses, can readily be generated at intermediate resolution (6-9 Angstroms). Protein secondary structures, such as helices and beta-sheets, can be visualized and computationally located at this resolution range. However, methods used in X-ray crystallography for initial protein backbone determination are not directly applicable, since amino acid side chains in the protein are not typically distinguishable at this resolution. This makes it difficult to determine how the amino acid sequence folds within its protein density map, and significantly limits biological interpretation of the intermediate-resolution map. NMSU scientists Jing He, Peter Lammers and Desh Ranjan are collaborating with scientists at Sandia National Laboratories (Faulon) and the University of Texas (Zhou) to develop computational methods and tools to predict initial protein backbone models using the constraints from an intermediate-resolution protein density map. The constraints available in the density map include the geometrical description of the secondary structures, such as the length of helices, the size of beta-sheets, and the relative location and available connectivity between two secondary structure elements. The project involves structural feature extraction, mapping secondary structure features in the protein density map to its amino acid sequence using constraints from the protein density map, and density modeling.
Because the project requires knowledge from structure determination, constraint optimization, structure prediction and efficient algorithm design, it complements efforts at ISU in the prediction and direct determination of macromolecular structures, especially the work of Jernigan and Zhijun Wu (ISU), who are developing algorithms for refining NMR-determined protein structures using database-derived distance constraints.
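As a toy illustration of mapping helices detected in a density map onto helices predicted from the sequence, the sketch below uses only one of the constraints mentioned above, helix length. It converts each density helix length to an expected residue count using the ideal alpha-helix rise of 1.5 Angstroms per residue, then tries every assignment and keeps the one with the smallest total mismatch. The function name and the example numbers are hypothetical, and this is not the NMSU method: a realistic approach would also use the connectivity and relative location constraints and would avoid exhaustive enumeration.

```python
from itertools import permutations

RISE_PER_RESIDUE = 1.5  # Angstroms per residue along an ideal alpha-helix axis

def best_helix_mapping(density_lengths, sequence_helix_sizes):
    """Assign each density helix to a distinct sequence helix.

    density_lengths: helix lengths measured in the density map (Angstroms).
    sequence_helix_sizes: residue counts of helices predicted from the sequence.
    Returns (mapping, cost); mapping[i] is the sequence helix index assigned
    to density helix i, and cost is the total mismatch in residues.
    """
    expected = [length / RISE_PER_RESIDUE for length in density_lengths]
    best_cost, best_perm = float("inf"), None
    # Brute force: feasible only for a handful of helices.
    for perm in permutations(range(len(sequence_helix_sizes)), len(expected)):
        cost = sum(abs(expected[i] - sequence_helix_sizes[j])
                   for i, j in enumerate(perm))
        if cost < best_cost:
            best_cost, best_perm = cost, perm
    return best_perm, best_cost

# Invented example: three helices seen in the map, three predicted in sequence.
mapping, cost = best_helix_mapping([30.0, 15.0, 22.5], [11, 14, 21])
print(mapping, cost)  # (2, 0, 1) 3.0
```

The number of assignments grows factorially with the number of helices, which is why the text lists constraint optimization and efficient algorithm design among the skills the project needs.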
Approaches for modeling and visualizing structural dynamics. Many biomolecular functions cannot be understood even when the static structures involved are known; deeper understanding requires approaches that can elucidate molecular motions. CMB faculty are combining experiment and computation to investigate the driving forces that underlie functional motions in macromolecules. One project focuses on the ribosome – a well-known target for antibiotics, genetic diseases, and anti-cancer drugs; however, the mechanism of action of this "supramolecular assemblage" is not yet understood at the atomic level. The ISU groups of Jernigan and Culver are working together to simulate the functional motions of the ribosome, focusing on two specific problems: 1) understanding the roles and interactions of individual components of the structure, both in ribosome assembly and in the synthesis process, and 2) visualizing the mechanism of protein synthesis by bridging the gap between current coarse-grained simulations of its motions and the atomic details.
The Culver laboratory has used biochemical, structural and genetic approaches to dissect the pathway, dynamics, and components involved in 30S ribosomal subunit assembly. Her group has developed conditions for in vitro reconstitution of functional 30S subunits from a complete set of 21 recombinant proteins and 16S rRNA. In collaboration with Jernigan's group, the tools developed in Culver's lab are being used to test the hypothesis that structural changes manifested in biochemical modification data can be related to the structure and collective motions of partially assembled ribonucleoproteins. Jernigan has pioneered the use of coarse-grained protein models and simplified force fields to describe the molecular motions of large proteins. These computational methods are based on elastic network models that are particularly well-suited for analyzing the dynamics of large macromolecular assemblies and have been extremely successful in studies on the stability and dynamics of proteins, despite their mathematical simplicity. Their predictions usually show significantly better agreement with experimental X-ray data than do fully atomic Molecular Dynamics simulations.
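To make the "mathematical simplicity" of elastic network models concrete, the sketch below implements a Gaussian network model, one common member of that family. The text does not say which variant is used for the ribosome work, so treat this as an illustration only. Each residue is a node, nodes within a cutoff distance are joined by identical springs, and the relative mean-square fluctuation of each node is read from the diagonal of the pseudo-inverse of the resulting connectivity (Kirchhoff) matrix. The 7.0 Angstrom cutoff and the straight-chain test coordinates are assumptions chosen for the example.

```python
import numpy as np

def gnm_fluctuations(coords, cutoff=7.0):
    """Relative mean-square fluctuations from a Gaussian network model.

    coords: (N, 3) array of node positions, e.g. C-alpha atoms, in Angstroms.
    cutoff: distance below which two nodes are connected by a spring (assumed).
    Returns an array of N values proportional to the predicted fluctuations.
    """
    coords = np.asarray(coords, dtype=float)
    n = len(coords)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    contact = (dist <= cutoff) & ~np.eye(n, dtype=bool)

    # Kirchhoff matrix: -1 for each contact, node degree on the diagonal.
    kirchhoff = -contact.astype(float)
    np.fill_diagonal(kirchhoff, contact.sum(axis=1))

    # The matrix is singular (rigid-body mode), so use the pseudo-inverse.
    return np.diag(np.linalg.pinv(kirchhoff)).copy()

# Toy structure: ten nodes on a straight line, 3.8 Angstroms apart.
chain = [[3.8 * i, 0.0, 0.0] for i in range(10)]
fluct = gnm_fluctuations(chain)
print(np.round(fluct / fluct.max(), 2))
```

For this toy chain the terminal nodes fluctuate most and the central nodes least. Values of this kind are what is typically compared with crystallographic temperature factors when such models are assessed against X-ray data.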
Recently, Jernigan has developed a mixed coarse-grained method in which the 'interesting' parts of proteins, those responsible for their functions, are modeled at a higher resolution than the remainder of the structure. Such models make it possible to focus on the details of the biologically important parts of these molecules and on the stability and dynamics associated with their function. Performing calculations with an increased level of detail for the most important functional parts should yield substantial new information about their functions, stabilities and dynamics. Importantly, these computational results can be combined with high-resolution chemical footprinting data to build models that describe the mechanism of ribosome assembly and, ultimately, the mechanism of protein synthesis.
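The idea of mixing resolutions can be sketched as a node selection step: residues in the functional region each become their own node, while the rest of the chain is grouped into coarser blocks. The function below is a hypothetical illustration of that partitioning only. The block size, the residue indices, and the grouping rule are assumptions, not details of the published method.

```python
def mixed_resolution_nodes(n_residues, functional, block=4):
    """Partition residues 0..n_residues-1 into model nodes.

    functional: set of residue indices kept at full (one residue per node)
    resolution. All other residues are grouped into consecutive blocks of up
    to `block` residues, each block forming a single coarse node.
    Returns a list of lists of residue indices, one inner list per node.
    """
    nodes = []
    i = 0
    while i < n_residues:
        if i in functional:
            nodes.append([i])  # high-resolution node
            i += 1
        else:
            group = []
            # Extend the coarse block until it is full or reaches the
            # functional region or the end of the chain.
            while i < n_residues and i not in functional and len(group) < block:
                group.append(i)
                i += 1
            nodes.append(group)
    return nodes

# Invented example: 12 residues, residues 4-6 form the functional site.
nodes = mixed_resolution_nodes(12, {4, 5, 6})
print(nodes)  # [[0, 1, 2, 3], [4], [5], [6], [7, 8, 9, 10], [11]]
```

Here twelve residues are reduced to six nodes, with all of the savings coming from outside the functional site.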
A third component of this project is development of a virtual reality environment for visualizing the structure and motions of the 30S ribosome. Adrian Sannier's group is developing an immersive modeling and docking environment, CAVEMol, utilizing ISU’s Virtual Reality Applications Center C6 facility, a 6-wall, stereoscopic and fully immersive virtual environment. Interactive docking in this immersive environment offers significant advantages over docking with conventional viewers and is ideal for modeling movement within a molecule. Using CAVEMol to visualize the simulations generated by Jernigan's group in the context of the entire 30S subunit will permit facile evaluation of how well elastic network models "fit" biophysical data. The combination of computational modeling, biochemical probing, and interactive docking in an immersive virtual reality environment offers an unprecedented opportunity to advance our understanding of the structure and dynamics of the ribosome. The essential bridging tools developed in this work will open up new ways to address many other complex problems in molecular cell biology and provide a physically meaningful framework to connect structure and function.