linked to PubMed where applicable.
To examine the possible relationship of guanine-dependent GpA conformations with ribonucleotide cleavage, two potential of mean force (PMF) calculations were performed in aqueous solution. In the first calculation, the guanosine glycosidic (Gchi) angle was used as the reaction coordinate, and computations were performed on two GpA ionic species: protonated (neutral) and deprotonated (negatively charged) guanosine ribose O2'. Similar energetic profiles featuring two minima corresponding to the anti and syn Gchi regions were obtained for both ionic forms. For both forms the anti conformation was more stable than the syn, and barriers of approximately 4 kcal/mol were obtained for the anti-to-syn transition. Structural analysis showed a remarkable sensitivity of the phosphate moiety to the conformation of the Gchi angle, suggesting a possible connection between this conformation and the mechanism of ribonucleotide cleavage. This hypothesis was confirmed by the second PMF calculation, in which the O2'-P distance for the deprotonated GpA was used as the reaction coordinate. The computations were performed from two selected starting points: the anti and syn minima determined in the first PMF study of the deprotonated guanosine ribose O2'. The simulations revealed that the O2' attack along the syn Gchi was more favorable than that along the anti Gchi: energetically, significantly lower barriers were obtained in the syn than in the anti conformation for O-P bond formation; structurally, a shorter initial O2'-P distance and an orientation better suited to an in-line attack were observed in the syn relative to the anti conformation. 
These results are consistent with the catalytically competent conformation of the barnase-ribonucleotide complex, which requires a guanine syn conformation of the substrate to enable abstraction of the ribose H1' proton by the general base Glu73, thereby suggesting a coupling between the reactive substrate conformation and enzyme structure and mechanism. (c) 2007 Wiley-Liss, Inc.
The performance of methods for predicting protein-protein interactions at the atomic scale is assessed by evaluating blind predictions performed during 2005-2007 as part of Rounds 6-12 of the community-wide experiment on Critical Assessment of PRedicted Interactions (CAPRI). These Rounds also included a new scoring experiment, where a larger set of models contributed by the predictors was made available to groups developing scoring functions. These groups scored the uploaded set and submitted their own best models for assessment. The structures of nine protein complexes, including one homodimer, were used as targets. These targets represent biologically relevant interactions involved in gene expression, signal transduction, RNA or protein processing, and membrane maintenance. For all the targets except one, predictions started from the experimentally determined structures of the free (unbound) components or from models derived by homology, making it mandatory for docking methods to model the conformational changes that often accompany association. In total, 63 groups and eight automatic servers, a substantial increase from previous years, submitted docking predictions, of which 1994 were evaluated here. Fifteen groups submitted 305 models for five targets in the scoring experiment. Assessment of the predictions reveals that 31 different groups produced models of acceptable and medium accuracy (but only one high-accuracy submission) for all the targets except the homodimer. In the latter, none of the docking procedures reproduced the large conformational adjustment required for correct assembly, underscoring yet again that handling protein flexibility remains a major challenge. In the scoring experiment, a large fraction of the groups attained the set goal of singling out the correct association modes from incorrect solutions in the limited ensembles of contributed models. 
But in general they seemed unable to identify the best models, indicating that current scoring methods are probably not sensitive enough. With the increased focus on protein assemblies, in particular by structural genomics efforts, the growing community of CAPRI predictors is engaged more actively than ever in the development of better scoring functions and means of modeling conformational flexibility, which hold promise for much progress in the future. (c) 2007 Wiley-Liss, Inc.
BACKGROUND: In structural genomics, an important goal is the detection and classification of protein-protein interactions, given the structures of the interacting partners. We have developed empirical energy functions to identify native structures of protein-protein complexes among sets of decoy structures. To understand the role of amino acid diversity, we parameterized a series of functions, using a hierarchy of amino acid alphabets of increasing complexity, with 2, 3, 4, 6, and 20 amino acid groups. Compared to previous work, we used the simplest possible functional form, with residue-residue interactions and a stepwise distance-dependence. We used increased computational resources, however, constructing 290,000 decoys for 219 protein-protein complexes, with a realistic docking protocol where the protein partners are flexible and interact through a molecular mechanics energy function. The energy parameters were optimized to correctly assign as many native complexes as possible. To resolve the multiple minimum problem in parameter space, over 64000 starting parameter guesses were tried for each energy function. The optimized functions were tested by cross validation on subsets of our native and decoy structures, by blind tests on series of native and decoy structures available on the Web, and on models for 13 complexes submitted to the CAPRI structure prediction experiment. RESULTS: Performance is similar to several other statistical potentials of the same complexity. For example, the CAPRI target structure is correctly ranked ahead of 90% of its decoys in 6 cases out of 13. The hierarchy of amino acid alphabets leads to a coherent hierarchy of energy functions, with qualitatively similar parameters for similar amino acid types at all levels. Most remarkably, the performance with six amino acid classes is equivalent to that of the most detailed, 20-class energy function. 
CONCLUSION: This suggests that six carefully chosen amino acid classes are sufficient to encode specificity in protein-protein interactions, and provide a starting point to develop more complicated energy functions.
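The functional form described in this abstract (residue-residue interactions over a reduced amino acid alphabet, with a stepwise distance dependence) can be illustrated with a minimal sketch. The two-class grouping, the distance bins, and every energy value below are hypothetical placeholders chosen for illustration; they are not the optimized parameters or the alphabets reported by the authors.

```python
# Hypothetical 2-class alphabet: hydrophobic (H) vs polar (P).
CLASS_OF = {aa: "H" for aa in "AVLIMFWC"}
CLASS_OF.update({aa: "P" for aa in "GSTYNQDEKRHP"})

# Hypothetical stepwise parameters: one energy per class pair and distance bin.
BIN_EDGES = (4.0, 6.0, 8.0)  # upper bin edges in angstroms (assumed)
ENERGY = {
    ("H", "H"): (-1.0, -0.5, -0.1),
    ("H", "P"): (0.3, 0.1, 0.0),
    ("P", "P"): (-0.2, -0.1, 0.0),
}


def pair_energy(aa1, aa2, distance):
    """Stepwise energy for one residue pair; zero beyond the last bin."""
    key = tuple(sorted((CLASS_OF[aa1], CLASS_OF[aa2])))
    for edge, energy in zip(BIN_EDGES, ENERGY[key]):
        if distance < edge:
            return energy
    return 0.0


def interface_energy(contacts):
    """Sum pair energies over (residue1, residue2, distance) tuples."""
    return sum(pair_energy(a, b, d) for a, b, d in contacts)


def rank_of_native(native, decoys):
    """Number of decoys scoring strictly lower than the native (0 = native ranked first)."""
    e_native = interface_energy(native)
    return sum(1 for decoy in decoys if interface_energy(decoy) < e_native)
```

A parameter set would be judged by how often `rank_of_native` returns 0 (or a small value) across many complexes, which mirrors the stated goal of ranking native complexes ahead of their decoys.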
CAPRI is a community-wide experiment to test protein-protein docking methods in blind predictions. The Toronto meeting assessed structure predictions made from 2005-2007 on nine target protein-protein complexes or homodimers, and reported new developments in functions used to score predicted interactions, in treatment of conformational flexibility, and in taking nonstructural information into account in the predictions.
BACKGROUND: Most methods for predicting functional sites in protein 3D structures rely on information on related proteins and cannot be applied to proteins with no known relatives. Another limitation of these methods is the lack of a well-annotated set of functional sites to use as a benchmark for validating their predictions. Experimental findings and theoretical considerations suggest that residues involved in function often contribute unfavorably to the native state stability. We examine the possibility of systematically exploiting this intrinsic property to identify functional sites using an original procedure that detects destabilizing regions in protein structures. In addition, to relate destabilizing regions to known functional sites, a novel benchmark consisting of a diverse set of hand-curated protein functional sites is derived. RESULTS: A procedure for detecting clusters of destabilizing residues in protein structures is presented. Individual residue contributions to protein stability are evaluated using detailed atomic models and a force-field successfully applied in computational protein design. The most destabilizing residues, and some of their closest neighbours, are clustered into destabilizing regions following a rigorous protocol. Our procedure is applied to high quality apo-structures of 63 unrelated proteins. The biologically relevant binding sites of these proteins were annotated using all available information, including structural data and literature curation, resulting in the largest hand-curated data set of binding sites in proteins available to date. Comparing the destabilizing regions with the annotated binding sites in these proteins, we find that the overlap is on average limited, but significantly better than random. Results depend on the type of bound ligand. Significant overlap is obtained for most polysaccharide- and small ligand-binding sites, whereas no overlap is observed for most nucleic acid binding sites. 
These differences are rationalised in terms of the geometry and energetics of the binding site. CONCLUSION: We find that although destabilizing regions as detected here can in general not be used to predict binding sites in protein structures, they can provide useful information, particularly on the location of functional sites that bind polysaccharides and small ligands. This information can be exploited in methods for predicting function in protein structures with no known relatives. Our publicly available benchmark of hand-curated functional sites in proteins should help other workers derive and validate new prediction methods.
Genetic analysis of a large Indian family with an autosomal dominant cataract phenotype allowed us to identify a novel cataract gene, CRYBA4. After a genomewide screen, linkage analysis identified a maximum LOD score of 3.20 (recombination fraction [theta] 0.001) with marker D22S1167 of the beta-crystallin gene cluster on chromosome 22. To date, CRYBA4 was the only gene in this cluster not associated with either human or murine cataracts. A pathogenic mutation was identified in exon 4 that segregated with the disease status. The c.317T>C sequence change is predicted to replace the highly conserved hydrophobic amino acid phenylalanine94 with the hydrophilic amino acid serine. Modeling suggests that this substitution would significantly reduce the intrinsic stability of the crystallin monomer, which would impair its ability to form the association modes critical for lens transparency. Considering that CRYBA4 associates with CRYBB2 and that the latter protein has been implicated in microphthalmia, mutational analysis of CRYBA4 was performed in 32 patients affected with microphthalmia (small eye). We identified a c.242T>C (Leu69Pro) sequence change in exon 4 in one patient, which is predicted here to disrupt the beta-sheet structure in CRYBA4. Protein folding would consequently be impaired, most probably leading to a structure with reduced stability in the mutant. This is the first report linking mutations in CRYBA4 to cataractogenesis and microphthalmia.
The current status of docking procedures for predicting protein-protein interactions starting from their three-dimensional (3D) structure is reassessed by evaluating blind predictions, performed during 2003-2004 as part of Rounds 3-5 of the community-wide experiment on Critical Assessment of PRedicted Interactions (CAPRI). Ten newly determined structures of protein-protein complexes were used as targets for these rounds. They comprised 2 enzyme-inhibitor complexes, 2 antigen-antibody complexes, 2 complexes involved in cellular signaling, 2 homo-oligomers, and a complex between 2 components of the bacterial cellulosome. For most targets, the predictors were given the experimental structures of 1 unbound and 1 bound component, with the latter in a random orientation. For some, the structure of the free component was derived from that of a related protein, requiring the use of homology modeling. In some of the targets, significant differences in conformation were displayed between the bound and unbound components, representing a major challenge for the docking procedures. For 1 target, predictions could not go to completion. In total, 1866 predictions submitted by 30 groups were evaluated. Over one-third of these groups applied completely novel docking algorithms and scoring functions, with several of them specifically addressing the challenge of dealing with side-chain and backbone flexibility. The quality of the predicted interactions was evaluated by comparison to the experimental structures of the targets, made available for the evaluation, using the well-agreed-upon criteria used previously. Twenty-four groups, which for the first time included an automatic Web server, produced predictions ranging from acceptable to highly accurate for all targets, including those where the structures of the bound and unbound forms differed substantially. 
These results and a brief survey of the methods used by participants of CAPRI Rounds 3-5 suggest that genuine progress in the performance of docking methods is being achieved, with CAPRI acting as the catalyst.
Increasingly complex schemes for representing solvent effects in an implicit fashion are being used in computational analyses of biological macromolecules. These schemes speed up the calculations by orders of magnitude and are assumed to compromise little on essential features of the solvation phenomenon. In this work we examine this assumption. Five implicit solvation models, a surface area-based empirical model, two models that approximate the generalized Born treatment and a finite difference Poisson-Boltzmann method are challenged in situations differing from those where these models were calibrated. These situations are encountered in automatic protein design procedures, whose job is to select sequences, which stabilize a given protein 3D structure, from a large number of alternatives. To this end we evaluate the energetic cost of burying amino acids in thousands of environments with different solvent exposures belonging, respectively, to decoys built with random sequences and to native protein crystal structures. In addition we perform actual sequence design calculations. Except for the crudest surface area-based procedure, all the tested models tend to favor the burial of polar amino acids in the protein interior over nonpolar ones, a behavior that leads to poor performance in protein design calculations. We show, on the other hand, that three of the examined models are nonetheless capable of discriminating between the native fold and many nonnative alternatives, a test commonly used to validate force fields. It is concluded that protein design is a particularly challenging test for implicit solvation models because it requires accurate estimates of the solvation contribution of individual residues. This contrasts with native recognition, which depends less on solvation and more on other nonbonded contributions.
Given the increasing interest in protein-protein interactions, the prediction of these interactions from sequence and structural information has become a booming activity. CAPRI, the community-wide experiment for assessing blind predictions of protein-protein interactions, is playing an important role in fostering progress in docking procedures. At the same time, novel methods are being derived for predicting regions of a protein that are likely to interact and for characterizing putative intermolecular contacts from sequence and structural data. Together with docking procedures, these methods provide an integrated computational approach that should be a valuable complement to genome-scale experimental studies of protein-protein interactions.
CCR5 is a G protein-coupled receptor responding to four natural agonists, the chemokines RANTES (regulated on activation normal T cell expressed and secreted), macrophage inflammatory protein (MIP)-1 alpha, MIP-1 beta, and monocyte chemotactic protein (MCP)-2, and is the main co-receptor for the macrophage-tropic human immunodeficiency virus strains. We have previously identified a structural motif in the second transmembrane helix of CCR5, which plays a crucial role in the mechanism of receptor activation. We now report the specific role of aromatic residues in helices 2 and 3 of CCR5 in this mechanism. Using site-directed mutagenesis and molecular modeling in a combined approach, we demonstrate that a cluster of aromatic residues at the extracellular border of these two helices are involved in chemokine-induced activation. These aromatic residues are involved in interhelical interactions that are key for the conformation of the helices and govern the functional response to chemokines in a ligand-specific manner. We therefore suggest that transmembrane helices 2 and 3 contain important structural elements for the activation mechanism of chemokine receptors, and possibly other related receptors as well.
CAPRI is a communitywide experiment to assess the capacity of protein-docking methods to predict protein-protein interactions. Nineteen groups participated in rounds 1 and 2 of CAPRI and submitted blind structure predictions for seven protein-protein complexes based on the known structure of the component proteins. The predictions were compared to the unpublished X-ray structures of the complexes. We describe here the motivations for launching CAPRI, the rules that we applied to select targets and run the experiment, and some conclusions that can already be drawn. The results stress the need for new scoring functions and for methods handling the conformation changes that were observed in some of the target systems. CAPRI has already been a powerful drive for the community of computational biologists who develop docking algorithms. We hope that this issue of Proteins will also be of interest to the community of structural biologists, which we call upon to provide new targets for future rounds of CAPRI, and to all molecular biologists who view protein-protein recognition as an essential process. Copyright 2003 Wiley-Liss, Inc.
The current status of docking procedures for predicting protein-protein interactions starting from their three-dimensional structure is assessed from a first major evaluation of blind predictions. This evaluation was performed as part of a communitywide experiment on Critical Assessment of PRedicted Interactions (CAPRI). Seven newly determined structures of protein-protein complexes were available as targets for this experiment. These were the complexes between a kinase and its protein substrate, between a T-cell receptor beta-chain and a superantigen, and five antigen-antibody complexes. For each target, the predictors were given the experimental structures of the free components, or of one free and one bound component in a random orientation. The structure of the complex was revealed only at the time of the evaluation. A total of 465 predictions submitted by 19 groups were evaluated. These groups used a wide range of algorithms and scoring functions, some of which were completely novel. The quality of the predicted interactions was evaluated by comparing residue-residue contacts and interface residues to those in the X-ray structures and by analyzing the fit of the ligand molecules (the smaller of the two proteins in the complex) or of interface residues only, in the predicted versus target complexes. A total of 14 groups produced predictions ranging from acceptable to highly accurate for five of the seven targets. The use of available biochemical and biological information, and in one instance structural information, played a key role in achieving this result. It was essential for identifying the native binding modes for the five correctly predicted targets, including the kinase-substrate complex where the enzyme changes conformation on association. But it was also the cause for missing the correct solution for the two remaining unpredicted targets, which involve unexpected antigen-antibody binding modes. 
Overall, this analysis reveals genuine progress in docking procedures but also illustrates the remaining serious limitations and points out the need for better scoring functions and more effective ways for handling conformational flexibility. Copyright 2003 Wiley-Liss, Inc.
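One of the quality measures mentioned in this abstract, the comparison of residue-residue contacts between a predicted complex and the X-ray structure, can be sketched as a fraction of recovered native contacts. The 5 angstrom cutoff, the dictionary-based coordinate layout, and the function names are assumptions made for illustration; the abstract does not state the thresholds or data formats used in the actual assessment.

```python
import math


def residue_contacts(receptor, ligand, cutoff=5.0):
    """Return the set of (receptor_residue, ligand_residue) pairs in contact.

    receptor, ligand: dict mapping a residue id to a list of (x, y, z) atom
    coordinates. Two residues are in contact when any pair of their atoms
    lies within `cutoff` angstroms (an assumed threshold).
    """
    contacts = set()
    for r_id, r_atoms in receptor.items():
        for l_id, l_atoms in ligand.items():
            if any(math.dist(a, b) <= cutoff for a in r_atoms for b in l_atoms):
                contacts.add((r_id, l_id))
    return contacts


def fraction_native_contacts(native_contacts, predicted_contacts):
    """Fraction of the native contacts that the prediction reproduces."""
    native = set(native_contacts)
    if not native:
        raise ValueError("native complex has no contacts")
    return len(native & set(predicted_contacts)) / len(native)
```

A prediction that places the ligand in the correct binding mode recovers most native contacts and scores close to 1, while an unrelated binding mode scores close to 0.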
Homology modeling in combination with transmembrane topology predictions is used to build the atomic model of Neurospora crassa plasma membrane H+-ATPase, using as template the 2.6 Å crystal structure of rabbit sarcoplasmic reticulum Ca2+-ATPase [Toyoshima, C., Nakasako, M., Nomura, H. & Ogawa, H. (2000) Nature 405, 647-655]. Comparison of the two calcium-binding sites in the crystal structure of Ca2+-ATPase with the equivalent region in the H+-ATPase model shows that the latter is devoid of most of the negatively charged groups required to bind the cations, suggesting a different role for this region. Using the built model, a pathway for proton transport is then proposed from computed locations of internal polar cavities, large enough to contain at least one water molecule. As a control, the same approach is applied to the high-resolution crystal structure of halorhodopsin and the proton pump bacteriorhodopsin. This revealed a striking correspondence between the positions of internal polar cavities, those of crystallographic water molecules and, in the case of bacteriorhodopsin, the residues mediating proton translocation. In our H+-ATPase model, most of these cavities are in contact with residues previously shown to affect coupling of proton translocation to ATP hydrolysis. A string of six polar cavities identified in the cytoplasmic domain, the most accurate part of the model, suggests a proton entry path starting close to the phosphorylation site. Strikingly, members of the haloacid dehalogenase superfamily, which are close structural homologs of this domain but do not share the same function, display only one polar cavity in the vicinity of the conserved catalytic Asp residue.
An automatic protein design procedure was used to compute amino acid sequences of peptides likely to bind the HLA-A2 major histocompatibility complex (MHC) class I allele. The only information used by the procedure is a structural template, a rotamer library, and a well-established classical empirical force field. The calculations are performed on six different templates from X-ray structures of HLA-A0201-peptide complexes. Each template consists of the bound peptide backbone and the full atomic coordinates of the MHC protein. Sequences within 2 kcal/mol of the minimum energy sequence are computed for each template, and the sequences from all the templates are combined and ranked by their energies. The five lowest energy peptide sequences and five other low energy sequences re-ranked on the basis of their similarity to peptides known to bind the same MHC allele are chemically synthesized and tested for their ability to bind and form stable complexes with the HLA-A2 molecule. The most efficient binders are also tested for inhibition of the T cell receptor recognition of two known CD8(+) T effectors. Results show that all 10 peptides bind the expected MHC protein. The six strongest binders also form stable HLA-A2-peptide complexes, albeit to varying degrees, and three peptides display significant inhibition of CD8(+) T cell recognition. These results are rationalized in light of our knowledge of the three-dimensional structures of the HLA-A2-peptide and HLA-A2-peptide-T cell receptor complexes.
This review describes computational procedures for deriving the amino acid sequences that are compatible with a given protein backbone structure. Such procedures can be used to gain insight into the constraints imposed by the 3D structure on the protein sequence, or to design proteins that are likely to adopt a given backbone conformation. We start by presenting a short overview of the various types of approaches to protein design developed over more than a decade. This is followed by a more detailed presentation of a recently developed sequence selection procedure, DESIGNER. This latter presentation illustrates the basic principles underlying this type of procedure, describes what such procedures may teach us when applied to small proteins, and highlights issues that need to be addressed in order to go forward.
The thyrotropin (TSH) receptor is an interesting model to study G protein-coupled receptor activation as many point mutations can significantly increase its basal activity. Here, we identified a molecular interaction between Asp(633) in transmembrane helix 6 (TM6) and Asn(674) in TM7 of the TSHr that is crucial to maintain the inactive state through conformational constraint of the Asn. We show that these residues are perfectly conserved in the glycohormone receptor family, except in one case, where they are exchanged, suggesting a direct interaction. Molecular modeling of the TSHr, based on the high resolution structure of rhodopsin, strongly favors this hypothesis. Our approach combining site-directed mutagenesis with molecular modeling shows that mutations disrupting this interaction, like the D633A mutation in TM6, lead to high constitutive activation. The strongly activating N674D (TM7) mutation, which in our modeling breaks the TM6-TM7 link, is reverted to wild type-like behavior by an additional D633N mutation (TM6), which would restore this link. Moreover, we show that the Asn of TM7 (conserved in most G protein-coupled receptors) is mandatory for ligand-induced cAMP accumulation, suggesting an active role of this residue in activation. In the TSHr, the conformation of this Asn residue of TM7 would be constrained, in the inactive state, by its Asp partner in TM6.
CCR5 is a G-protein-coupled receptor activated by the chemokines RANTES (regulated on activation normal T cell expressed and secreted), macrophage inflammatory protein 1alpha and 1beta, and monocyte chemotactic protein 2 and is the main co-receptor for the macrophage-tropic human immunodeficiency virus strains. We have identified a sequence motif (TXP) in the second transmembrane helix of chemokine receptors and investigated its role by theoretical and experimental approaches. Molecular dynamics simulations of model alpha-helices in a nonpolar environment were used to show that a TXP motif strongly bends these helices, due to the coordinated action of the proline, which kinks the helix, and of the threonine, which further accentuates this structural deformation. Site-directed mutagenesis of the corresponding Pro and Thr residues in CCR5 allowed us to probe the consequences of these structural findings in the context of the whole receptor. The P84A mutation leads to a decreased binding affinity for chemokines and nearly abolishes the functional response of the receptor. In contrast, mutation of Thr-82(2.56) into Val, Ala, Cys, or Ser does not affect chemokine binding. However, the functional response was found to depend strongly on the nature of the substituted side chain. The rank order of impairment of receptor activation is P84A > T82V > T82A > T82C > T82S. This ranking of impairment parallels the bending of the alpha-helix observed in the molecular simulation study.
The most abundant alpha-amylase inhibitor (AAI) present in the seeds of Amaranthus hypochondriacus, a variety of the Mexican crop plant amaranth, is the smallest polypeptide (32 residues) known to inhibit alpha-amylase activity of insect larvae while leaving that of mammals unaffected. In solution, 1H NMR reveals that AAI isolated from amaranth seeds adopts a major trans (70%) and minor cis (30%) conformation, resulting from slow cis-trans isomerization of the Val15-Pro16 peptide bond. Both solution structures have been determined using 2D 1H-NMR spectroscopy and XPLOR followed by restrained energy refinement in the consistent-valence force field. For the major isomer, a total of 563 distance restraints, including 55 medium-range and 173 long-range ones, were available from the NOESY spectra. This rather large number of constraints from a protein of such a small size results from a compact fold, imposed through three disulfide bridges arranged in a cysteine-knot motif. The structure of the minor cis isomer has also been determined using a smaller constraint set. It reveals a different backbone conformation in the Pro10-Pro20 segment, while preserving the overall global fold. The energy-refined ensemble of the major isomer, consisting of 20 low-energy conformers with an average backbone rmsd of 0.29 ± 0.19 Å and no violations larger than 0.4 Å, represents a considerable improvement in precision over a previously reported and independently performed calculation on AAI obtained through solid-phase synthesis, which was determined with only half the number of medium-range and long-range restraints reported here, and featured the trans isomer only. The resulting differences in ensemble precision have been quantified locally and globally, indicating that, for regions of the backbone and a good fraction of the side chains, the conformation is better defined in the new solution structure. 
Structural comparison of the solution structure with the X-ray structure of the inhibitor when bound to its alpha-amylase target in Tenebrio molitor shows that the backbone conformation is only slightly adjusted on complexation, while that of the side chains involved in protein-protein contacts is similar to those present in solution. Therefore, the overall conformation of AAI appears to be predisposed to binding to its target alpha-amylase, confirming the view that it acts as a lid on top of the alpha-amylase active site.
A fully automatic procedure for predicting the amino acid sequences compatible with a given target structure is described. It is based on the CHARMM package, and uses an all atom force-field and rotamer libraries to describe and evaluate side-chain types and conformations. Sequences are ranked by a quantity akin to the free energy of folding, which incorporates hydration effects. Exact (Branch and Bound) and heuristic optimisation procedures are used to identify high-scoring sequences from an astronomical number of possibilities. These sequences include the minimum free energy sequence, as well as all amino acid sequences whose free energy lies within a specified window from the minimum. Several applications of our procedure are illustrated. Prediction of side-chain conformations for a set of ten proteins yields results comparable to those of established side-chain placement programs. Applications to sequence optimisation comprise the re-design of the protein cores of c-Crk SH3 domain, the B1 domain of protein G and Ubiquitin, and of surface residues of the SH3 domain. In all calculations, no restrictions are imposed on the amino acid composition and identical parameter settings are used for core and surface residues. The best scoring sequences for the protein cores are virtually identical to wild-type. They feature no more than one to three mutations in a total of 11-16 variable positions. Tests suggest that this is due to the balance between various contributions in the force-field rather than to overwhelming influence from packing constraints. The effectiveness of our force-field is further supported by the sequence predictions for surface residues of the SH3 domain. More mutations are predicted than in the core, seemingly in order to optimise the network of complementary interactions between polar and charged groups. 
This appears to be an important energetic requirement in the absence of the partner molecules with which the SH3 domain interacts, which were not included in the calculations. Finally, a detailed comparison between the sequences generated by the heuristic and exact optimisation algorithms warrants a note of caution concerning the efficiency of heuristic procedures in exploring sequence space. Copyright 2000 Academic Press.
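The idea of reporting the minimum free energy sequence together with all sequences within a specified energy window can be sketched on a toy problem. The sketch below enumerates every sequence by brute force, which is only feasible for a handful of positions; it is not the Branch and Bound or heuristic search described in the abstract, and the `energy` callable and the per-position choices are hypothetical stand-ins for the force-field evaluation.

```python
from itertools import product


def sequences_within_window(choices, energy, window):
    """List (energy, sequence) pairs within `window` of the minimum energy.

    choices: one iterable of allowed residue letters per variable position.
    energy:  callable taking a tuple of residue letters and returning a float
             (a hypothetical stand-in for a force-field score).
    window:  energy range above the minimum to keep, in the same units.
    """
    scored = [(energy(seq), "".join(seq)) for seq in product(*choices)]
    e_min = min(e for e, _ in scored)
    # Sort so that the minimum energy sequence comes first.
    return sorted((e, s) for e, s in scored if e - e_min <= window)
```

Because the exhaustive list is exact for small cases, it can serve as a reference when checking whether a heuristic search misses low-energy sequences, which is the kind of comparison the abstract cautions about.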
The clearance of seven different ligands from the deeply buried active site of Torpedo californica acetylcholinesterase is investigated by combining multiple copy sampling molecular dynamics simulations with the analysis of protein-ligand interactions, protein motion and the electrostatic potential sampled by the ligand copies along their journey outwards. The considered ligands are the cations ammonium, methylammonium, and tetramethylammonium, the hydrophobic methane and neopentane, and the anionic product acetate and its neutral form, acetic acid. We find that the pathways explored by the different ligands vary with ligand size and chemical properties. Very small ligands, such as ammonium and methane, exit through several routes. One involves the main exit through the mouth of the enzyme gorge, another is through the so-called back door near Trp84, and a third uses a side door at a direction of approximately 45 degrees to the main exit. The larger polar ligands, methylammonium and acetic acid, leave through the main exit, but the bulkiest, tetramethylammonium and neopentane, as well as the smaller acetate ion, remain trapped in the enzyme gorge during the time of the simulations. The pattern of protein-ligand contacts during the diffusion process is highly non-random and differs for different ligands. The majority are made with aromatic side-chains, but classical H-bonds are also formed. In the case of acetate, but not acetic acid, the anionic and neutral form, respectively, of one of the reaction products, specific electrostatic interactions with protein groups seem to slow ligand motion and interfere with protein flexibility; protonation of the acetate ion is therefore suggested to facilitate clearance. The Poisson-Boltzmann formalism is used to compute the electrostatic potential of the thermally fluctuating acetylcholinesterase protein at positions actually visited by the diffusing ligand copies.
Ligands of different charge and size are shown to sample somewhat different electrostatic potentials during their migration, because they explore different microscopic routes. The potential along the clearance route of a cation such as methylammonium displays two clear minima at the active and peripheral anionic site. We find moreover that the electrostatic energy barrier that the cation needs to overcome when moving between these two sites is small in both directions, being of the order of the ligand kinetic energy. The peripheral site thus appears to play a role in trapping inbound cationic ligands as well as in cation clearance, and hence in product release. Copyright 2000 Academic Press.
Barnase, an extracellular endoribonuclease from Bacillus amyloliquefaciens, hydrolyses single-stranded RNA. Its very low catalytic activity toward GpN dinucleotides, where N stands for any nucleoside, is markedly increased when a phosphate is added to the 3'-end, as in GpNp. Here we investigate the conformational properties of GpA and GpAp in solution, in order to determine whether differences in these properties may be related to the changes in enzymatic activity. Two independent 1.3 ns molecular dynamics trajectories are generated for each dinucleotide in the presence of explicit water molecules and counter ions. These trajectories are analysed by monitoring molecular properties, such as the solvent accessible surface area, the distance and orientation between the bases, the behaviour of torsion angles and formation of intramolecular H-bonds. To identify relevant correlations between these parameters, statistical techniques, comprising multiple regression, clustering and discriminant analysis, are used. Results show that GpA has a significant propensity to form folded conformations (approximately 50%), fostered by a small number of intramolecular H-bonds, whereas GpAp remains essentially extended. The latter behaviour seems to be due to an H-bond between the terminal phosphate and adenosine ribose group, which restricts rotation about the adenine Agamma angle. We also find that GpA folding is induced by a concerted motion of specific torsion angles, which is closely coupled to the formation of a network of flexible hydrogen bonds. Finally, on the basis of an expression for barnase KM, which incorporates the folded/extended conformational equilibria of the dinucleotide substrates, it is argued that our findings on the differences between these equilibria can qualitatively rationalize the experimentally measured differences in enzymatic properties. Copyright 1998 Academic Press.
BACKGROUND: The classical picture of the hydrophobic stabilization of proteins invokes a resemblance between the protein interior and nonpolar solvents, but the extent to which this is the case has often been questioned. The protein interior is believed to be at least as tightly packed as organic crystals, and was shown to have very low compressibility. There is also evidence that these properties are not uniform throughout the protein, and conflicting views exist on the nature of sidechain packing and on its influence on the properties of the protein. RESULTS: In order to probe the physical properties of the protein, the free energy associated with the formation of empty cavities has been evaluated for two proteins: barnase and T4 lysozyme. To this end, the likelihood of encountering such cavities was computed from room temperature molecular dynamics trajectories of these proteins in water. The free energy was evaluated in each protein taken as a whole and in submolecular regions. The computed free energies yielded information on the manner in which empty space is distributed in the system, while the latter undergoes thermal motion, a property hitherto not analyzed in heterogeneous media such as proteins. Our results showed that the free energy of cavity formation is higher in proteins than in both water and hexane, providing direct evidence that the native protein medium differs in fundamental ways from the two liquids. Furthermore, although the packing density was found to be higher in nonpolar regions of the protein than in polar ones, the free energy cost of forming atomic size cavities is significantly lower in nonpolar regions, implying that these regions contain larger chunks of empty space, thereby increasing the likelihood of containing atomic size packing defects. These larger empty spaces occur preferentially where buried hydrophobic sidechains belonging to secondary structures meet one another. 
These particular locations also appear to be more compressible than other parts of the core or surface of the protein. CONCLUSIONS: The cavity free energy calculations described here provide a much more detailed physical picture of the protein matrix than volume and packing calculations. According to this picture, the packing of hydrophobic sidechains is tight in the interior of the protein, but far from uniform. In particular, the packing is tighter in regions where the backbone forms less regular hydrogen-bonding interactions than at interfaces between secondary structure elements, where such interactions are fully developed. This may have important implications on the role of sidechain packing in protein folding and stability.
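The link between the likelihood of encountering an empty cavity and its free energy of formation, as used in the abstract above, can be sketched as follows. The relation dG = -kT ln(p) is the standard one for an insertion probability p; the cavity criterion used here (no atom centre within a given radius of a test point) and all function names are assumptions made for illustration, not the definitions used in the study.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def is_empty(point, atoms, radius):
    """True if no atom centre lies within `radius` of `point` (assumed criterion)."""
    return all(math.dist(point, atom) >= radius for atom in atoms)


def count_empty(points, frames, radius):
    """Count empty test points over all trajectory frames.

    `frames` is a sequence of snapshots, each a list of (x, y, z) atom centres.
    Returns (number of empty trials, total number of trials).
    """
    n_empty = 0
    n_trials = 0
    for atoms in frames:
        for point in points:
            n_trials += 1
            if is_empty(point, atoms, radius):
                n_empty += 1
    return n_empty, n_trials


def cavity_free_energy(n_empty, n_trials, temperature=300.0):
    """Free energy of cavity formation, -kT ln(p), with p = n_empty / n_trials."""
    if n_trials <= 0:
        raise ValueError("need at least one trial")
    if n_empty <= 0:
        return math.inf  # no cavity was ever observed in the sample
    return -K_B * temperature * math.log(n_empty / n_trials)
```

Restricting the test points to a submolecular region (for example, only points near nonpolar side-chains) gives a regional free energy in the same way.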
Database-derived potentials, compiled from frequencies of sequence and structure features, are often used for scoring the compatibility of protein sequences and conformations. It is often believed that these scores correspond to differences in free energy with, in addition, a term containing the partition function of the system. Since this function does not depend on the conformation, the potentials are considered to be valid for scoring the compatibility of different conformations with a given sequence ('forward folding'), but not of sequences with a given structure ('inverted folding'). This interpretation is questioned here. It is argued that when many body-effects, which dominate frequencies compiled from the protein database, are corrected for, the potentials approximate a physically meaningful free energy difference from which the partition function term cancels out. It is the difference between the free energy of a given sequence in a specific conformation and that of the same sequence in a denatured-like state. Two examples of denatured-like states are discussed. Depending on the considered state, the free energy difference reduces to the commonly used scoring scheme, or contains additional terms that depend on the sequence. In both cases, all the terms can be derived from sequence-structure frequencies in the database. Such free energy difference, commonly defined as the folding free energy, is a measure of protein stability and can be used for scoring both forward and inverted protein folding. The implications for the use of knowledge-based potentials in protein structure prediction are described. Finally, the difficulty of designing tests that could validate the proposed approach, and the inherent limitations of such tests, are discussed.
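The commonly used scoring scheme referred to in the abstract above converts database frequencies into energy-like scores by an inverse Boltzmann relation. The sketch below assumes the simplest reference state, in which sequence and structure features occur independently; the function name and the data layout are hypothetical, and the additional sequence-dependent terms discussed in the abstract are not reproduced.

```python
import math
from collections import Counter


def statistical_potential(observations, kT=1.0):
    """Score each (sequence feature, structure feature) pair as
    -kT ln[ f(a, s) / (f(a) f(s)) ], with frequencies taken from the data.

    `observations` is an iterable of (a, s) pairs, e.g. ("ALA", "helix").
    Pairs never observed are simply absent from the returned dictionary.
    """
    observations = list(observations)
    total = len(observations)
    pair_counts = Counter(observations)
    seq_counts = Counter(a for a, _ in observations)
    str_counts = Counter(s for _, s in observations)

    potential = {}
    for (a, s), n in pair_counts.items():
        observed = n / total
        # Reference state (an assumption): independent occurrence of a and s.
        expected = (seq_counts[a] / total) * (str_counts[s] / total)
        potential[(a, s)] = -kT * math.log(observed / expected)
    return potential
```

A negative score marks a pairing seen more often than the reference state predicts, a positive score one seen less often.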
Molecular dynamics simulations are used to investigate the unfolding reaction of an isolated beta-hairpin formed by residues 85 to 102 of barnase, a ribonuclease from Bacillus amyloliquefaciens. This peptide was considered following evidence from experimental studies that it may act as an initiation site for barnase folding by adopting a native-like conformation early during the folding process. Three successive molecular dynamics simulations of about 300 ps each were carried out for an all-atom model of the hairpin in water at 300 K, 450 K, and 600 K, respectively. A detailed analysis of all three simulations is presented. In particular we investigate the behavior of the backbone hydrogen bonds, and of hydrophobic interactions between side-chains, where distinction is made between contributions from native and non-native contacts, respectively. Furthermore, we investigate peptide-water interactions and monitor the presence and size of empty cavities. The behavior of the hairpin in the three simulations, when considered sequentially, describes a process whereby a native-like conformation evolves to an unfolded state. Unfolding starts at the beginning of the 450 K simulation with the loss of two hydrogen bonds at the free hairpin extremities. At about the same time, the centrally located H-bonds are weakened and exchange more frequently with water, but the turn tightens up as the beta-sheet extends into the turn region. All this is accompanied by a volume expansion and the formation of a large hydrophobic side-chain cluster promoted by both native and highly fluctuating non-native apolar contacts involving residues 87 to 90 and 95 to 99. This collapsed but more loosely packed state, essentially stabilized by hydrophobic interactions, is stable throughout the entire 450 K simulation and for about 150 ps at 600 K, after which point it proceeds rapidly to completely denatured conformations.
This behavior presents clear analogies with known features of the unfolding reaction of complete proteins. It may indicate that this beta-hairpin has a well-defined conformation on its own, which would be in agreement with its role as an initiation site for folding.
The interactions between HIV-1 protease and its bound inhibitors have been investigated by molecular mechanics calculations and by analysis of crystal structures of the complexes in order to determine general rules for inhibitor and substrate binding to the protease. Fifteen crystal structures of HIV-1 protease with different peptidomimetic inhibitors showed conservation of hydrogen bond interactions between the main chain C=O and NH groups of the inhibitors and the C=O and NH groups of the protease extending from P3 C=O to P3' NH. The mean length of the hydrogen bonds between the inhibitor and the flexible flaps and the conserved water molecule (2.9 Å) is slightly shorter than the mean length of hydrogen bonds between the inhibitor and the more rigid active site region (3.1 Å) of the protease. The two hydrogen bonds between the conserved water and P2 and P1' carbonyl oxygen atoms of the inhibitor are the shortest and are predicted to be important for the tight binding of inhibitors. Molecular mechanics analysis of three crystal structures of HIV-1 protease with different inhibitors with independent calculations using the programs Discover and Brugel gave an estimate of 56-68% for the contribution of all the inhibitor main chain atoms to the total calculated protease-inhibitor interaction energy. The contribution of individual inhibitor residues to the interaction energy was calculated using Brugel. The main chain atoms of residue P2 had a consistently large favorable contribution to the total interaction energy, probably due to the presence of the two short hydrogen bonds to the flexible flap. (ABSTRACT TRUNCATED AT 250 WORDS)
This study reports the structure of the peptide hormone oxytocin bound to its carrier protein, neurophysin I, obtained by nuclear magnetic resonance techniques. At the pH value of 2.1 in our experiments, the ligand is in fast exchange with its carrier protein, allowing the use of transfer-NOE methods. The number of distance constraints for the peptide being limited, considerable attention has been paid to an accurate distance determination. The resulting accurate distance limits were used as input for a distance geometry calculation followed by a restrained molecular dynamics run. Convergence to a well-defined family of structures for oxytocin in its bound state was reached. Both the backbone and the side-chain conformations differ between the bound form and the crystal structure of free oxytocin [Wood, S. P., et al. (1986) Science 232, 633]. These differences, as well as other structural features of the bound form, are discussed in terms of interactions made with the carrier protein. Transfer-NOE experiments at low peptide-to-protein ratios provide direct experimental evidence for contacts between the oxytocin Tyr2 residue and an aromatic residue of neurophysin. The resonance assignments of the aromatic groups [Whittaker, B. A., et al. (1985) Biochemistry 24, 2782] together with the recently published X-ray structure of the neurophysin II protein complexed with a dipeptide [Chen et al. (1991) Proc. Natl. Acad. Sci. U.S.A. 88, 4240] allow us to assign the aromatic signal on the protein to the neurophysin Phe22 residue.
We present a geometric analysis of the allosteric interface in the new Y state quaternary structure observed in liganded mutant hemoglobin Ypsilanti (beta 99 Asp-->Tyr) by Smith, F.R., Lattman, E.E., Carter, C.W., Jr. (Proteins 10: 81-91, 1991). The classical T to R quaternary structure change being a rotation of alpha beta dimers about an axis which is approximately parallel to the dimer axis of pseudosymmetry, the new quaternary structure is obtained by applying to R an additional rotation about an axis orthogonal to the first. This suggests that Y is a modified R state rather than an intermediate on the T to R pathway. Computer docking experiments designed to simulate the quaternary structure change support this suggestion.
Free energy simulation methods are used to analyse the effects of the mutation Arg-96-->His on the stability of bacteriophage T4 lysozyme and of Ile-96-->Ala on the stability of barnase. By use of thermodynamic integration, the contributions of specific interactions to the free energy change are evaluated. It is shown that a number of contributions that stabilize the wild-type or the mutant partially cancel in the overall free energy difference; some of these involve the unfolded state. Comparison of the results with conclusions based on structural and thermodynamic data leads to new insights into the origin of the stability difference between wild-type and mutant proteins. For the charged-to-charged amino acid mutation in T4 lysozyme, the importance of the contributions of more distant residues, solvent water and the covalent linkage involving the mutated amino acid are of particular interest. Also, the analysis of the Arg-96 to His mutation with respect to the interactions with the C-terminal end of a helix (residues 82-90) indicates that the nearby carbonyl groups (Tyr-88 and Asp-89) make the dominant contribution, that the amide groups do not contribute significantly and that the helix dipole model is inappropriate for this case. For the non-polar-to-non-polar amino acid mutation in barnase, the solvent contribution is unimportant, and covalent terms are shown to be significant because they do not cancel between the folded and unfolded state.
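Thermodynamic integration, named in the abstract above, obtains the free energy of an alchemical mutation by integrating the ensemble average of dH/dlambda over the coupling parameter lambda from 0 (wild-type) to 1 (mutant). The sketch below is a minimal illustration under stated assumptions: the averages are taken as already computed at a few lambda values, the trapezoidal rule is one possible quadrature, and the function names are hypothetical. The stability change then follows from a thermodynamic cycle over the folded and unfolded states.

```python
def integrate_ti(lambdas, dh_dlambda):
    """Trapezoidal estimate of the integral of <dH/dlambda> over lambda.

    lambdas     : increasing coupling-parameter values, e.g. 0.0 ... 1.0
    dh_dlambda  : ensemble averages of dH/dlambda at those values
    """
    if len(lambdas) != len(dh_dlambda) or len(lambdas) < 2:
        raise ValueError("need matching lists with at least two points")
    total = 0.0
    for i in range(1, len(lambdas)):
        width = lambdas[i] - lambdas[i - 1]
        total += 0.5 * width * (dh_dlambda[i] + dh_dlambda[i - 1])
    return total


def unfolding_ddg(dg_mutation_folded, dg_mutation_unfolded):
    """Change in unfolding free energy (mutant minus wild-type).

    From the cycle: ddG = dG(wt -> mut, unfolded) - dG(wt -> mut, folded).
    A negative value means the mutant is less stable than the wild-type.
    """
    return dg_mutation_unfolded - dg_mutation_folded
```

Because the integrand is a sum over energy terms, the same integral can be evaluated term by term, which is how contributions of specific interactions can be separated.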
Molecular dynamics simulations have been used to compute the difference in the unfolding free energy between wild-type barnase and the mutant in which Ile-96 is replaced by alanine. The simulations yield results (-3.42 and -5.21 kcal/mol) that compare favorably with experimental values (-3.3 and -4.0 kcal/mol). The major contributions to the free energy difference arise from bonding terms involving degrees of freedom of the mutated side chain and from nonbonded interactions of that side chain with its environment in the folded protein. By comparison with simulations of an extended peptide in the absence of solvent, used as a reference state, hydration effects are shown to play a minor role in the overall free energy balance for the Ile-->Ala transformation. The implications of these results for our understanding of the hydrophobic effect and its contribution to protein stability are discussed.
A system is described that provides ways of integrating data on protein structure, sequence, and survey results, with molecular graphics and molecular mechanics software. Its major component is the relational database SESAM, presently implemented under the commercial package SYBASE. By design, the database allows full integration--within the same data organization--of raw data on protein structure, sequence, ligands, and heterogroups, obtained from the Brookhaven Protein Databank, with pure sequence information available from other databanks such as SWISS-PROT. It contains in addition higher level descriptions of structural and topological properties, as well as survey results, obtained by executing specialized computer programs. Aside from the very useful attribute of closely combining structural and nonstructural information, other important features distinguish it from analogous systems developed elsewhere. It includes a molecular dictionary with complete description of geometric properties and energy parameters used in modeling and conformational energy calculations. Using this dictionary, structural data are validated by checking for localized inconsistencies in atomic coordinates, atomic symbols, chirality definitions, and flagging errors and incomplete entries. Because of both the dictionary and the validation procedures, SESAM can be readily interfaced with conventional molecular graphics and mechanics software packages, or with other specialized application programs. With the aid of appropriate interfaces, data access is sufficiently fast for SESAM to be interrogated interactively. Prototypes of user interfaces, as well as an interface with the molecular graphics package BRUGEL, are described and the power of the system is illustrated in applications such as homology-based protein modeling, computer-aided protein design, protein structure predictions, analysis of local structure motifs, and of relationships between protein sequence and structure.
Artemia has a complex extracellular hemoglobin of Mr 260,000 comprising two globin chains (Mr 130,000) each of which is a polymer of eight covalently linked domains of Mr 16,000. The primary structure of this polymeric globin was studied to understand how globin folded domains are ordered within a globin chain and, in turn, how the latter associate into a functional hemoglobin molecule. Here we report the amino acid sequence of a second domain, E7 (Mr 16,081, excluding the heme), and interpretations of sequence data by computer-assisted alignment and modeling. This clearly shows that, as with domain E1 (Moens, L., Van Hauwaert, M.-L., De Smet, K., Geelen, D., Verpooten, G., Van Beeumen, J., Wodak, S., Alard, P., & Trotman, C. (1988) J. Biol. Chem. 263, 4679-4685), domain E7 is compatible with a globin folded structure of the beta-type chain. Several specific differences of domains E7 and E1 from the classic globins are identified. They possibly can be interpreted in terms of specific requirements for a double octameric functional molecule.
The calculation of induced dipole moments and of their contribution to electrostatic effects in proteins is implemented following the approach of Warshel. Isotropic polarizabilities are assigned to individual atoms, and the resulting deviation from pairwise interactions is treated by a self-consistent iterative procedure. We give a detailed description of how the formalism is implemented in molecular mechanics and molecular dynamics simulation procedures, and report results based on calculations performed on crystal structures of crambin, liver alcohol dehydrogenase and ribonuclease T1. We focus our analysis on evaluating the contribution of polarizability of the protein matrix to electrostatic energies, local fields, to dipole moments of peptide groups and of secondary structure elements in the polypeptide chain. Our calculations confirm that induced dipole moments in proteins provide important stabilizing contributions to electrostatic energies, and that these contributions cannot be mimicked by the usual approximations where either a continuum dielectric constant, or a distance-dependent dielectric function is used. We find that induced protein dipoles appreciably affect the magnitude and direction of local electrostatic fields in a manner that is strongly influenced by the microscopic environment in the protein. Most strongly affected are fields in charged groups that are involved in close interactions with other charged groups, while the influence on local fields of aliphatic groups is marginal. We find, moreover, that induction effects from surrounding protein atoms tend on average to increase peptide dipoles and helix macro-dipoles by about 16%, again reflecting electrostatic stabilization by the protein matrix, and show that (at least in the alpha/beta domain of alcohol dehydrogenase) the contribution of side-chains to this stabilization is significant.
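The self-consistent iterative treatment of induced dipoles described in the abstract above can be sketched with a generic point-dipole model: each atom carries an isotropic polarizability, and its induced dipole is the polarizability times the total field, which itself includes the field of all other induced dipoles. The sketch assumes Gaussian-style units, an undamped dipole field and a simple fixed-point iteration; the function names are hypothetical and the details of the published implementation may differ.

```python
import math


def field_of_dipole(mu, r_vec):
    """Field at displacement r_vec from a point dipole mu:
    E = 3 (mu . r) r / r^5 - mu / r^3."""
    r = math.sqrt(sum(c * c for c in r_vec))
    mu_dot_r = sum(m * c for m, c in zip(mu, r_vec))
    return [3.0 * mu_dot_r * c / r**5 - m / r**3 for m, c in zip(mu, r_vec)]


def induced_dipoles(positions, alphas, permanent_field, tol=1e-8, max_iter=200):
    """Iterate mu_i = alpha_i * (E0_i + sum_j E_ij(mu_j)) to self-consistency.

    positions        : list of (x, y, z) atom coordinates
    alphas           : isotropic polarizability of each atom
    permanent_field  : field E0_i at each atom from the permanent charges
    """
    n = len(positions)
    # First guess: dipoles induced by the permanent field alone.
    mu = [[alphas[i] * e for e in permanent_field[i]] for i in range(n)]
    for _ in range(max_iter):
        new_mu = []
        for i in range(n):
            total = list(permanent_field[i])
            for j in range(n):
                if i == j:
                    continue
                r_vec = [pi - pj for pi, pj in zip(positions[i], positions[j])]
                contribution = field_of_dipole(mu[j], r_vec)
                total = [t + c for t, c in zip(total, contribution)]
            new_mu.append([alphas[i] * t for t in total])
        change = max(abs(a - b)
                     for row_new, row_old in zip(new_mu, mu)
                     for a, b in zip(row_new, row_old))
        mu = new_mu
        if change < tol:
            return mu
    raise RuntimeError("induced dipoles did not converge")
```

For two identical atoms a distance d apart in a uniform field E0 along the line joining them, the iteration converges to the analytic result mu = alpha E0 / (1 - 2 alpha / d^3), which shows how mutual induction enhances the dipoles beyond the pairwise value alpha E0.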
Isolated insulin monomers, the dimer and higher aggregates from the 2 Zn crystal structure are subjected to convergent energy minimization in Cartesian co-ordinates using a force-field that includes the position of all hydrogen atoms. The minimizations are found, for the first time, to produce conformational changes of appreciable magnitude, which agree well with observed structural differences between monomers in the 2 Zn crystal and with the mechanism proposed previously for the coupling between deformations in different parts of the molecule. Our results also suggest that insulin would tend to adopt a molecule 1-like conformation in the absence of crystal packing forces, and that dimer formation is not at the origin of the observed asymmetry in the 2 Zn crystal.