linked to PubMed where applicable.
Increasingly complex schemes for representing solvent effects in an implicit fashion are being used in computational analyses of biological macromolecules. These schemes speed up the calculations by orders of magnitude and are assumed to compromise little on essential features of the solvation phenomenon. In this work we examine this assumption. Four implicit solvation models, a surface area-based empirical model, two models that approximate the generalized Born treatment and a finite difference Poisson-Boltzmann method, are challenged in situations differing from those where these models were calibrated. These situations are encountered in automatic protein design procedures, whose job is to select, from a large number of alternatives, sequences that stabilize a given protein 3D structure. To this end we evaluate the energetic cost of burying amino acids in thousands of environments with different solvent exposures belonging, respectively, to decoys built with random sequences and to native protein crystal structures. In addition we perform actual sequence design calculations. Except for the crudest surface area-based procedure, all the tested models tend to favor the burial of polar amino acids in the protein interior over nonpolar ones, a behavior that leads to poor performance in protein design calculations. We show, on the other hand, that three of the examined models are nonetheless capable of discriminating between the native fold and many nonnative alternatives, a test commonly used to validate force fields. It is concluded that protein design is a particularly challenging test for implicit solvation models because it requires accurate estimates of the solvation contribution of individual residues. This contrasts with native recognition, which depends less on solvation and more on other nonbonded contributions.
In recent years a large body of data has been obtained from Nuclear Magnetic Resonance and Circular Dichroism experiments on the influence of the amino acid sequence and various other parameters on the conformational state of peptides in solution. Interpreting the experimental data in terms of the conformational populations of the peptides remains a key problem, for which current solutions leave appreciable room for improvement. Considering that making this body of data available for surveys and analysis should be instrumental in tackling the problem, we undertook the development of Pescador: The 'PEptides in Solution ConformAtion Database: Online Resource'. Pescador contains data from NMR and CD spectroscopy on peptides in solution as well as information on the structural parameters derived from these data. It also features specialized Web-based tools for data deposition, and means for readily accessing the stored information for analysis purposes. To illustrate the use of the database in deriving information for the conformational analysis of peptides, we show how the alpha proton delta-values stored in Pescador and measured by NMR for different peptides in different laboratories can be used to derive a new set of 'random coil' chemical shift values. Firstly, we show these values to be very similar to those obtained experimentally for model peptides in water, and their variation with increasing Tri-Fluoro-Ethanol (TFE) concentration is similar to that reported for model peptides. We show, furthermore, that the chemical shift data in Pescador can be used to derive correction factors that take into account effects of neighboring residues. These correction factors compare favorably with those recently derived from a series of model GGXGG peptides (Schwarzinger et al., 2001). 
These encouraging results suggest that, as the quantity of NMR data on peptides deposited in Pescador increases, surveys of these data should be a valuable means of deriving key parameters for the analysis of peptide conformation.
A set of conserved water positions making direct contacts with the alpha1 and alpha2 domains of the MHC class-I protein was identified by a cluster analysis in 12 high-resolution crystal structures of proteins from different allele types and different species, comprising human, mouse and rat. The analysis revealed a total of 63 clusters, corresponding to water molecules whose positions are conserved in half or more of the analyzed structures. Analysis of these clusters shows that the most conserved water positions, those appearing in the largest fraction of the structures, were also the most accurately defined, as measured by their normalized crystallographic B-factor. Not surprisingly, these positions displayed better overlap and formed more H-bonds with the protein. In a second part of this work, a detailed analysis is presented of three of the most conserved water positions, and their putative structural and functional roles are discussed. The most highly conserved of the three appears to play an important role in stabilizing the conformation of a twisted beta-turn between residues 118 and 122 (numbering of HLA-B3501, PDB code 1A1N). An equivalent water molecule was found to be associated with a similar beta-turn in 43 unrelated structures surveyed in the PDB, leading to the suggestion that this water molecule plays an important structural role in this type of turn. The second water molecule makes hydrogen bonds with residues lining pocket B in the peptide-binding groove and is suggested to play a role in modulating peptide recognition. The third highly conserved water molecule is located at the first kink of the alpha2 helix, possibly playing a role in determining the position of the N-terminal segment of that helix, which also carries side chains in contact with the bound peptide.
This information on conserved water positions in MHC class-I molecules should be helpful in modeling interactions with bound peptide antigens and in designing new peptides with tailor-made affinities.
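The kind of cluster analysis described in this abstract can be illustrated with a minimal sketch. It assumes the structures have already been superimposed, groups water positions by single-linkage clustering, and keeps clusters populated in at least half of the structures. The function names, the 1.0 Å cutoff and the single-linkage rule are illustrative assumptions, not the procedure reported by the authors.

```python
import math

def cluster_waters(waters_by_structure, cutoff=1.0):
    """Single-linkage clustering of water positions (hypothetical cutoff in angstroms).

    waters_by_structure maps a structure id to a list of (x, y, z) water
    coordinates, assumed to be expressed in a common, superimposed frame.
    """
    points = [(sid, xyz) for sid, waters in waters_by_structure.items()
              for xyz in waters]
    parent = list(range(len(points)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Link every pair of waters closer than the cutoff.
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i][1], points[j][1]) <= cutoff:
                parent[find(i)] = find(j)

    clusters = {}
    for i, point in enumerate(points):
        clusters.setdefault(find(i), []).append(point)
    return list(clusters.values())

def conserved_clusters(clusters, n_structures, min_fraction=0.5):
    """Keep clusters that contain waters from at least min_fraction of the structures."""
    kept = []
    for members in clusters:
        structure_ids = {sid for sid, _ in members}
        if len(structure_ids) / n_structures >= min_fraction:
            kept.append(members)
    return kept
```

The pairwise loop is quadratic in the number of waters, which is adequate for a dozen structures; a spatial grid would be the natural replacement for larger surveys.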
Barnase, an extracellular endoribonuclease from Bacillus amyloliquefaciens, hydrolyses single-stranded RNA. Its very low catalytic activity toward GpN dinucleotides, where N stands for any nucleoside, is markedly increased when a phosphate is added to the 3'-end, as in GpNp. Here we investigate the conformational properties of GpA and GpAp in solution, in order to determine whether differences in these properties may be related to the changes in enzymatic activity. Two independent 1.3 ns molecular dynamics trajectories are generated for each dinucleotide in the presence of explicit water molecules and counter ions. These trajectories are analysed by monitoring molecular properties, such as the solvent accessible surface area, the distance and orientation between the bases, the behaviour of torsion angles and the formation of intramolecular H-bonds. To identify relevant correlations between these parameters, statistical techniques comprising multiple regression, clustering and discriminant analysis are used. Results show that GpA has a significant propensity to form folded conformations (approximately 50%), fostered by a small number of intramolecular H-bonds, whereas GpAp remains essentially extended. The latter behaviour seems to be due to an H-bond between the terminal phosphate and adenosine ribose group, which restricts rotation about the adenine Agamma angle. We also find that GpA folding is induced by a concerted motion of specific torsion angles, which is closely coupled to the formation of a network of flexible hydrogen bonds. Finally, on the basis of an expression for barnase KM, which incorporates the folded/extended conformational equilibria of the dinucleotide substrates, it is argued that our findings on the differences between these equilibria can qualitatively rationalize the experimentally measured differences in enzymatic properties. Copyright 1998 Academic Press.
BACKGROUND: The classical picture of the hydrophobic stabilization of proteins invokes a resemblance between the protein interior and nonpolar solvents, but the extent to which this is the case has often been questioned. The protein interior is believed to be at least as tightly packed as organic crystals, and was shown to have very low compressibility. There is also evidence that these properties are not uniform throughout the protein, and conflicting views exist on the nature of sidechain packing and on its influence on the properties of the protein. RESULTS: In order to probe the physical properties of the protein, the free energy associated with the formation of empty cavities has been evaluated for two proteins: barnase and T4 lysozyme. To this end, the likelihood of encountering such cavities was computed from room temperature molecular dynamics trajectories of these proteins in water. The free energy was evaluated in each protein taken as a whole and in submolecular regions. The computed free energies yielded information on the manner in which empty space is distributed in the system, while the latter undergoes thermal motion, a property hitherto not analyzed in heterogeneous media such as proteins. Our results showed that the free energy of cavity formation is higher in proteins than in both water and hexane, providing direct evidence that the native protein medium differs in fundamental ways from the two liquids. Furthermore, although the packing density was found to be higher in nonpolar regions of the protein than in polar ones, the free energy cost of forming atomic size cavities is significantly lower in nonpolar regions, implying that these regions contain larger chunks of empty space, thereby increasing the likelihood of containing atomic size packing defects. These larger empty spaces occur preferentially where buried hydrophobic sidechains belonging to secondary structures meet one another. 
These particular locations also appear to be more compressible than other parts of the core or surface of the protein. CONCLUSIONS: The cavity free energy calculations described here provide a much more detailed physical picture of the protein matrix than volume and packing calculations. According to this picture, the packing of hydrophobic sidechains is tight in the interior of the protein, but far from uniform. In particular, the packing is tighter in regions where the backbone forms less regular hydrogen-bonding interactions than at interfaces between secondary structure elements, where such interactions are fully developed. This may have important implications for the role of sidechain packing in protein folding and stability.
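The link between the likelihood of finding an empty cavity and the free energy of forming it can be sketched as follows. The sketch assumes the standard relation ΔG = -kT ln P, where P is the fraction of random probe positions that find no atom centre within a given radius. The function names, the rectangular sampling region and the definition of a cavity by atom centres alone are illustrative assumptions, not details taken from the paper.

```python
import math
import random

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K)

def cavity_probability(frames, box_min, box_max, radius, n_trials=1000, seed=0):
    """Fraction of random probe points with no atom centre closer than radius.

    frames is a list of snapshots, each a list of (x, y, z) atom coordinates.
    Probe points are drawn uniformly in the box [box_min, box_max]
    (an assumed, user-chosen sampling region).
    """
    rng = random.Random(seed)
    empty = 0
    total = 0
    for atoms in frames:
        for _ in range(n_trials):
            probe = tuple(rng.uniform(lo, hi) for lo, hi in zip(box_min, box_max))
            if all(math.dist(probe, atom) >= radius for atom in atoms):
                empty += 1
            total += 1
    return empty / total

def cavity_free_energy(probability, temperature=300.0):
    """Free energy of cavity formation, -kT ln P, in kcal/mol."""
    if probability <= 0.0:
        return math.inf  # no cavity was ever observed in the sample
    return -K_B * temperature * math.log(probability)
```

Restricting the sampling box to a submolecular region gives a regional estimate in the same way; rare, large cavities need many more trial insertions before P becomes statistically meaningful.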
Free energy simulation methods are used to analyse the effects of the mutation Arg-96→His on the stability of bacteriophage T4 lysozyme and of Ile-96→Ala on the stability of barnase. By use of thermodynamic integration, the contributions of specific interactions to the free energy change are evaluated. It is shown that a number of contributions that stabilize the wild-type or the mutant partially cancel in the overall free energy difference; some of these involve the unfolded state. Comparison of the results with conclusions based on structural and thermodynamic data leads to new insights into the origin of the stability difference between wild-type and mutant proteins. For the charged-to-charged amino acid mutation in T4 lysozyme, the contributions of more distant residues, solvent water and the covalent linkage involving the mutated amino acid are of particular interest. Also, the analysis of the Arg-96 to His mutation with respect to the interactions with the C-terminal end of a helix (residues 82-90) indicates that the nearby carbonyl groups (Tyr-88 and Asp-89) make the dominant contribution, that the amide groups do not contribute significantly and that the helix dipole model is inappropriate for this case. For the non-polar-to-non-polar amino acid mutation in barnase, the solvent contribution is unimportant, and covalent terms are shown to be significant because they do not cancel between the folded and unfolded state.
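Thermodynamic integration obtains the free energy change as the integral of ⟨∂U/∂λ⟩ over a coupling parameter λ running from 0 to 1, and because the derivative is a sum over energy terms, each term can be integrated separately. The sketch below illustrates this on a toy system of two independent harmonic springs, for which ⟨∂U/∂λ⟩ is known analytically, so no simulation is needed. The toy system, the function names and the number of λ points are assumptions made for illustration only.

```python
import math

KT = 0.596  # kcal/mol, roughly k_B * T at 300 K

def trapezoid(xs, ys):
    """Trapezoidal-rule integral of samples ys taken at positions xs."""
    return sum(0.5 * (ys[k] + ys[k + 1]) * (xs[k + 1] - xs[k])
               for k in range(len(xs) - 1))

def ti_free_energy(lambdas, mean_dudl_by_term):
    """Integrate <dU/dlambda> per energy term; return (total, per-term dict)."""
    per_term = {name: trapezoid(lambdas, values)
                for name, values in mean_dudl_by_term.items()}
    return sum(per_term.values()), per_term

def spring_mean_dudl(k0, k1, lam, kt=KT):
    """Analytic <dU/dlambda> for U = 0.5 * k(lam) * x**2 with k linear in lam.

    dU/dlambda = 0.5 * (k1 - k0) * x**2 and <x**2> = kT / k(lam).
    """
    k = (1.0 - lam) * k0 + lam * k1
    return 0.5 * (k1 - k0) * kt / k

lambdas = [i / 100 for i in range(101)]
terms = {
    "spring_a": [spring_mean_dudl(1.0, 4.0, lam) for lam in lambdas],  # stiffens
    "spring_b": [spring_mean_dudl(2.0, 1.0, lam) for lam in lambdas],  # softens
}
total, per_term = ti_free_energy(lambdas, terms)

# Exact result for comparison: 0.5 * kT * ln(k1 / k0) for each spring.
exact = 0.5 * KT * (math.log(4.0 / 1.0) + math.log(1.0 / 2.0))
print(total, exact, per_term)
```

In this toy case the two contributions have opposite signs and partially cancel in the total. In a real system the individual components, unlike the total, generally depend on the integration path chosen.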
Using the newly available refined co-ordinates of deoxy and oxyhaemoglobin, we have re-examined and compared the interfaces between the dimers alpha 1 beta 1 and alpha 2 beta 2. The most extensive monomer-monomer contacts are between alpha 1 and beta 2, and, symmetrically, alpha 2 and beta 1. In oxyhaemoglobin these interfaces bury 700 Å² less protein surface than in deoxyhaemoglobin. The alpha 1 alpha 2 interface involves similar salt bridges in both forms, but in oxyhaemoglobin buries 240 Å² more surface than in deoxyhaemoglobin. There is a loosely packed beta 1 beta 2 interface burying 320 Å² of surface in oxyhaemoglobin; there is no beta 1 beta 2 interface in deoxyhaemoglobin. The greater stability of the deoxy form, in the absence of ligands, can be attributed to a combination of hydrophobic, van der Waals' and electrostatic interactions.
We propose an analytical substitute for the geometrical construction that is commonly used in calculating the protein surface area that is accessible to the solvent. A statistical approach leads to an expression of accessible surface areas as a function of distances between pairs of atoms or of residues in the protein structure, assuming only that these atoms or residues are randomly distributed in space but do not penetrate each other. This function gives good estimates of the accessible surface area and of the area buried in subunit contacts for a number of proteins. Its evaluation is very fast, and the function can be differentiated, which opens the way to new applications of accessibility measurements in the study of proteins. As an example, we show that the presence of domains is easily detected by an automatic procedure based on surface areas only.
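One way such a pairwise, distance-based estimate can be built is sketched below. It computes the exact spherical-cap area that each neighbour removes from an atom's solvent-expanded sphere, then treats the overlaps from different neighbours as statistically independent, so the exposed fractions multiply. The independence product, the 1.4 Å probe radius and the function names are assumptions made for illustration and may differ from the expression derived in the paper.

```python
import math

def buried_cap_area(r_i, r_j, d):
    """Area of sphere i (radius r_i) lying inside sphere j (radius r_j).

    d is the distance between the two centres; all lengths in angstroms.
    """
    s_i = 4.0 * math.pi * r_i ** 2
    if d >= r_i + r_j:
        return 0.0   # spheres do not overlap
    if d <= r_j - r_i:
        return s_i   # sphere i lies entirely inside sphere j
    if d <= r_i - r_j:
        return 0.0   # sphere j lies inside sphere i and cuts none of its surface
    # Spherical cap cut from sphere i by sphere j.
    return math.pi * r_i * (r_i + r_j - d) * (1.0 + (r_j - r_i) / d)

def approx_accessible_areas(coords, radii, probe=1.4):
    """Approximate accessible surface area per atom from pairwise distances only.

    Each neighbour j leaves a fraction (1 - b_ij / S_i) of sphere i exposed;
    these fractions are multiplied as if the overlaps were independent.
    """
    areas = []
    for i, (c_i, rad_i) in enumerate(zip(coords, radii)):
        r_i = rad_i + probe
        s_i = 4.0 * math.pi * r_i ** 2
        exposed_fraction = 1.0
        for j, (c_j, rad_j) in enumerate(zip(coords, radii)):
            if i == j:
                continue
            b_ij = buried_cap_area(r_i, rad_j + probe, math.dist(c_i, c_j))
            exposed_fraction *= 1.0 - b_ij / s_i
        areas.append(s_i * exposed_fraction)
    return areas
```

For a single pair of atoms the product reduces to the exact result; with many neighbours it is only an estimate. Away from the contact and engulfment distances, where the cap formula switches branches, each factor is a smooth function of the interatomic distance, which is what makes an expression of this kind differentiable there.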