Non-bonded atoms (greater than two bonds apart) interact through van der Waals attraction, steric repulsion, and electrostatic attraction/repulsion. These properties are easiest to describe mathematically when atoms are considered as spheres of characteristic radii.
The object of molecular mechanics is to predict the energy associated with a given conformation of a molecule. However, molecular mechanics energies have no meaning as absolute quantities. Only differences in energy between two or more conformations have meaning. A simple molecular mechanics energy equation is given by: Energy = Stretching Energy + Bending Energy + Torsion Energy + Non-Bonded Interaction Energy
These equations together with the data (parameters) required to describe the behavior of different kinds of atoms and bonds, is called a force-field. Many different kinds of force-fields have been developed over the years. Some include additional energy terms that describe other kinds of deformations. Some force-fields account for coupling between bending and stretching in adjacent bonds in order to improve the accuracy of the mechanical model.
The mathematical form of the energy terms varies from force-field to force-field. The more common forms will be described.
The stretching energy equation is based on Hooke's law. The "k b " parameter controls the stiffness of the bond spring, while "r o " defines its equilibrium length. Unique "k b " and "r o " parameters are assigned to each pair of bonded atoms based on their types (e.g. C-C, C-H, O-C, etc.). This equation estimates the energy associated with vibration about the equilibrium bond length. This is the equation of a parabola, as can be seen in the following plot:
Notice that the model tends to break down as a bond is stretched toward the point of dissociation.
The bending energy equation is also based on Hooke's law. The "k theta " parameter controls the stiffness of the angle spring, while "theta o " defines its equilibrium angle. This equation estimates the energy associated with vibration about the equilibrium bond angle:
Unique parameters for angle bending are assigned to each bonded triplet of atoms based on their types (e.g. C-C-C, C-O-C, C-C-H, etc.). The effect of the "k b " and "k theta " parameters is to broaden or steepen the slope of the parabola. The larger the value of "k", the more energy is required to deform an angle (or bond) from its equilibrium value. Shallow potentials are achieved for "k" values between 0.0 and 1.0. The Hookeian potential is shown in the following plot for three values of "k":
The torsion energy is modeled by a simple periodic function, as can be seen in the following plot:
The torsion energy in molecular mechanics is primarily used to correct the remaining energy terms rather than to represent a physical process. The torsional energy represents the amount of energy that must be added to or subtracted from the Stretching Energy + Bending Energy + Non-Bonded Interaction Energy terms to make the total energy agree with experiment or rigorous quantum mechanical calculation for a model dihedral angle (ethane, for example might be used a a model for any H-C-C-H bond).
The "A" parameter controls the amplitude of the curve, the n parameter controls its periodicity, and "phi" shifts the entire curve along the rotation angle axis (tau). The parameters are determined from curve fitting. Unique parameters for torsional rotation are assigned to each bonded quartet of atoms based on their types (e.g. C-C-C-C, C-O-C-N, H-C-C-H, etc.). Torsion potentials with three combinations of "A", "n", and "phi" are shown in the following plot:
Notice that "n" reflects the type symmetry in the dihedral angle. A CH3-CH3 bond, for example, ought to repeat its energy every 120 degrees. The cis conformation of a dihedral angle is assumed to be the zero torsional angle by convention. The parameter phi can be used to synchronize the torsional potential to the initial rotameric state of the molecule whose energy is being computed.
The non-bonded energy accounts for repulsion, van der Waals attraction, and electrostatic interactions. van der Waals attraction occurs at short range, and rapidly dies off as the interacting atoms move apart by a few Angstroms. Repulsion occurs when the distance between interacting atoms becomes even slightly less than the sum of their contact radii. Repulsion is modeled by an equation that is designed to rapidly blow up at close distances (1/r^12 dependency). The energy term that describes attraction/repulsion provides for a smooth transition between these two regimes. These effects are often modeled using a 6-12 equation, as shown in the following plot:
The "A" and "B" parameters control the depth and position (interatomic distance) of the potential energy well for a given pair of non-bonded interacting atoms (e.g. C:C, O:C, O:H, etc.). In effect, "A" determines the degree of "stickiness" of the van der Waals attraction and "B" determines the degree of "hardness" of the atoms (e.g marshmallow-like, billiard ball-like, etc.).
The "A" parameter can be obtained from atomic polarizability measurements, or it can be calculated quantum mechanically. The "B" parameter is typically derived from crystallographic data so as to reproduce observed average contact distances between different kinds of atoms in crystals of various molecules.
The electrostatic contribution is modeled using a Coulombic potential. The electrostatic energy is a function of the charge on the non-bonded atoms, their interatomic distance, and a molecular dielectric expression that accounts for the attenuation of electrostatic interaction by the environment (e.g. solvent or the molecule itself). Often, the molecular dielectric is set to a constant value between 1.0 and 5.0. A linearly varying distance-dependent dielectric (i.e. 1/r) is sometimes used to account for the increase in environmental bulk as the separation distance between interacting atoms increases.
Partial atomic charges can be calculated for small molecules using an ab initio or semiempirical quantum technique (usually MOPAC or AMPAC). Some programs assign charges using rules or templates, especially for macromolecules. In some force-fields, the torsional potential is calibrated to a particular charge calculation method (rarely made known to the user). Use of a different method can invalidate the force-field consistency.
Richard Feynman, recipient of the 1965 Nobel Prize in Physics, once famously stated: 'If we were to name the most powerful assumption of all, which leads one on and on in an attempt to understand life, it is that all things are made of atoms, and that everything that living things do can be understood in terms of the jigglings and wigglings of atoms.' Much of the biophysics of the last 50 years has been dedicated to better understanding the nature of this atomic jiggling and wiggling. The quantum-mechanical laws governing motions in the microscopic world are surprisingly foreign to those familiar with macroscopic dynamics. Motions are governed not by deterministic laws, but by probability functions chemical bonds are formed not mechanically, but by shifting clouds of electrons that are simultaneously waves and particles. As Feynman eloquently put it, this is 'nature as she is - absurd' .
Understanding these absurd molecular motions is undoubtedly germane to drug discovery. The initial 'lock-and-key' theory of ligand binding , in which a frozen, motionless receptor was thought to accommodate a small molecule without undergoing any conformational rearrangements, has been largely abandoned in favor of binding models that account not only for conformational changes, but also for the random jiggling of receptors and ligands [3–7].
The mollusk acetylcholine binding protein (AChBP), a structural and functional surrogate of the human nicotinic acetylcholine receptor (AChR) ligand-binding domain [8–11], is one of countless examples that illustrate the importance of accounting for these atomistic motions (Figure 1). In crystallographic structures of small-molecule AChR agonists bound to AChBP, a key loop (loop C) partially closes around the ligand (Figure 1a,c). In contrast, crystal structures of large AChR antagonists like snake α-neurotoxins bound to AChBP reveal that this same loop is displaced by as much as 10 Å, producing an active site that is far more open (Figure 1b,c) . Bourne et al.  proposed that the unbound AChBP and AChR are highly dynamic proteins that, in the absence of a ligand, sample many conformational states, both open and closed, that are selectively stabilized by the binding of agonists and antagonists. Any one of these binding-pocket conformations may be druggable and therefore pharmacologically relevant. If this general model of ligand binding is correct, the implications for structure-based drug design are profound, as the same principle of binding likely applies to many other systems as well.
The different conformations of the acetylcholine binding protein from Lymnaea stagnalis. Portions of the protein have been removed to facilitate visualization. (a) The protein in a closed conformation with nicotine bound (PDB ID: 1UW6), shown in blue. (b) The protein in an open conformation (PDB ID: 1YI5) with the same nicotine conformation superimposed on the structure, shown in pink. (c) Ribbon representations of the two conformations.
Work in the Karplus group is supported in part by a grant from the NIH. Work in the McCammon group is supported in part by the NSF, NIH, the W.M. Keck Foundation, the National Biomedical Computation Resource and the HHMI. As is evident from the references, our co-workers have contributed much of the work summarized here. We apologize to the large number of scientists whose important contributions to molecular dynamics simulations could not be cited because of space limitations on this short review.
Experimental studies involving the carcinogenic aromatic amine 2-(acetylamino)fluorene (AAF) have afforded two acetylated DNA adducts, the major one bound to C8 of guanine and a minor adduct bound to N 2 of guanine. The minor adduct may be important in carcinogenesis because it persists, while the major adduct is rapidly repaired. Primer extension studies of the minor adduct have indicated that it blocks DNA synthesis, with some bypass and misincorporation of adenine opposite the lesion [Shibutani, S., and Grollman, A. P. (1993) Chem. Res. Toxicol.6, 819−824]. No experimental structural information is available for this adduct. Extensive minimized potential energy searches involving thousands of trials and molecular dynamics simulations were used to study the conformation of this adduct in three sequences: I, d(C1-G2-C3-[AAF]G4-C5-G6-C7)·d(G8-C9-G10-C11-G12-C13-G14) II, the sequence of Shibutani and Grollman, d(C1-T2-A3-[AAF]G4-T5-C6-A7)·d(T8-G9-A10-C11-T12-A13-G14) and III, which is the same as II but with a mismatched adenine in position 11, opposite the lesion. AAF was located in the minor groove in the low-energy structures of all sequences. In the lowest energy form of the C3-[AAF]G4-C5 sequence I, the fluorenyl rings point in the 3‘ direction along the modified strand and the acetyl in the 5‘ direction. These orientations are reversed in the second lowest energy structure of this sequence, and the energy of this structure is 1.4 kcal/mol higher. Watson−Crick hydrogen bonding is intact in both structures. In the two lowest energy structures of the A3-[AAF]G4-T5 sequence II, the AAF is also located in the minor groove with Watson−Crick hydrogen bonding intact. However, in the lowest energy form, the fluorenyl rings point in the 5‘ direction and the acetyl in the 3‘ direction. The energy of the structure with opposite orientation is 5.1 kcal/mol higher. In sequence III with adenine mismatched to the modified guanine, the lowest energy form also had the fluorenyl rings oriented 5‘ in the minor groove with intact Watson−Crick base pairing. However, the mispaired adenine adopts a syn orientation with Hoogsteen pairing to the modified guanine. These results suggest that the orientation of the AAF in the minor groove may be DNA sequence dependent. Mobile aspects of favored structures derived from molecular dynamics simulations with explicit solvent and salt support the essentially undistorting nature of this lesion, which is in harmony with its persistence in mammalian systems.
Department of Biology, New York University.
Department of Chemistry, New York University.
Oak Ridge National Laboratory.
Corresponding author: tel, (212) 998-8231 fax, (212) 995-4015 e-mail, [email protected]
- Molecular Medicine
- Drug Discovery
- Organic Chemistry
Research output : Contribution to journal › Article › peer-review
T1 - Identifying the Molecular Mechanics and Binding Dynamics Characteristics of Potent Inhibitors to HIV-1 Protease
N1 - Copyright: Copyright 2013 Elsevier B.V., All rights reserved.
N2 - Human immunodeficiency virus type 1 protease (HIV-1 PR) is one of the primary inhibition targets for chemotherapy of AIDS because of its critical role in the replication cycle of the HIV. In this work, a combinatory coarse-grained and atomistic simulation method was developed for dissecting molecular mechanisms and binding process of inhibitors to the active site of HIV-1 PR, in which 35 typical inhibitors were trialed. We found that the molecular size and stiffness of the inhibitors and the binding energy between the inhibitors and PR play important roles in regulating the binding process. Comparatively, the smaller and more flexible inhibitors have larger binding energy and higher binding rates they even bind into PR without opening the flaps. In contrast, the larger and stiffer inhibitors have lower binding energy and lower binding rate, and their binding is subjected to the opening and gating of the PR flaps. Furthermore, the components of binding free energy were quantified and analyzed by their dependence on the molecular size, structures, and hydrogen bond networks of inhibitors. Our results also deduce significant dynamics descriptors for determining the quantitative structure and property relationship in potent drug ligands for HIV-1 PR inhibition. Correlation between the association rate constant and the radius of gyration of inhibitors. The dash black lines is the trend line and the dash red circle highlights the confined region of FDA-approved inhibitors (solid pink-color dots). The inset shows the interaction between the inhibitor and the PR during the binding process.
AB - Human immunodeficiency virus type 1 protease (HIV-1 PR) is one of the primary inhibition targets for chemotherapy of AIDS because of its critical role in the replication cycle of the HIV. In this work, a combinatory coarse-grained and atomistic simulation method was developed for dissecting molecular mechanisms and binding process of inhibitors to the active site of HIV-1 PR, in which 35 typical inhibitors were trialed. We found that the molecular size and stiffness of the inhibitors and the binding energy between the inhibitors and PR play important roles in regulating the binding process. Comparatively, the smaller and more flexible inhibitors have larger binding energy and higher binding rates they even bind into PR without opening the flaps. In contrast, the larger and stiffer inhibitors have lower binding energy and lower binding rate, and their binding is subjected to the opening and gating of the PR flaps. Furthermore, the components of binding free energy were quantified and analyzed by their dependence on the molecular size, structures, and hydrogen bond networks of inhibitors. Our results also deduce significant dynamics descriptors for determining the quantitative structure and property relationship in potent drug ligands for HIV-1 PR inhibition. Correlation between the association rate constant and the radius of gyration of inhibitors. The dash black lines is the trend line and the dash red circle highlights the confined region of FDA-approved inhibitors (solid pink-color dots). The inset shows the interaction between the inhibitor and the PR during the binding process.
Materials and Methods
HIV-1 PR structures and the PR–inhibitor complexes
For the HIV-1 PR structures and PR–inhibitor complexes, we have carefully examined Protein Data Bank (PDB) and Drug Bank ( 35-38 ) for the PR structures comprising chemically diverse ligands and binding activities. In this work, we designed a data set of totally 35 potent ligands/inhibitors with different types of scaffolds and chemistry (see Table 1 and Supporting Information). These ligands include nine Food and Drug Administration (FDA)-approved inhibitors that currently used in clinical treatment, six analogs of BEA268, seven cyclic urea inhibitors, two cyclic sulfamide inhibitors, six substrates with different sequences, and five other leading inhibitors. The selection of these 35 potent ligands was to incorporate the most important inhibitors that reflected the diversity of the large data set. The criterion of the selection is that these ligands should have both crystal structure and experimental results of the association rate. Details of structures and chemistry of these inhibitors are listed in Table S2 and Figures S1–S6 in Supplementary Information. The inhibitor-bound complex structures were retrieved from Protein Data Bank with respective PDB codes, see Table 1. Both catalytic Asp side chains of the protease were modeled in the non-protonated state according to the solution at pH 7.
|P1/P1′ analogs of BEA268 a|
|P2/P2′ and central hydroxy analogs of BEA268 a|
|Cyclic urea compounds|
|Cyclic sulfamide compounds|
|Modified p2-NC-substrate c||1KJ7 c|
- a The structure of BEA268 can be found in reference ( 34 ). Details of the chemical structure of the inhibitors used in this study can be found in supplementary Table S2, Figures S1 and S6.
- b The structure of AHA008-bound complex is constructed with AHA001- and DMP323-bound complexes details can be found in supplementary Figure S7.
- c The modified p2-NC-substrate is also a peptide substrate with cleavage site p2-NC, whose structure is obtained by modifying PDB code 1KJ7. The sequence of the modified p2-NC-substrate can be found in supplementary Table S2.
Coarse-grained dynamics simulations of HIV-1 PR and inhibitors
To derive the binding free energy, dynamics, and pathways of inhibitors in HIV-1 PR over the long-time dynamics simulation, we adopted and developed a CG dynamics simulation based on the CG model proposed by McCammon and co-workers ( 16-21 ). We have tested the CG method by investigating the binding processes of several typical inhibitors to HIV-1 PR ( 23-25 ).
where Ubond, Uangle, and Udihedral are bond, angle, and dihedral interaction functions, respectively is the electrostatic interaction function and are the local and non-local non-bonded interaction functions, respectively.
For the inhibitors, we represented the grouped atoms of the molecule by one CG bead, as reported in our previous works ( 23-25 ). The CG beads were connected by bond, angle, and dihedral angle potentials, which took harmonic formats (the parameters of the potentials are shown in Table S1 of Supporting Information).
where is the radius of gyration of CG bead i and is the average vdW radii of CH/CH2/CH3 groups that are located at the most outside of the CG beads. is equal to 1.925 Å ( 40 ). More details of the CG force field for PR–inhibitor complex can also be referred from our previous studies ( 23-25 ).
The binding free energy of inhibitor to PR
where SASA that was calculated by the msms package ( 46 ) with a 1.4 Å radius probe sphere γ and η are constants set to 0.00542 kcal/(mol Å 2 ) and 0.92 kcal/mol ( 30, 47 ), respectively.
Atomistic molecular dynamics simulations
All-atom MD simulations were performed with the gromacs programs ( 48, 49 ) using the force field of ffamber99 ( 50 ). The all-atom force-field parameters of inhibitors were determined by the ANTECHAMBER and GAFF module ( 51, 52 ) with AM1-BCC ( 53 ) charges ( 54 ). The HIV-1 PR and bounded inhibitor complexes were solvated in an 90 × 80 × 80 Å 3 TIP3P ( 55 ) water box. Appropriate Na and Cl ions were added to neutralize the system. Particle mesh Ewald ( 56 ) was used to calculate the long-range electrostatic interactions. The systems were firstly minimized and gradually heated to 300 K and equilibrated for at least 200 pseconds. Positional restraints were firstly used and the restraint force constants were latter ramped down from 2.39 kcal/(mol Å 2 ) to 0 kcal/(mol Å 2 ) in a staged equilibration. All production simulations were performed at 300K at least for 1 ns with a Berendsen controlled ( 57 ) pressure of 1 bar. The SHAKE algorithm ( 58 ) was applied to constrain the bonds with H-atoms. The time step of the simulations was 2.0 fseconds, and the cut-off of the non-bonded interactions was set to 10 Å. The 1-nanosecond simulation trajectories were saved as 500 frames, and the last 250 snapshots were used for dynamics, energetics, and MM/PBSA calculations. All molecular visualization and trajectory analysis were processed using the vmd program ( 59 ).
Calculation of radius of gyration (R G ) and hydrogen bonds
To determine promising molecular mechanics and dynamics descriptors for the quantitative structure and property relationship (QSPR), our focuses were narrowed down to two key properties: the radius of gyration and hydrogen bond networks.
where mi is the mass of atom i and ri the position of atom i with respect to the center of mass of the molecule.
For analyzing H-bond, we excluded the H-bonds formed by interwater molecules, interwater and PR, and intra-PR and inhibitors, and thus we obtained the H-bonds formed by PR and inhibitors. For determining the hydrogen bonds (H-bond) formed by donor(s) and acceptor(s), a geometrical criterion was set by both atom distance and bond orientation namely, the combination of donor D atoms, hydrogen H, and acceptor A atoms with a D-H···A configuration was regarded as a hydrogen bond when the distance between donor D and acceptor A was shorter than 3.5 Å as well as the bond angle H-D···A was smaller than 60.0°.
Molecular Mechanics and Dynamics - Biology
By far, the most popular application of the empirical potential energy function is to find the geometry of a molecule (or an assemblage of molecules) which corresponds to a minimum potential energy. The process of finding a minimum of an empirical potential energy function is called frequently Molecular Mechanics (MM) . This process produces a motionless molecule of idealized geometry. In reality, the molecules undergo thermal motions and constantly change their geometry. Molecular Dynamics, (MD) , and Monte Carlo, (MC) are based on principles of statistical mechanics and allow calculation of thermodynamic properties of molecular systems by simulating these thermal fluctuations of geometry and corresponding energies. These methods will be described briefly later in this chapter. For a review of molecular mechanics consult Burkert & Allinger, (1982).
The potential energy function refers to a ground electronic state of the molecule and does not explicitly incorporate electronic structure. Hence, it cannot be used to study electronic spectra or covalent bond formation/breaking. These phenomena are studied with quantum methods. There are methods which combine a quantum description for the small portion of the molecule (e.g., groups of substrate and enzyme interacting in the active center) with a molecular mechanics description for the rest of the molecule (see e.g., Field et al. , 1990).
The minimization of the potential energy function (i.e., geometry optimization) involves a search for the minimum of a function, and, to be efficient, requires calculations of derivatives of a function (in this case, the potential energy) versus independent variables (in our case, coordinates). Most programs use cartesian coordinates as independent variables, however, in some cases, internal coordinates may be used. The derivatives of potential energy are denoted as:
- Search Methods -- utilizes only values of the function itself. They are usually slow and inefficient, but are very simple to program, since deriving cumbersome formulas for derivatives is not necessary. In spite of their inefficiency, the search algorithms are infalliable and always find a minimum. For this reason, they are often used as an initial step, when the starting point in optimization is far from the minimum. Another disadvantage of search techniques is that they are very inefficient for a large number of optimized variables and converge very slowly when the number of variables is more then 10. The most popular algorithm in this class is called SIMPLEX.
- Gradient Methods -- utilizes values of a function and its gradients. These are currently the most popular methods in molecular mechanics. They offer a much better convergence rate than search methods and do not require a lot of computer memory (only 3 N first derivatives are needed). However, in some situations they fail to converge to a minimum. The conjugated gradient algorithm is considered the most robust in this class.
- Newton Methods -- are the most rapidly converging algorithms which require values of function, and its first and second derivatives. The memory required for storing the Hessian matrix is proportional to (i.e., prohibitive for large macromolecules). The BFGS algorithm is considered the most refined one.
In general, the minimization methods are iterative. They require on input some initial estimate for the position of the minimum, and provide a better estimate for the minimum as a result. This corrected estimate is used as an input into the next cycle (i.e., iteration) and the process is continued until there is no significant improvement in the position of the minimum.
Figure 6.23: Local and global minima for a function of one variable and an example of a sequence of solutions: for a descent series minimization algorithm.
Most search methods and minimization methods using derivatives are the descent series methods , i.e., each iteration results in a solution which corresponds to a lower (or equal) value for the energy function:
As a consequence, these methods can only find the minimum closest to the starting estimate and will never cross to a minimum (however deep) if it is separated from the starting estimate by a maximum (however small). This situation is schematically illustrated in Fig. 6.23 where arrows indicate the direction of geometry optimization depending upon the starting point. Fig. 6.23 is only a cartoon to illustrate the behavior of descent series minimization methods. Geometry optimization of real molecular systems (e.g. a protein with a drug molecule bound) involves simultaneous optimization of 3 N cartesian coordinates, i.e, sometimes many thousands of variables. There is no general way of finding a global minimum (i.e., the minimum corresponding to the lowest possible value of the function). A different initial geometry will usually lead to a different final minimum. Only on very simple molecules will the single geometry optimization yield the global minimum on the first trial. It cannot be overemphasized that the result of a single minimization is usually a local minimum, not a global one. To find a global minimum (or at least, to be more confident about it) you have to perform many minimizations and use different initial coordinates for each run. There is another practical difficulty with contemporary molecular mechanics programs. Most of them optimize the geometry in atomic cartesian coordinates. To illustrate the problem let us join Alice on her journey to the other side of the mirror. The appearance of the potential energy function in cartesian space is dramatically different from the one given by internal coordinates. If we could see a cartesian representation of a potential energy surface in, let us say, fifty dimensions , the picture would look like the cortex of a human brain -- lots of narrow tortuous valleys of similar depth. This is because low energy paths for individual atoms are very narrow due to the presence of hard bond stretching and angle bending terms. The low energy paths correspond only to the rotation of groups or large portions of the molecule as a result of varying torsional angles. In other words, the low energy paths for atoms are strongly correlated in cartesian space and atoms can move easily only along narrow grooves. On the other hand, the outlook of the potential energy surface is very different in the space of internal coordinates. The surface looks like a valley surrounded by high mountains. The high peaks correspond to stretching and bending terms, and close van der Waals contacts, while the bottom of the valley represents the torsional degrees of freedom. If you happen to start at the mountain tops in the internal coordinate space, the minimizer sees the bottom of the valley clearly from above. If you are in the valley, you also see where the mountains are. In cartesian space, the minimizer walks along the bottom of a narrow winding channel which is frequently a dead-end.
In more scientific terms, all contemporary function minimization procedures operate efficienty for functions close to quadratic (i.e., for two dimensions a parabola, for three dimensions a boat like shape, for n dimensions: ) and are inefficient for functions of other shapes. Assuming that bond lengths and valence angles are already optimized and the goal of minimization is to find the minimum corresponding to optimal values of torsional angles, cartesian space is probably the worst possible. Look at equations 6.16, 6.27 and 6.9, which relate the torsional energy to cartesian coordinates. The dependance of the torsional energy on cartesian coordinates is as far from a quadratic shape as it can be! Moreover, it has several minima. There are also additional advantages to using minimization in internal coordinate space. In this space there is a clear separation of variables into the hard ones (i.e., those whose small changes produce large changes in the function value) and soft ones (i.e., those whose changes do not affect the function value substantially). During function optimization in internal coordinates, the minimizer first optimizes the hard variables and in the subsequent iterations ``cleans up'' the details by optimizing the soft variables. In cartesian coordinate space all variables are of the same type, i.e., hard or soft depending on preference. Changing a cartesian coordinate of an atom results in simultaneous change of bond lengths, valence angles and torsonal angles associated with this atom, and distances to non-bonded atoms. For this reason, the minimizer is only involved with details, and therefore in the last stages of the optimization all variables are as important as at the beginning. The reason why most programs optimize geometry in cartesian space is that it is not easy to derive and manage expressions for derivatives of potential energy function in internal coordinates. Note that the derivative of a van der Waals energy term between two atoms versus the cartesian coordinates of some third atom is zero. This derivative depends only on the cartesian coordinates of the two atoms involved. The situation is dramatically different in internal coordinate space. Since the distance between two atoms depends on all bond lengths, bond angles and torsional angles on the path between two atoms, the derivative of the van der Waals term for the two atoms will depend on all these variables. There are matrix methods of calculating derivatives of distances and angles as contributions from individual internal coordinates, and the final derivative is calculated from the chain rule using these parameters. But as you can see, the problem is far from trivial. To avoid evaluation of derivatives, some programs use search techniques like SIMPLEX when optimization is requested in internal coordinate space. Since SIMPLEX is inefficient for large numbers of variables, usually only torsional angles are optimized by this technique. If the number of variables is too large, then a block technique is used where only a portion of all independent variables is optimized at a given stage. Another reason why internal coordinates are not used routinely for geometry optimization is that modern molecular mechanics programs are usually a portion of a molecular mechanics/dynamics package. As will be shown later in this chapter, molecular dynamics in coordinates other than cartesian is a difficult problem, since the terms containing generalized momenta and generalized coordinates in the Hamiltonian cannot be separated. Molecular mechanics and dynamics share large portions of the actual code, and hence there is a preference by software authors to use the space which is common to both methods.
The potential energy of the molecule calculated from a well balanced force field represents a strain in the molecule. Augmented with bond/group equivalents and statistical mechanical corrections, it can be used to estimate the heat of formation of a compound (i.e., its molar enthalpy of formation, ). The quantity can be used to compare the relative stability of different compounds. In most cases however, you should assume that the calculated potential energy incorporates some arbitrary component which depends upon the types of atoms and covalent bonds in the molecule, i.e., comparison of the energies calculated for molecules with different numbers and/or types of atoms and bonds cannot be rigorous. For this reason, potential energy will in most cases reliably assess the difference in energy between conformers, but will fail if you attempt to calculate the energetics of incorporating a new functional group into the molecule. Molecular mechanics can also provide the interaction energy, , of two molecules A and B as:
where , , and are potential energies of the optimized complex, the optimized molecule A , and the optimized molecule B respectively. Note that the type and number of atoms and covalent bonds in the complex AB is equal to their sum in isolated molecules A and B , and the arbitrary ``energy zero'' should cancel out in this case. For this reason, the difference between interaction energies calculated for different complexes, (i.e., ) is the preferred method over direct comparison of the energies of different complexes (i.e., ).
Potential energy functions can also be used to estimate contributions from intramolecular vibrations to the vibrational free energy and vibrational entropy. These quantities, and contributions from translation and rotation of the molecule as a whole, vary with temperature and are main contributors to the thermodynamic functions such as enthalpy, free energy, specific heat, etc., (see, e.g., Hill, 1960). A well known approach is to use the frequencies, , corresponding to normal modes within harmonic approximation (i.e, to derive them from a mass scaled Hessian matrix at energy minimum). The expressions for relating classical vibrational contributions to Helmholtz free energy, , internal energy, , heat capacity at constant volume , and entropy of the nonlinear molecule, are derived in many standard textbooks for statistical mechanics (see e.g., McQuarrie, 1976):
where R , T , and h are the gas constant, the absolute temperature, and Planck's constant respectively and N denotes the number of atoms in the molecule. Frequently, these values are augmented with a correction to account for vibrations at 0 K, which is of quantum origin, by adding a zero-point energy ( ZPE ) to the free energy and the internal energy :
3. Statistical Mechanics
Molecular dynamics simulations generate information at the microscopic level, including atomic positions and velocities. The conversion of this microscopic information to macroscopic observables such as pressure, energy, heat capacities, etc., requires statistical mechanics. Statistical mechanics is fundamental to the study of biological systems by molecular dynamics simulation. In this section, we provide a brief overview of some main topics. For more detailed information, refer to the numerous excellent books available on the subject.
Introduction to Statistical Mechanics:
In a molecular dynamics simulation, one often wishes to explore the macroscopic properties of a system through microscopic simulations, for example, to calculate changes in the binding free energy of a particular drug candidate, or to examine the energetics and mechanisms of conformational change. The connection between microscopic simulations and macroscopic properties is made via statistical mechanics which provides the rigorous mathematical expressions that relate macroscopic properties to the distribution and motion of the atoms and molecules of the N-body system molecular dynamics simulations provide the means to solve the equation of motion of the particles and evaluate these mathematical formulas. With molecular dynamics simulations, one can study both thermodynamic properties and/or time dependent (kinetic) phenomenon.
D. McQuarrie, Statistical Mechanics (Harper & Row, New York, 1976)
D. Chandler, Introduction to Modern Statistical Mechanics (Oxford University Press, New York, 1987)
Statistical mechanics is the branch of physical sciences that studies macroscopic systems from a molecular point of view. The goal is to understand and to predict macroscopic phenomena from the properties of individual molecules making up the system. The system could range from a collection of solvent molecules to a solvated protein-DNA complex. In order to connect the macroscopic system to the microscopic system, time independent statistical averages are often introduced. We start this discussion by introducing a few definitions.
The thermodynamic state of a system is usually defined by a small set of parameters, for example, the temperature, T, the pressure, P, and the number of particles, N. Other thermodynamic properties may be derived from the equations of state and other fundamental thermodynamic equations.
The mechanical or microscopic state of a system is defined by the atomic positions, q, and momenta, p these can also be considered as coordinates in a multidimensional space called phase space. For a system of N particles, this space has 6N dimensions. A single point in phase space, denoted by G , describes the state of the system. An ensemble is a collection of points in phase space satisfying the conditions of a particular thermodynamic state. A molecular dynamics simulations generates a sequence of points in phase space as a function of time these points belong to the same ensemble, and they correspond to the different conformations of the system and their respective momenta. Several different ensembles are described below.
An ensemble is a collection of all possible systems which have different microscopic states but have an identical macroscopic or thermodynamic state.
There exist different ensembles with different characteristics.
- Microcanonical ensemble (NVE) : The thermodynamic state characterized by a fixed number of atoms, N, a fixed volume, V, and a fixed energy, E. This corresponds to an isolated system.
- Canonical Ensemble (NVT): This is a collection of all systems whose thermodynamic state is characterized by a fixed number of atoms, N, a fixed volume, V, and a fixed temperature, T.
- Isobaric-Isothermal Ensemble (NPT): This ensemble is characterized by a fixed number of atoms, N, a fixed pressure, P, and a fixed temperature, T.
- Grand canonical Ensemble ( m VT): The thermodynamic state for this ensemble is characterized by a fixed chemical potential, m , a fixed volume, V, and a fixed temperature, T.
Calculating Averages from a Molecular Dynamics Simulation
An experiment is usually made on a macroscopic sample that contains an extremely large number of atoms or molecules sampling an enormous number of conformations. In statistical mechanics, averages corresponding to experimental observables are defined in terms of ensemble averages one justification for this is that there has been good agreement with experiment. An ensemble average is average taken over a large number of replicas of the system considered simultaneously.
In statistical mechanics, average values are defined as ensemble averages.
The ensemble average is given by
is the observable of interest and it is expressed as a function of the momenta, p, and the positions, r, of the system. The integration is over all possible variables of r and p.
The probability density of the ensemble is given by
where H is the Hamiltonian, T is the temperature, kB is Boltzmanns constant and Q is the partition function
This integral is generally extremely difficult to calculate because one must calculate all possible states of the system. In a molecular dynamics simulation, the points in the ensemble are calculated sequentially in time, so to calculate an ensemble average, the molecular dynamics simulations must pass through all possible states corresponding to the particular thermodynamic constraints.
Another way, as done in an MD simulation, is to determine a time average of A, which is expressed as
where t is the simulation time, M is the number of time steps in the simulation and A(pN,rN) is the instantaneous value of A.
The dilemma appears to be that one can calculate time averages by molecular dynamics simulation, but the experimental observables are assumed to be ensemble averages. Resolving this leads us to one of the most fundamental axioms of statistical mechanics, the ergodic hypothesis, which states that the time average equals the ensemble average.
The basic idea is that if one allows the system to evolve in time indefinitely, that system will eventually pass through all possible states. One goal, therefore, of a molecular dynamics simulation is to generate enough representative conformations such that this equality is satisfied. If this is the case, experimentally relevant information concerning structural, dynamic and thermodynamic properties may then be calculated using a feasible amount of computer resources. Because the simulations are of fixed duration, one must be certain to sample a sufficient amount of phase space.
Some examples of time averages:
Average potential energy
where M is the number of configurations in the molecular dynamics trajectory and Vi is the potential energy of each configuration.
Average kinetic energy
where M is the number of configurations in the simulation, N is the number of atoms in the system, mi is the mass of the particle i and vi is the velocity of particle i.
A molecular dynamics simulation must be sufficiently long so that enough representative conformations have been sampled.
Combined Quantum Mechanical and Molecular Mechanical Modelling of Biomolecular Interactions
2 The Principle
The molecular mechanics which is based on empirical potential energy functions has become a widely used concept since last one or two decades. There are a number of reasons to integrate MM with quantum mechanics as MM is a empirical function based approach, the accuracy of the approach is limited. On the other side, QM functions can potentially generate realistic potential energy surfaces, can include environment-dependent polarization effects and charge transfer interactions. The basic need of integrating both the methods is to account those electronic structure changes which are not considered by MM methods. The main objective of QM/MM-based approach is to describe the interaction between ligand molecule and receptor and the binding ability of the ligand with the receptor. Figure 2 shows a protein receptor molecule bound with its ligand molecule.
Figure 2 . Shows the partitioning of the protein–ligand complex into the QM applied region, MM applied region, and QM/MM applied regions.
The Hamiltonian operator is of fundamental importance for most of the quantum calculations and it corresponds to the total energy of the system. The Hamiltonian of the system according to Fig. 2 can be expressed as:
HQM = Hamiltonian accounting for all QM particles of ligand.
HMM = Hamiltonian accounting for all MM particles of protein.
HQM/MM = Hamiltonian accounting for the interaction between QM and MM particles within the system.
The van der Waal's interactions at molecular mechanics level can be described by simple functions like Lennard-Jones potential, while electrostatic term enters Fock matrix as self-consistent field (SCF) method. There are a large number of approaches to analyze the interaction between the two QM and MM systems. These approaches can broadly be divided into two categories subtractive QM/MM coupling and additive QM/MM coupling ( Menikarachchi & Gascon, 2010 ).