Amber Workshop - Tutorial 2

(Note: These tutorials are meant to provide illustrative examples of how to use the AMBER software suite to carry out simulations that can be run on a simple workstation in a reasonable period of time. They do not necessarily provide the optimal choice of parameters or methods for the particular application area.)
Copyright Ross Walker 2004

3) Running Minimisation and Molecular Dynamics (in vacuo)

This section will introduce sander and show how it can be used for minimisation and molecular dynamics of our previously created DNA models. We will initially run our simulations on the in vacuo model and analyse the results before moving on to running simulations with implicit and explicit solvent. For this section of the tutorial we shall be using the in vacuo prmtop and inpcrd files we created previously (polyAT_vac.prmtop and polyAT_vac.inpcrd).

This section of the tutorial will consist of 3 stages:

Relaxing the system prior to MD
Molecular dynamics at constant temperature
Analysing the results

3.1) Relaxing the System Prior to MD

In the previous section when we opened our pdb file in xleap we allowed it to add the missing hydrogens for us (which nucgen does not supply). These hydrogens are added by xleap based on the geometry specified in the residue database(s) (which define the original topology; for example, see $AMBERHOME/dat/leap/lib/all_nucleic94.lib). Since this "default" geometry may not correspond to the actual minima in the force field we are using and may also result in conflicts and overlaps with atoms in other residues, it is always a good idea to minimise the locations of these atoms before commencing molecular dynamics. Failure to successfully minimise these atoms may lead to instabilities when we run MD.

So, given the in vacuo prmtop (polyAT_vac.prmtop) and inpcrd (polyAT_vac.inpcrd) files we created, we will now use sander to conduct a short minimisation run. Since we just want to "fix up" the positions of the atoms in order to remove any bad contacts that may lead to unstable molecular dynamics we will run a short (500 steps) minimisation. This will take us towards the closest local minima. Minimisation with sander will only ever take you to the nearest minima, it cannot cross transition states to reach lower minima. This is fine for our purposes, however, since all we want to do is remove the largest strains in the system.

The basic usage for sander is as follows:

sander [-O] -i mdin -o mdout -p prmtop -c inpcrd -r restrt
[-ref refc] [-x mdcrd] [-v mdvel] [-e mden] [-inf mdinfo]

Arguments in []'s are optional
If an argument is not specified, the default name will be used.
-O overwrite all output files (the default behaviour is to quit if any output files already exist)
-i the name of the input file (which describes the simulation options), mdin by default.
-o the name of the output file, mdout by default.
-p the parameter/topology file, prmtop by default.
-c the set of initial coordinates for this run, inpcrd by default.
-r the final set of coordinates from this MD or minimisation run, restrt by default.
-ref reference coordinates for positional restraints, if this option is specified in the input file, refc by default.
-x the molecular dynamics trajectory file (if running MD), mdcrd by default.
-v the molecular dynamics velocities file (if running MD), mdvel by default.
-e a summary file of the energies (if running MD), mden by default.
-inf a summary file written every time energy information is printed in the output file for the current step of the minimisation of MD, useful for checking on the progress of a simulation, mdinfo by default.

3.1.1) Building the mdin input file

Now that we have the prmtop and inpcrd files from xleap, all we need to run sander is the mdin file which specifies the myriad of possible options for this run.

The run time input to control sander is via "namelist" variables (for more info see the manual) specified in the mdin file. For example:

Test run 1
&cntrl
IMIN = 1, NCYC = 250, MAXCYC = 500
/

In the absence of a variable specification in the input file, default values are chosen; every specified variable, except the last one, needs to be followed by a comma. The sander manual describes all of these inputs, for each of the possible namelists. Which namelist is used depends on the specification above, such as &cntrl or &ewald. At a minimum the &cntrl namelist must be specified. Also notice the space or empty first column before specification of the namelist control variable; this is necessary. It is also necessary to end each namelist with a forward slash /. Other namelists (such as the &ewald namelist above) are optional. After the namelist some other information may be specified, such as "GROUP" input, which allows atom selections for restraints. Note that the input variable and namelists have changed somewhat from earlier versions of AMBER. Refer to sander manual for more information.

Now, let's build up a minimal input file for performing minimisation of our DNA. In theory it would probably be best to run a dual stage minimisation where we initially use position restraints on all the heavy atoms so that in the first stage of minimisation only the hydrogens that xleap added are minimised. Then in the second stage we allow minimisation of all atoms in the system. Since our system is fairly small and simplistic it should be fine to skip the first stage and just minimise everything. An example of such a two stage minimisation approach will be given when we run simulations on our more complex solvated model in the next section.

So, to create our input file: To turn on minimisation, we specify IMIN = 1. We want a fairly short minimisation since we don't actually need to reach the minima, just move away from any local maxima, so we select 500 steps of minimisation by specifying MAXCYC = 500. Sander supports two different algorithms for minimisation, steepest descent and conjugate gradient. The steepest descent algorithm is good for quickly removing the largest strains in the system but converges slowly when close to a minima. Here the conjugate gradient method is more efficient. The use of these two algorithms can be controlled using the NCYC flag. If NCYC < MAXCYC sander will use the steepest descent algorithm for the first NCYC steps before switching to the conjugate gradient algorithm for the remaining (MAXCYC - NCYC) steps. In this case we will run an equal number of steps with each algorithm so we set NCYC = 250. Since sander assumes that the system is periodic by default we need to explicitly turn this off (NTB = 0). In this simulation we will be using a constant dielectric and not an implict (or explicit) solvent model so we set IGB = 0 (no generalised born solvation model), this is the default so we strictly don't need to specify this but I will include it here so that we can see what differences in the input file we have when we switch on implicit solvent later. We also need to choose a value for the non-bonded cut off. A larger cut off introduces less error in the non-bonded force evaluation but increases the computational complexity and thus calculation time. 12 angstroms is a good tradeoff so that is what we will use (CUT = 12). Here is what the input file looks like:

polyA-polyT 10-mer: initial minimisation prior to MD
&cntrl
imin   = 1,
maxcyc = 500,
ncyc   = 250,
ntb    = 0,
igb    = 0,
cut    = 12
/

So, now to run sander for minimisation.

3.1.2) Running sander for the first time

To run sander, we simply execute the following:

$AMBERHOME/exe/sander -O -i polyAT_vac_init_min.in -o polyAT_vac_init_min.out -c polyAT_vac.inpcrd -p polyAT_vac.prmtop -r polyAT_vac_init_min.rst

Input files: polyAT_vac_init_min.in, polyAT_vac.inpcrd, polyAT_vac.prmtop
Output files: polyAT_vac_init_min.out, polyAT_vac_init_min.rst

This should run very quickly. On a 3GHz Pentium 4 machine it takes about 4.3 seconds

Take a look at the output file produced during the minimisation (polyAT_vac_init_min.out). You will see that the energy dropped considerably between the first and last steps:

   NSTEP       ENERGY          RMS            GMAX         NAME    NUMBER
      1       9.2064E+01     4.9542E+01     6.8763E+02     O5'       321

   NSTEP       ENERGY          RMS            GMAX         NAME    NUMBER
    500      -2.2929E+03     5.7810E-01     3.4109E+00     N1        362

In spite of this, however, the structure did not change very much. This is because, as already mentioned, minimisation will only find the nearest local minima. If you create PDB files for the start (polyAT_vac.inpcrd) and final structures (polyAT_vac_init_min.rst) and compare the root-mean-squared deviations (RMSd), you will see that the two structures differ by only about 0.5 angstroms (all atom).

Initial structure = Green, Minimised structure = Blue

3.1.3) Creating PDB files from the AMBER coordinate files

You may want to generate a new pdb file so you can look at the structure using the minimised coordinates from polyAT_vac_init_min.rst. A pdb file can be created from the parm topology and coordinates (inpcrd or restrt) using the program ambpdb.

$AMBERHOME/exe/ambpdb -p polyAT_vac.prmtop < polyAT_vac_init_min.rst > polyAT_vac_init_min.pdb

This will take the prmtop and inpcrd file specified (in this case the final structure from the minimisation, the .rst file) and create (on stdout) a pdb file. In this case we redirected stdout to the file - polyAT_vac_init_min.pdb.

In general, users should carefully inspect any starting structures and minimised structures; specifically check to make sure the hydrogens were placed where you thought they should be, histidines are in the correct protonation state, the terminal residues are properly terminated, stereochemistry is reasonable, etc. There is nothing worse than finding out after you've run a nanosecond of solvated dynamics that an H1' atom was on the wrong side and that you have simulated some strange anomer of DNA!

3.2) Running MD in-vacuo

In section 3.1 we used sander to minimise our system in order to remove any bad contacts introduced by the hydrogenation step in xleap. This led to the creation of the coordinate file named polyAT_vac_init_min.rst. For the next section of this tutorial we will use this coordinate file as the starting structure for our in-vacuo MD simulation. The following in-vacuo simulation is designed to give a flavour of how MD simulations are run. In general one would not run an in-vacuo simulation unless the system was inherently gas phase. For liquid phase systems, such as our DNA, one would normally include solvent either explicitly or implicitly. This type of simulation will be covered in section 4. For the time being we will stick to gas phase simulations since they are simple and quick to run.

To make the calculations tractable for the purposes of this tutorial, we can only run short simulations of the order of 100 ps. Although these are "short" simulations (in terms of what is likely required to answer specific research questions), you will see that they are still rather costly, depending on what type of machine you using. However, to get a better idea of the issues involved and potential artefacts of the various models that will be used (for the simulation of DNA) it is probably useful to run through these examples, either running them yourself or just looking at the output files and trajectories supplied. You may also want to try some different examples from the two given below, such as a distance dependent dielectric constant, or see what happens if you increase the dielectric constant to say 80.0.

For this tutorial we will run two in-vacuo simulations and compare the results. The simulations we will run will be:

polyAT_vac_md1_12Acut.in: 12.0 angstrom long range cutoff, dielectric = 1

polyAT_vac_md1_nocut.in: no long range cutoff, dielectric = 1

To run a molecular dynamics simulation with sander, we need to turn off minimisation (IMIN=0). Since we are running in-vacuo we also need to disable periodicity (NTB=0) and set IGB=0 since we are not using implicit solvent. For these two examples we will write information to the output file and trajectory coordinates file every 100 steps (NTPR=100, NTWX=100). For most simulations writing to the coordinate file every 100 steps is too frequently, it both consumes excess disk space and can impact performance, however, this simulation here will show some interesting behavior over a short time scale that I would like you to see and so we will use a value of 100 to ensure that this is suitably sampled. For multiple nano-second simulations of stable systems a more suitable value for NTWX would be in the 1000 to 2000 range. The CUT variable specifies the cut off range for the long range non-bonded interactions. Here two different values for the cutoff will be used, one run will be with a cut off of 12 angstroms (CUT = 12.0) and one run will be without a cutoff. To run without a cutoff we simply set CUT to be larger than the extent of the system (e.g. CUT = 999). For temperature regulation we will use the newly introduced (AMBER 8.0) Langevin thermostat (NTT=3) to maintain the temperature of our system at 300 K. This temperature control method uses Langevin dynamics with a collision frequency given by GAMMA_LN. This temperature control method is significantly more efficient at equilibrating the system temperature than the Berendsen temperature coupling scheme (NTT=1) that was the recommended method for older versions of AMBER. The biggest problem with the Berendsen method is that the algorithm simply ensures that the kinetic energy is appropriate for the desired temperature; it does nothing to ensure that the temperature is even over all parts of the molecule. This can lead to the phenomenon of hot solvent, cold solute. To avoid this, elaborate temperature scaling techniques for slowly heating the molecule over the course of the simulation were recommended. The Langevin system is much more efficient, however, at equilibrating the temperature and is now the recommended choice in AMBER 8.0. The efficiency is such that if we have a reasonably good structure, which we do in this case, we can actually start the system at 300 K and avoid the need to slowly heat over say 20 ps from 0 K to room temperature. Thus for this simulation we shall set NTT=3 with GAMMA_LN=1. We shall also set the initial and final temperatures to 300 K (TEMPI=300.0, TEMP0=300.0) which will mean our system's temperature should remain around 300 K. In these two examples we will run a total of 100,000 steps each (NSTLIM=100000) with a 1 fs time step (DT=0.001) giving simulation lengths of 100 ps (100,000 x 1fs). The two input files are shown below:

polyAT_vac_md1_12Acut.in	polyAT_vac_md1_nocut.in
10-mer DNA MD in-vacuo, 12 angstrom cut off &cntrl imin = 0, ntb = 0, igb = 0, ntpr = 100, ntwx = 100, ntt = 3, gamma_ln = 1.0, tempi = 300.0, temp0 = 300.0 nstlim = 100000, dt = 0.001, cut = 12.0 /	10-mer DNA MD in-vacuo, infinite cut off &cntrl imin = 0, ntb = 0, igb = 0, ntpr = 100, ntwx = 100, ntt = 3, gamma_ln = 1.0, tempi = 300.0, temp0 = 300.0 nstlim = 100000, dt = 0.001, cut = 999 /

Note: If you are using a version of AMBER prior to version 8 then Langevin temperature regulation will not be available. In this case simply set NTT=1 (Berendsen temperature regulation) and run with this. You will get slightly different results to the simulations shown here but it will be enough for you to get the idea.

So, to run the two jobs we issue the following two commands. Note, if you have a dual (or more) processor machine then you can run both jobs simultaneously, otherwise you should run them sequentially (note we use the restrt file from the minimisation as our starting structure):

$AMBERHOME/exe/sander -O -i polyAT_vac_md1_12Acut.in -o polyAT_vac_md1_12Acut.out -c polyAT_vac_init_min.rst -p polyAT_vac.prmtop -r polyAT_vac_md1_12Acut.rst -x polyAT_vac_md1_12Acut.mdcrd

$AMBERHOME/exe/sander -O -i polyAT_vac_md1_nocut.in -o polyAT_vac_md1_nocut.out -c polyAT_vac_init_min.rst -p polyAT_vac.prmtop -r polyAT_vac_md1_nocut.rst -x polyAT_vac_md1_nocut.mdcrd

These will take considerably longer to run than the minimisation, so it is probably a good time to go off and get a cup of coffee. The 12 angstrom cut off run takes about 900s (15 mins) on a single P4 3GHz machine, if it is taking too long on your machine you can always reduce the simulation length by lowering NSTLIM to say 10,000 (10 ps). You can follow the progress of the job by following the output file with the following command:

tail -f polyAT_vac_md1_12Acut.out

Input files: polyAT_vac_md1_12Acut.in, polyAT_vac_md1_nocut.in, polyAT_vac_init_min.rst, polyAT_vac.prmtop

Output files: polyAT_vac_md1_12Acut.out, polyAT_vac_md1_nocut.out, polyAT_vac_md1_12Acut.rst, polyAT_vac_md1_nocut.rst, polyAT_vac_md1_12Acut.mdcrd, polyAT_vac_md1_nocut.mdcrd

Warning: the mdcrd files are over 15MB in size - you can download gzip compressed (5.3MB) versions here (polyAT_vac_md1_12Acut.mdcrd.gz, polyAT_vac_md1_nocut.mdcrd.gz). You will need to unzip them before use. e.g gunzip polyAT_vac_md1_12Acut.mdcrd.gz.

Note that we now have an additional option, the -x option which specifies the name of the output file for the MD trajectory. This file will contain a snapshot of the entire system's cartesian coordinates every NTWX (100) steps.

Please be aware that the nocut simulation will likely stop after around 22,900 steps. This is not a bug, it is a problem with simulating DNA in vacuo which will become clear when we visualise the trajectory files. More on this later.

3.3) Analysing the results

So, now we've run the simulations what do we want to look at? First off, we may want to look through the mdout files (polyAT_vac_md1_12Acut.out, polyAT_vac_md1_nocut.out). It is also a good idea to extract the various energies as a function of time to plot total, kinetic, potential energies etc. We might also want to calculate RMSd vs. time, etc. To look at movies (which is much more fun :-) ) we can use a visualisation program such as vmd to load up the "parm and crd" (i.e. AMBER prmtop and MD trajectory) and use the animation controls to view it, more on this later.

3.3.1) Extracting the energies, etc. from the mdout file

One can write an awk or sed script to pull out the energies, etc. These can be pulled out of the mden file if energy information was written out (which we didn't do) or directly from the mdout file. If you are at the workshop a perl script (process_mdout.perl), should be installed on your machine for processing mdout files and creating a series of files with the various different pieces of information in them. If you are working on this tutorial on your own machine then you can download the perl script here. The program uses a default output filename so it is best to create a sub directory for each of your output files and move to there before running the script e.g.

mkdir polyAT_vac_md1_12Acut
cd polyAT_vac_md1_12Acut

process_mdout.perl ../polyAT_vac_md1_12Acut.out

If you get an error message here please raise your hand so one of the demonstrators can help you. You should repeat this process for the nocut output file, even though the simulation did not complete all 100,000 steps:

mkdir polyAT_vac_md1_nocut
cd polyAT_vac_md1_nocut
process_mdout.perl ../polyAT_vac_md1_nocut.out

This script will take a whole series of mdout files and will create a whole series, leading off with the prefix "summary." such as "summary.EPTOT", of output files. These files are just columns of the time vs. the value for each of the energy components. Tar archives of the summary files for each of the two MD runs above are available here (polyAT_vac_md1_12Acut.summary.tar.gz, polyAT_vac_md1_nocut.summary.tar.gz).

You can plot the summary files with your favourite graphing program. A useful program for just quickly looking at a plot of a file is xmgr which should be installed on your machine. This is a very simple plotting program, but ideal for our purpose. To plot the total potential energy, of both runs, as a function of time we run, assuming we are in our master simulation directory:

xmgr ./polyAT_vac_md1_12Acut/summary.EPTOT ./polyAT_vac_md1_nocut/summary.EPTOT

This should give you a plot similar to the following:

The black line represents the potential energy for the 12 angstrom cutoff simulation vs. time while the red line represents the potential energy for the simulation without a cutoff. As we can see the 12 angstrom cutoff simulation seems to be fairly stable, the potential energy is fluctuating around a constant mean value. The simulation without a cutoff, however, is significantly different. To begin with the potential energy is some 3,000 kcal/mol higher than the 12 angstrom cutoff simulation. The potential energy is so large in fact that it is actually positive. The potential energy also decreases rapidly over the first 10 ps of simulation suggesting that a large structural change is occurring. The simulation then ends abruptly after 21.9ps with the following error:

 Frac coord min, max:  -5.132892457340335E-005  0.962793346753183     
 The system has extended beyond 
     the extent of the virtual box.
 Restarting sander will recalculate
    a new virtual box with 30 Angstroms
    extra on each side, if there is a
    restart file for this configuration.
 SANDER BOMB in subroutine Routine: map_coords (ew_force.f)
 Atom out of bounds. If a restart has been written,
 restarting should resolve the error

Why the difference in potential energy? Well, by looking at our two output files this can be tracked down to the difference in the electrostatic energy. For the first step we have:

12 angstom cutoff

 NSTEP =        0   TIME(PS) =       0.000  TEMP(K) =   298.54  PRESS =     0.0
 Etot   =     -1726.9350  EKtot   =       565.9592  EPtot      =     -2292.8942
 BOND   =        16.0918  ANGLE   =        93.9850  DIHED      =       392.9956
 1-4 NB =       162.1781  1-4 EEL =      -345.0298  VDWAALS    =      -346.6892
 EELEC  =     -2266.4257  EHBOND  =         0.0000  RESTRAINT  =         0.0000

no cutoff

 NSTEP =        0   TIME(PS) =       0.000  TEMP(K) =   298.54  PRESS =     0.0
 Etot   =      1885.9795  EKtot   =       565.9592  EPtot      =      1320.0203
 BOND   =        16.0918  ANGLE   =        93.9850  DIHED      =       392.9956
 1-4 NB =       162.1781  1-4 EEL =      -345.0298  VDWAALS    =      -349.3700
 EELEC  =      1349.1697  EHBOND  =         0.0000  RESTRAINT  =         0.0000

Notice how the value of EELEC is so much larger in the no cutoff simulation.

So, what has happened and how can we learn more? Well, first of all let's look at another type of analysis we can perform on the results, in this case analysis of the MD trajectory files (polyAT_vac_md1_12Acut.mdcrd, polyAT_vac_md1_nocut.mdcrd). From this we can hopefully understand what is happening.

3.3.2) Calculating the RMSd vs. time

The next step in analysing our results is to calculate the RMSd as a function of time using ptraj (an analysis program provided with AMBER 8). The precise parameter we will be calculating in this example is the mass weighted RMSd fit between each successive structure and the first structure of our trajectory. This is done by providing an input file to ptraj containing a list of commands describing the relevant files and what we want it to do etc.:

trajin polyAT_vac_md1_12Acut.mdcrd
rms first mass out polyAT_vac_md1_12Acut.rms time 0.1

where trajin specifies the name of the trajectory file to process, rms first mass, tells ptraj that we want it to calculate a MASS weighted, RMS fit to the FIRST structure. out specifies the name of the file to write the results to and time 0.1 tells ptraj that each frame of the coordinate file represents 0.1 ps. Since we have two simulations we have two input files, with different trajectory files and output files in each, polyAT_vac_md1_12Acut.calc_rms, polyAT_vac_md1_nocut.calc_rms.

Run the two ptraj jobs with the following commands:

$AMBERHOME/exe/ptraj polyAT_vac.prmtop < polyAT_vac_md1_12Acut.calc_rms

$AMBERHOME/exe/ptraj polyAT_vac.prmtop < polyAT_vac_md1_nocut.calc_rms

The output from each job will look something like this:

...read in prmtop file

Read in control variables
Read in atom names...
Read in charges...
Read in masses...
Read in IAC (atoms involved in L-J)...
Read in NUMEX (index to excl atom list)...
Read in NNO (index for nonbond of @type)...
Read in residue labels...
Read in the residue to atom pointer list...
Read in bond parameters RK and REQ...
Read in angle parameters TK and TEQ...
Read in dihedral parameters PK, PN and PHASE...
Read in SOLTY...
Read in L-J parameters CN1 and CN2...
Read in info for bonds w/ hydrogen...
Read in info for bonds w/out hydrogen...
Read in info for angles w/ hydrogen...
Read in info for angles w/out hydrogen...
Read in info for dihedrals w/ hydrogen...
Read in info for dihedrals w/out hydrogen...
Read in excluded atom list...
Read in h-bond parameters: AG, BG, and HBCUT...
Read in atomic symbols (types)...
Read in tree information...
Read in the JOIN info...
Read in the IROTAT info...
Checking coordinates: polyAT_vac_md1_12Acut.mdcrd

Print summary of the residues:

Amber8 Module: ptraj 

 DA5  DA   DA   DA   DA   DA   DA   DA   DA   DA3 
 DT5  DT   DT   DT   DT   DT   DT   DT   DT   DT3 
Successfully completed readParm.

Process the input file and state what is to be done:

PTRAJ: Processing input file...
       Input is from standard input

PTRAJ: trajin polyAT_vac_md1_12Acut.mdcrd

PTRAJ: rms first mass out polyAT_vac_md1_12Acut.rms time 0.1
Mask [*] represents 638 atoms
FYI: No output trajectory specified (trajout), none will be saved.

PTRAJ: Successfully read the input file.
       Coordinate processing will occur on 1000 frames.
       Summary of I/O and actions follows:

Print out a summary of what the input and output files are and the list of actions to be performed on every coordinate set. In this case, we only have a single input coordinate file and a single action, i.e. rms:

INPUT COORDINATE FILES
  File (polyAT_vac_md1_12Acut.mdcrd) is an AMBER trajectory with 1000 sets

OUTPUT COORDINATE FILE
  NULL entry

ACTIONS
  1>  RMS to first frame using mass weighting
      Dumping RMSd vs. time (with time interval 0.10) to a file named polyAT_vac_md1_12Acut.rms
      Atom selection follows   * (All atoms are selected)

As the program runs it prints out a dot for every frame of the trajectory file that has been processed:

Processing AMBER trajectory file polyAT_vac_md1_12Acut.mdcrd
Set      1 .................................................
Set     50 .................................................
Set    100 .................................................
Set    150 .................................................
Set    200 .................................................
Set    250 .................................................
Set    300 .................................................
Set    350 .................................................
Set    400 .................................................
Set    450 .................................................
Set    500 .................................................
Set    550 .................................................
Set    600 .................................................
Set    650 .................................................
Set    700 .................................................
Set    750 .................................................
Set    800 .................................................
Set    850 .................................................
Set    900 .................................................
Set    950 .................................................
Set   1000 
ERROR in readAmberTrajectory(): Set #1001 is corrupted (

This is not really an error as this comment is printed out at the end of the trajectory file (1000 frames for the 12A cutoff simulation, ~229 frames for the no cutoff simulation).

PTRAJ: Successfully read in 1000 sets and processed 1000 sets.
       Dumping accumulated results (if any)

PTRAJ RMS: dumping RMSd vs time data

We should get two output files, one for each simulation (polyAT_vac_md1_12Acut.rms, polyAT_vac_md1_nocut.rms).

We can now plot the RMSd's vs time. Note, RMSd is in angstroms and time is in ps.

xmgr polyAT_vac_md1_12Acut.rms polyAT_vac_md1_nocut.rms

This should give a plot similar to the following:

And here we see the problem. While the 12 angstrom cutoff simulation has a largely constant RMSd around 2.2 angstroms the no cutoff simulation shows a steadily increasing RMSd which after 20 ps is already over 30 angstroms!!! This would suggest that something has gone wrong with the simulation since our system has essentially "blown up." But, is this the whole story? Or does it mask a larger problem with running a simulation in vacuum. Why does the use of a cutoff here result in a stable trajectory while no cutoff (which should in theory be more accurate) leads to a "blow up."

3.3.3) Visualising the trajectories with VMD

Let's think about this more carefully. Our DNA molecule consists of two chains held together by hydrogen bonds. Each chain has a net charge of -9 electrons. So, our chains have a large electrostatic repulsion as shown by the value of EELEC in the no cutoff simulation (EELEC = 1349.1697). Thus the two chains are repelling each other. For the system to be stable the interaction between the chains, largely due to hydrogen bonding, must counteract this large electrostatic repulsion. Now, in the case where we used a 12 angstrom cutoff, all charges beyond 12 angstroms distance were considered to have zero energy. Since the average distance between the negative charges on the DNA backbone, due to the phosphate groups, is 15 angstroms so the repulsion between the phosphate atoms on opposite chains was ignored when the 12 angstrom cutoff was employed. The electrostatic attraction due to hydrogen bonding between opposite bases is due to a much closer interaction, of the order of 2 angstroms. Thus while the main electrostatic repulsion interaction was excluded when we used the 12 angstrom cutoff, the main attractive force between the two chains was included. Thus during the simulation the chain held together and we obtained a stable trajectory. Let's take a look at it in vmd:

Run vmd (based on vmd 1.8.1):

vmd

Open our 12 angstrom cutoff trajectory file. Select:

File -> New Molecule

The method for opening and handling molecules in VMD has changed since earlier versions. VMD now supports multiple trajectory files for a single molecule. Thus a molecule is defined by loading the associated prmtop file. So, start by selecting Browse and finding the polyAT_vac.prmtop file. Then under the heading where it states "Determine file type:" select parm7. Note: VMD has two definitions for prmtop files, 'parm' and 'parm7'. The 'parm' refers to prmtop files for AMBER versions of 6 and lower. The prmtop format changed slightly between amber 6 and amber 7, largely to allow bigger molecules, so that the parm7 option is required for AMBER v7 and v8 prmtop files.

Then hit 'Load'.

Next we need to choose which trajectory to load, in this case we will load the 12 angstrom cutoff trajectory file polyAT_vac_md1_12Acut.mdcrd (remember to gunzip the file before use). Select browse again, browse for the file and then under 'determine file type' you need to select "crd". Note: if this were a periodic boundary simulation then we would select "crdbox" since the trajectory file would also contain information on the box size. In this situation, however, there is no box information since this is a non-periodic in vacuo simulation. When you hit "Load" again you should see all of the frames loaded into the main molecule window. 1,000 frames in total.

We can now use the playback tools in the "VMD main" control panel to play our movie:

You should have a go at playing the video of the trajectory. Notice how the DNA holds its secondary structure there is considerable movement in the extremities but the overall structure is preserved.

Is this the correct answer though? Just because a trajectory is stable doesn't necessarily mean it is correct. Is a strand of charged DNA in vacuum really likely to be stable? In solvent such electrostatic repulsions are shielded by the solvent but in vacuum there is no such shielding and no external forces to help hold the chains together. Lets take a look at the trajectory obtained from our no cutoff simulation:

If VMD is already open, close it - or if you know how simply delete the existing molecule.

Run vmd (based on vmd 1.8.1):

vmd

Open our no cut off trajectory file. Select:

File -> New Molecule

Repeat the process as described above. Use the same polyAT_vac.prmtop file but this time select the no cut off trajectory file polyAT_vac_md1_nocut.mdcrd. This time when you hit "Load" VMD should load the trajectory up until the point where the simulation crashed (229 frames of 0.1ps each = 22.9 ps). Have a look at the trajectory, the difference from the last simulation should be obvious. The instability of the DNA dimer is clear.

0 ps	2.5 ps	5 ps
7.5 ps	10 ps	12.5 ps
15 ps	17.5 ps	20 ps
22.5 ps

If we view the DNA dimer from the top during the movie of the MD trajectory it is slightly easier to see what is happening, the large repulsive charge on the two chains is causing them to uncoil and move away from each other:

0 ps	2.5 ps	5 ps
7.5 ps	10 ps	12.5 ps
15 ps	17.5 ps	20 ps
22.5 ps

Here we see that the two chains have, due to the large electrostatic repulsion forces, rapidly separated from each other and have started to drift apart. The simulation stopped when the distance between the two strands exceeded pre-determined parameters within the code.

So, which simulation is the correct one. Well, since this simulation was in vacuo and we had no neutralising ions the conditions did not really represent laboratory conditions. Indeed in this "harsh" environment, with no clustered water or ions, it is likely that the DNA 10-mer is going to be unstable and so the behaviour shown by our no cut off simulation is most likely the closest to reality.

The take home lesson here is that you should think very carefully about what you are simulating, Are you really simulating realistic conditions, how are the parameters you have chosen biasing your results? A cutoff can be a good way to increase the speed of a simulation, but you need to be aware that it can introduce very large artefacts into your simulation. So, think very carefully, and try out several scenarios before you try to reach firm conclusions.

One way to improve considerably on our in vacuo simulations is to make our physical model of DNA much closer to reality, i.e. include explicit neutralising ions and also to include solvent effects, either implicitly within our model or via the use of explicit solvent. This is the subject of the next two stages of our tutorial here.

CLICK HERE TO GO TO SECTION 4