
MOLECULAR DYNAMICS simulations of an ANDROGEN RECEPTOR using GROMACS

System: HUMAN ANDROGEN RECEPTOR

Synopsis: In this tutorial you will perform Molecular Dynamics simulations of the human androgen receptor (PDB entry code 2AM9), solvated in a box of explicit water molecules. Note that today we will work with the receptor without any ligand bound. You will learn how to set up the system and the required files, as well as run the simulation. Finally, a series of analysis tools will be discussed and applied to the trajectories. We will use the simulation package GROMACS.

GROMACS is a versatile package to perform molecular dynamics, i.e. to simulate the Newtonian equations of motion for systems with hundreds to millions of particles. It is primarily designed for biochemical molecules such as proteins, lipids and nucleic acids, which have many complicated bonded interactions. However, since GROMACS is extremely fast at calculating the nonbonded interactions (which usually dominate simulations), many groups also use it for research on non-biological systems, e.g. polymers.

Free download and manual: http://www.gromacs.org

Notes:
- This document gives the command lines (in bold and framed) for the different steps required to set up and run a molecular dynamics simulation, with a brief description of the procedure.
- Some questions are asked throughout the document (in bold and italics) to guide your understanding of what you are doing.
- Structure (*.pdb) and trajectory (*.trr) files will be generated during the practical. Use Chimera to visualize them.
- For any of the utility modules in GROMACS, type name_module -h for a description of its options.
- So that the simulations can be run during the session, the simulation times are very short and not reliable for real studies. At the end of the session you will be provided with a longer simulation to be used in the Analysis part of the tutorial.

1. SETTING UP THE SYSTEM AND FILES


1.1 Pre-processing of the structure. The starting structure file is called 2am9_preproc.pdb. It corresponds to the 2AM9 Protein Data Bank entry but does not contain the ligand. Duplicated residue side chains have also been removed.

1.2 Processing the structure, generating a topology file and choosing the force field. We will use the pdb2gmx command (type pdb2gmx -h to see the command line options) to convert the pre-processed pdb file (2am9_preproc.pdb) into a pdb file ready for GROMACS and a topology file. The -f flag reads the pdb file, -o is for the output structure file, -p is for the output topology file, and -nomerge indicates that, if more than one chain is present in the pdb file, the chains must not be merged. Select the GROMOS96 53a6 force field when prompted (option 4).

pdb2gmx -f 2am9_preproc.pdb -o 2am9_gmx.pdb -p system.top -nomerge

Now we have a structure in the correct format for the chosen force field, and the corresponding topology file. Visualize the structure file generated with Chimera to detect possible errors. Read the topology file and identify the data it contains. How many atoms does the protein have?
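The atom-count question can also be checked with a short script rather than by reading the file by hand. The following is a minimal sketch in Python, which is not part of the original tutorial. It counts the ATOM/HETATM records in PDB-formatted text. The sample records are invented for illustration; in practice you would read 2am9_gmx.pdb.

```python
def count_atoms(pdb_text):
    """Count coordinate records (ATOM/HETATM) in PDB-formatted text."""
    return sum(
        1
        for line in pdb_text.splitlines()
        if line.startswith(("ATOM", "HETATM"))
    )


# Invented sample records, for illustration only.
sample = """\
TITLE     example
ATOM      1  N   PRO A   1      10.000  10.000  10.000  1.00  0.00
ATOM      2  CA  PRO A   1      11.000  10.000  10.000  1.00  0.00
TER
END
"""
print(count_atoms(sample))

# With the real file (assumed to be in the working directory):
# with open("2am9_gmx.pdb") as handle:
#     print(count_atoms(handle.read()))
```

The number obtained this way should agree with the atom count listed in the topology file.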

1.2 Solvate with explicit waters. Now the environment conditions will be defined and the system solvated. First of all, the simulation space has to be specified. We will use an orthogonal box. The box dimensions are chosen so that a minimal distance of 1.0 nm (10 Å) is kept between the protein and the edge of the box.

editconf -f 2am9_gmx.pdb -o definebox.pdb -d 1.0 -princ
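The effect of -d 1.0 on a rectangular box can be pictured with a little arithmetic: each box edge is the extent of the solute along that axis plus the margin on both sides. The Python sketch below is only an illustration under that assumption. The coordinates are invented, and editconf itself also takes care of centering the molecule.

```python
def box_edges(coords_nm, margin_nm=1.0):
    """Edge lengths (nm) of a rectangular box that leaves margin_nm
    between the solute and every box face."""
    edges = []
    for axis in range(3):
        values = [xyz[axis] for xyz in coords_nm]
        edges.append(max(values) - min(values) + 2.0 * margin_nm)
    return edges


# Invented coordinates (nm), for illustration only.
coords = [(0.0, 0.0, 0.0), (4.0, 3.0, 2.5), (2.0, 1.0, 1.0)]
print(box_edges(coords))
```

Note that a 1.0 nm margin on each side means that periodic images of the protein are at least 2.0 nm apart.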

The molecule will be oriented along its principal axes (-princ). Now the solvent water molecules are added (by default the $GMXLIB/spc216.gro water coordinates are used). There are several solvent models; the GROMOS96 force fields are generally used with the simple point charge (SPC) water model. The topology is not required for solvent addition, but it can be passed with -p so that it is updated to include the new water molecules.

genbox -cp definebox.pdb -cs -p system.top -o system_box.pdb

How many atoms does the system have now? How many solvent water molecules? Has the topology file changed? Where are the solvent molecules specified? What is the total charge of the system? Visualize the solvated system in Chimera.
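The solvent molecules are listed in the [ molecules ] section at the end of the topology file. As a hedged illustration (Python is not part of the tutorial, and the sample topology fragment and its numbers are invented), that section can be read and the SOL entries summed as follows:

```python
def read_molecules(top_text):
    """Return (name, count) pairs from the [ molecules ] section."""
    entries = []
    in_section = False
    for raw in top_text.splitlines():
        line = raw.split(";", 1)[0].strip()  # drop ';' comments
        if not line:
            continue
        if line.startswith("["):
            in_section = line.strip("[] ").lower() == "molecules"
            continue
        if in_section:
            name, count = line.split()[:2]
            entries.append((name, int(count)))
    return entries


# Invented topology fragment, for illustration only.
sample_top = """\
[ system ]
Protein in water

[ molecules ]
; Compound        #mols
Protein_A           1
SOL               120
SOL              9000
"""
mols = read_molecules(sample_top)
waters = sum(count for name, count in mols if name == "SOL")
print(mols, waters)
```

In this made-up example there are two SOL lines, as happens when crystallographic waters are listed separately from the waters added by genbox.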

1.3 Energy Minimization. A few energy minimization steps are run on this structure in order to relax possible clashes present in the crystal structure or generated when adding the hydrogen atoms and the solvent molecules. To run a simulation with GROMACS (energy minimization or molecular dynamics) two steps are necessary:

a) First, the structure and topology are combined into a single description of the system, together with a number of control parameters (minimization.mdp, the file containing the options of the simulation we want to run). This is done with grompp.

grompp -v -f minimization.mdp -c system_box.pdb -p system.top -o minimization.tpr

Have a look at the contents of the minimization.mdp file. Note the integrator chosen. These would not be the most adequate control parameters if we wanted to fully minimize the system, but as we only intend to relax strains, they are sufficient. Notice the loose convergence criterion used. Also, for the non-bonded interactions, cut-offs of ~14 Å (1.4 nm) are recommended for the force field used here. However, larger cut-offs mean longer computational times and, for the present practical, they are set to 0.9 nm.

b) The file generated in the previous step (minimization.tpr) can now be used, on its own, as the input file for the run.

mdrun -v -s minimization.tpr -o minimization.trr -c minimization.pdb -e minimization.edr

Which method was used for energy minimization? How many steps were specified and how many steps did it take? What was allowed to freely move during the minimization? Visualize the minimized structure in Chimera and compare it to the previous structure.

; MINIMIZATION.MDP
; Example of energy minimization options in GROMACS
; Everything following ';' is a comment

title           = Energy Minimization with PME ; Title of run

; The following line tells the program the standard locations where to find certain files
cpp             = /usr/bin/cpp  ; Preprocessor ("which cpp" to find it in your machine)

; Define can be used to control processes
define          = -DFLEXIBLE
constraints     = none          ; Bond types to replace by constraints

; Parameters describing what to do, when to stop and what to save
integrator      = steep         ; Algorithm (steep = steepest descent minimization)
emtol           = 1000.0        ; Stop minimization when the maximum force < 1000.0 kJ/mol/nm
emstep          = 0.01          ; Initial step size (in nm)
nsteps          = 500           ; Maximum number of (minimization) steps to perform
nstenergy       = 1             ; Write energies to disk every nstenergy steps
energygrps      = System        ; Which energy group(s) to write to disk

; Parameters describing how to find the neighbors of each atom and how to calculate the interactions
ns_type         = grid          ; Method to determine neighbor list (simple, grid)
rlist           = 0.9           ; Cut-off distance for short-range neighbor list
coulombtype     = PME           ; Treatment of long range electrostatic interactions
rcoulomb        = 0.9           ; long range electr. cut-off, desirable value 1.4 nm
rvdw            = 0.9           ; long range Van der Waals cut-off, desirable 1.4 nm
fourierspacing  = 0.12          ; Parameters related to PME
optimize_fft    = yes           ; Parameters related to PME
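To see how emtol, emstep and nsteps interact, here is a minimal Python sketch of a steepest-descent loop with an adaptive step size. It loosely follows the scheme described in the GROMACS manual (grow the step after an accepted move, halve it after a rejected one), but it is a toy on an invented two-coordinate harmonic system in arbitrary units, not the GROMACS implementation.

```python
def steepest_descent(x, energy, force, emtol, emstep, nsteps):
    """Toy steepest descent: move along the force, scaled so that the
    largest displacement equals the current step size."""
    step = emstep
    e_old = energy(x)
    for n in range(nsteps):
        f = force(x)
        f_max = max(abs(fi) for fi in f)
        if f_max < emtol:
            return x, n, True  # converged: maximum force below emtol
        trial = [xi + step * fi / f_max for xi, fi in zip(x, f)]
        e_new = energy(trial)
        if e_new < e_old:  # accept the move and grow the step
            x, e_old, step = trial, e_new, step * 1.2
        else:  # reject the move and shrink the step
            step *= 0.5
    return x, nsteps, False  # step limit reached without converging


K = 100.0  # invented spring constant, arbitrary units


def energy(x):
    return 0.5 * K * sum(xi * xi for xi in x)


def force(x):
    return [-K * xi for xi in x]


x_min, steps, converged = steepest_descent(
    [1.0, -0.5], energy, force, emtol=1.0, emstep=0.01, nsteps=500
)
print(steps, converged, x_min)
```

A larger emtol stops the loop earlier, which is the sense in which the criterion in the listing above is loose.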

1.4 Neutralizing the system. In order to neutralize the system, counterions will be added to the box (Cl- if the system has a positive total charge, Na+ if it has a negative one). One could also add ions up to a certain concentration. We will replace water molecules by the number of ions required to neutralize the system. The program will ask you to specify the group of solvent molecules from which to extract the waters to be replaced (select SOL, option 12). Again, we first need to combine the structure and the topology files:

grompp -v -f minimization.mdp -c minimization.pdb -p system.top -o add_ions.tpr

genion -s add_ions.tpr -o added_ions.pdb -nname CL- -nn write_the_number -g added_ions.log
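The number passed to -nn follows directly from the total charge of the system. The Python sketch below is a small illustration of that bookkeeping. The name CL- is the one used in this tutorial, while NA+ is an assumption that should be checked against the ions.itp file of your force field. The example charges are invented.

```python
def counterions(total_charge):
    """Return (ion_name, count) needed to neutralize the system.
    The charge is rounded because the total charge may be reported
    with small numerical noise (e.g. 2.999999)."""
    q = round(total_charge)
    if q > 0:
        return ("CL-", q)
    if q < 0:
        return ("NA+", -q)  # assumed name; check ions.itp
    return (None, 0)


print(counterions(2.999999))  # invented example charge
print(counterions(-2.0))
```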

IMPORTANT NOTE: The topology file is not updated automatically after the water molecules are replaced with chloride ions. This has to be done manually. The atom/molecule nomenclature can change from one force field to another (see the ions.itp file).

Edit the topology file and decrease the number of water molecules in the second SOL segment by the number of ions added. Also add a line specifying the number of chloride ions added (CL-). Check that the #include ions.itp line is present in the topology file.
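The manual edit described above can also be scripted. The following Python sketch is a hedged illustration, not a GROMACS tool: it assumes that [ molecules ] is the last section of the topology and that the waters to be replaced belong to the last SOL entry. The sample fragment is invented.

```python
def add_ions_to_topology(top_text, n_ions, ion_name="CL-"):
    """Subtract n_ions from the last SOL entry in [ molecules ] and
    append a line for the ions. Returns the edited topology text."""
    lines = top_text.splitlines()
    while lines and not lines[-1].strip():
        lines.pop()  # drop trailing blank lines
    in_molecules = False
    last_sol = None
    for i, raw in enumerate(lines):
        line = raw.split(";", 1)[0].strip()
        if line.startswith("["):
            in_molecules = line.strip("[] ").lower() == "molecules"
        elif in_molecules and line.split()[:1] == ["SOL"]:
            last_sol = i
    if last_sol is None:
        raise ValueError("no SOL entry found in [ molecules ]")
    name, count = lines[last_sol].split()[:2]
    remaining = int(count) - n_ions
    if remaining < 0:
        raise ValueError("more ions than water molecules")
    lines[last_sol] = f"{name:<15}{remaining:>8}"
    lines.append(f"{ion_name:<15}{n_ions:>8}")
    return "\n".join(lines) + "\n"


# Invented fragment, for illustration only.
sample_top = "[ molecules ]\nProtein_A 1\nSOL 120\nSOL 9000\n"
print(add_ions_to_topology(sample_top, 3), end="")
```

The ion line is appended last because the ions are expected to come after the water molecules in the structure file; the order of [ molecules ] must match the order of the coordinates.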

Now the entire solvated system is again submitted to some relaxation steps. As before, first use grompp to combine structure, topology and control parameters, and then launch mdrun to actually run the minimization job.

grompp -v -f minimization.mdp -c added_ions.pdb -p system.top -o minimization2.tpr

mdrun -v -deffnm minimization2 -c minimization2.pdb

1.5 Equilibration of solvent water. The last step before starting the production molecular dynamics simulation is itself a molecular dynamics simulation, but with positional restraints. We will restrain the positions of the protein atoms (define = -DPOSRES) while allowing the water molecules to move freely, so that the added waters can settle around the protein. The positional restraints are read from the posre.itp file that was created by default by the pdb2gmx command. Moreover, we will apply the LINCS (Linear Constraint Solver) algorithm (constraints = all-bonds) to fix all bond lengths in the system (important for dt > 1 fs). This is commonly done to save computational time, since it permits a longer integration step. NOTE: Here we will run only 5 ps of restrained MD. In a real simulation the equilibration of the water would run for 20-200 ps.

grompp -f MD_pr.mdp -c minimization2.pdb -p system.top -o md_pr.tpr -maxwarn 5

You could try running in parallel (here on 2 processors): mpirun -np 2 mdrun -s ... (the job takes around 3 minutes in serial)

mdrun -s md_pr.tpr -o md_pr.trr -c md_pr.pdb -g md_pr.log -e md_pr.edr &
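A positional restraint can be pictured as a harmonic spring that ties each restrained atom to its reference position. The Python sketch below computes that penalty for one atom. The force constant of 1000 kJ/mol/nm^2 per axis is an assumption about what pdb2gmx writes by default; check the values in your own posre.itp. The coordinates are invented.

```python
def posres_energy(pos, ref, fc=(1000.0, 1000.0, 1000.0)):
    """Harmonic position-restraint energy (kJ/mol) for one atom.
    pos, ref: (x, y, z) in nm; fc: force constants in kJ/mol/nm^2
    (assumed values, see posre.itp)."""
    return sum(0.5 * k * (p - r) ** 2 for k, p, r in zip(fc, pos, ref))


# An atom displaced by 0.05 nm along x from its reference position.
print(posres_energy((1.05, 2.00, 3.00), (1.00, 2.00, 3.00)))
```

Because the penalty grows quadratically with the displacement, the protein stays close to the minimized structure while the unrestrained water relaxes around it.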

; MD_pr.mdp
; Example of position-restrained MD in GROMACS
; Everything following ';' is a comment

title           = ANDROGEN RECEPTOR with water
warnings        = 10
cpp             = /usr/bin/cpp

; Apply restraints
define          = -DPOSRES      ; read force constants from posre.itp and apply positional restraints to protein atoms
constraints     = all-bonds     ; LINCS to all bonds

; Run
integrator      = md            ; Molecular Dynamics run
dt              = 0.002         ; integration step, in ps
nsteps          = 2500          ; total number of steps: 5 ps (in a real simulation, at least 20 ps)
nstxout         = 250           ; save coordinates every 0.5 ps (in .trr)
nstlist         = 5             ; frequency for updating the non-bonding list
ns_type         = grid
rlist           = 0.9
coulombtype     = PME           ; Particle Mesh Ewald
rcoulomb        = 0.9
rvdw            = 0.9
fourierspacing  = 0.12
fourier_nx      = 0
fourier_ny      = 0
fourier_nz      = 0
pme_order       = 4
ewald_rtol      = 1e-5
optimize_fft    = yes

; Berendsen temperature coupling is on in three groups
Tcoupl          = berendsen     ; thermostat type
tau_t           = 0.1 0.1 0.1   ; time constant for the T coupling (in ps); one value per tc-group (same order)
tc-grps         = protein SOL CL- ; the groups are listed
ref_t           = 300 300 300   ; reference temperature, i.e. T of the MD

; Pressure coupling is on
Pcoupl          = berendsen     ; barostat to control the simulation pressure
tau_p           = 0.5           ; time constant for coupling (in ps)
compressibility = 4.5e-5        ; value for water at 300 K and 1 atm
ref_p           = 1.0           ; reference pressure for the coupling (in bar)

; Generate velocities is on at 300 K.
gen_vel         = yes
gen_temp        = 300.0
gen_seed        = 8378922       ; seed number for the random generation of velocities
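The role of tau_t in the Berendsen thermostat can be illustrated with its velocity scaling factor, lambda = sqrt(1 + (dt/tau_t) (T_ref/T - 1)). The Python sketch below applies this factor repeatedly to a temperature value. It is a simplification that ignores the actual dynamics, and the starting temperature is invented.

```python
import math


def berendsen_lambda(t_current, t_ref, dt, tau_t):
    """Velocity scaling factor of the Berendsen weak-coupling thermostat."""
    return math.sqrt(1.0 + (dt / tau_t) * (t_ref / t_current - 1.0))


# dt = 0.002 ps, tau_t = 0.1 ps and ref_t = 300 K as in the listing.
temperature = 280.0  # invented starting temperature (K)
for step in range(5):
    lam = berendsen_lambda(temperature, 300.0, 0.002, 0.1)
    temperature *= lam ** 2  # kinetic temperature scales with v^2
    print(step, round(temperature, 3))
```

With these values each step removes 2% of the remaining deviation from the reference temperature. A larger tau_t gives weaker coupling.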

To which atoms are the positional restraints applied? How many steps would be required to simulate 100 ps?
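The relation between simulated time, time step and number of steps is simply nsteps = time / dt. As a quick worked example in Python (the function name is invented for illustration):

```python
def steps_for(time_ps, dt_ps=0.002):
    """Number of integration steps needed to cover time_ps."""
    return round(time_ps / dt_ps)


print(steps_for(5))    # the restrained run in this tutorial
print(steps_for(100))
```

The rounding guards against floating-point noise in the division.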

2. RUNNING THE MD SIMULATIONS


Once the solvent has equilibrated, we can start the molecular dynamics simulation for the whole system, that is, all the system is allowed to move. Notice that constraints are still applied to all bonds. This is usually used, at least for the bonds involving hydrogen atoms, as it permits using longer integration steps and, thus, it saves computational time.

; MD.mdp
; Example of MD in GROMACS. Bonds are constrained with LINCS.

title           = ANDROGEN RECEPTOR with water
cpp             = /usr/bin/cpp

; Apply constraints
constraints     = all-bonds     ; LINCS to all bonds

; Run
integrator      = md            ; Molecular Dynamics run
dt              = 0.002         ; integration step, in ps
nsteps          = 10000         ; total number of steps: 20 ps (in a real simulation it depends on the application, from 500 ps to several ns to hundreds of ns)
nstxout         = 500           ; save coordinates every 500 steps (in .trr)
nstlist         = 10            ; frequency for updating the non-bonding list
ns_type         = grid
rlist           = 0.9
coulombtype     = PME           ; Particle Mesh Ewald
rcoulomb        = 0.9
rvdw            = 0.9
fourierspacing  = 0.12
fourier_nx      = 0
fourier_ny      = 0
fourier_nz      = 0
pme_order       = 4
ewald_rtol      = 1e-5
optimize_fft    = yes

; Berendsen temperature coupling is on in three groups
Tcoupl          = berendsen     ; thermostat type
tau_t           = 0.1 0.1 0.1   ; time constant for the T coupling (in ps); one value per tc-group (same order)
tc-grps         = protein SOL CL- ; the groups are listed
ref_t           = 300 300 300   ; reference temperature, i.e. T of the MD

; Pressure coupling is on
Pcoupl          = berendsen     ; barostat to control the simulation pressure
tau_p           = 0.5           ; time constant for coupling (in ps)
compressibility = 4.5e-5        ; value for water at 300 K and 1 atm
ref_p           = 1.0           ; reference pressure for the coupling (in bar)

; Generate velocities is on at 300 K.
gen_vel         = yes
gen_temp        = 300.0
gen_seed        = 392811        ; seed number for the random generation of velocities

As before, we do the two steps required:

grompp -f MD.mdp -c md_pr.pdb -p system.top -o md1.tpr

You could try running in parallel (here on 2 processors): mpirun -np 2 mdrun -s ... (the job takes ~24 minutes in serial)

mdrun -s md1.tpr -o md1.trr -c md1.pdb -g md1.log -e md1.edr &

What is the simulation time? How often will the structure be saved? What are the simulated Temperature and Pressure? Usually, a molecular dynamics simulation includes three parts: heating, equilibration and production run. How is this achieved here?
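Questions such as the total simulated time and the interval between saved frames can be read directly off the .mdp parameters. The Python sketch below parses "key = value ; comment" lines and does the arithmetic. It is a simplified reader written for this illustration, not a GROMACS utility, and the embedded text repeats only a few lines of MD.mdp.

```python
def parse_mdp(text):
    """Parse 'key = value ; comment' lines into a dict of strings."""
    params = {}
    for raw in text.splitlines():
        line = raw.split(";", 1)[0].strip()  # drop ';' comments
        if "=" in line:
            key, value = line.split("=", 1)
            params[key.strip().lower().replace("-", "_")] = value.strip()
    return params


mdp = """\
integrator = md      ; Molecular Dynamics run
dt         = 0.002   ; integration step, in ps
nsteps     = 10000
nstxout    = 500
ref_t      = 300 300 300
ref_p      = 1.0
"""
p = parse_mdp(mdp)
dt = float(p["dt"])
total_ps = int(p["nsteps"]) * dt
frame_ps = int(p["nstxout"]) * dt
n_frames = int(p["nsteps"]) // int(p["nstxout"]) + 1  # includes step 0
print(total_ps, frame_ps, n_frames)
```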

To make the simulation longer you can run another bit by simply doing:

grompp -f MD.mdp -c md1.pdb -p system.top -o md2.tpr

mdrun -s md2.tpr -o md2.trr -c md2.pdb -g md2.log -e md2.edr &

3. ANALYSIS OF THE MD SIMULATIONS


The first thing to do after running a molecular dynamics simulation is to check that the simulation worked properly and that the simulation conditions were achieved. To this end, you will start by monitoring the temperature and pressure, studying the structural equilibration through calculation of the RMSD (root mean square deviation), analyzing the behavior of the different energy terms (total energy, potential energy and kinetic energy), etc. Notice that some of these tasks can be done with the GROMACS utilities as well as with Chimera. Here are some useful analysis tools from GROMACS. Use them (and others if required) to analyze the trajectory.

# Examples of energy analysis: the energy term to analyze is selected in a menu
g_energy -f md1.edr -s md1.tpr -o T_md1.xvg
g_energy -f md1.edr -s md1.tpr -o Etot_md1.xvg

# Concatenate several trajectory files:
echo c c | trjcat -f md1.trr md2.trr -o full_md.trr -cat -settime

# trjconv is used to manipulate trajectory files: change format, extract the trajectory for one group
# align a trajectory to a reference structure
trjconv -f md1.trr -o md1_fit.trr -s md_pr.pdb -fit rot+trans
# get one snapshot out of the trajectory file. Here it gets the pdb at 340 ps
trjconv -f md1.trr -s md1.tpr -o MD_340.pdb -b 340 -e 341

# RMSD calculation:
g_rms -f md1.trr -o rmsd_md1.xvg -s md_pr.pdb -fit rot+trans

# Geometrical analysis: (monitoring a distance, a dihedral angle)
g_angle -f md1_fit.trr -n dihedre.ndx -type dihedral -of dihfrac.xvg -oc dihcorr.xvg -oh trhisto.xvg -ot dihtrans.xvg -od angdist.xvg

# Structural clustering:
g_cluster -f md1_fit.trr -s md_pr.pdb -cl cluster_12.pdb -g cluster_12.log -dist rmsd_dist_12.xvg -cutoff 0.12 -noav -method gromos -o rmsd-clust_12.xpm -sz clust-size_12.xvg -clid clust-id_12.xvg -ev rmsd-eig_12.xvg -n index_clustering.ndx
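The .xvg files written by these tools are plain text: header lines start with '#' or '@', and the data follow as whitespace-separated columns. As a hedged sketch (Python is not part of the tutorial, and the sample data are invented), such a file can be read and a column averaged like this:

```python
def read_xvg(text):
    """Return rows of floats from .xvg text, skipping '#' comments
    and '@' plot-formatting lines."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line[0] in "#@":
            continue
        rows.append([float(v) for v in line.split()])
    return rows


# Invented data, for illustration only.
sample = """\
# invented example
@    title "Temperature"
@    xaxis  label "Time (ps)"
0.0   298.5
1.0   301.5
2.0   300.0
"""
rows = read_xvg(sample)
temps = [row[1] for row in rows]
mean_t = sum(temps) / len(temps)
print(len(rows), mean_t)

# With a real file (assumed to exist in the working directory):
# with open("T_md1.xvg") as handle:
#     rows = read_xvg(handle.read())
```

Comparing the average with ref_t is a quick check that the target temperature was reached.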

You have seen that many analysis tools in GROMACS allow you to select parts of the structure in order to perform the analysis on those parts only. Often a menu lets the user choose the selection easily (Calpha, protein, protein-H, backbone, sidechains...). However, sometimes we may want just a selection of residues, one chain, 'selection within 10.0 Å of ligand', a dihedral angle... In those cases an index.ndx file can be used, which lists the atom numbers of the atoms selected for the analysis.

For example, a dihedre.ndx file for selecting dihedral angles would look like:

[SER115]
1139 1138 1137 1135
[SER116]
1147 1146 1145 1143
[SER118]
1172 1171 1170 116

There is a program called make_ndx which facilitates the preparation of this index.ndx file. Example of usage:

make_ndx -f md1.pdb -o index.ndx

A menu permits selection of the 'group' you want to include in the index.ndx output file. The output can contain several 'groups'. When the index.ndx file is read by an analysis tool, the tool will ask which of the groups in the file you are interested in. You can edit the index.ndx file and add the selection of atoms you want. Notice that the required items are a title for the group and the list of the atom numbers included in the group.
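Since an index group is just a title followed by a list of atom numbers, such a file can also be generated with a few lines of code. The Python sketch below is a hypothetical helper, not a GROMACS tool. It formats groups in that layout, using the SER115 and SER116 atom numbers from the dihedre.ndx example.

```python
def format_ndx(groups, per_line=15):
    """Format {title: [atom numbers]} as index-file text: a bracketed
    title line followed by the atom numbers of the group."""
    out = []
    for title, atoms in groups.items():
        out.append(f"[ {title} ]")
        for i in range(0, len(atoms), per_line):
            out.append(" ".join(str(a) for a in atoms[i:i + per_line]))
    return "\n".join(out) + "\n"


groups = {
    "SER115": [1139, 1138, 1137, 1135],
    "SER116": [1147, 1146, 1145, 1143],
}
print(format_ndx(groups), end="")

# To write the file (the file name is only an example):
# with open("dihedre.ndx", "w") as handle:
#     handle.write(format_ndx(groups))
```

For a dihedral group, every consecutive set of four atom numbers defines one angle, so the order of the atoms matters.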
