Study of A Highly Accurate and Fast Protein-Ligand Docking Algorithm

Study of a Highly Accurate and Fast Protein-Ligand Docking Algorithm Based on Molecular Dynamics
M. Taufer 1,2,3, M. Crowley 2, D. Price 2, A.A. Chien 1, C.L. Brooks III 2,3
1 Dept. of Computer Science and Engineering, University of California at San Diego, La Jolla, California 92093, U.S., achien@ucsd.edu
2 Dept. of Molecular Biology (TPC6), The Scripps Research Institute, La Jolla, California 92037, U.S., taufer,crowley,priced,brooks@scripps.edu
3 Center for Theoretical Biological Physics, La Jolla, California 92093, U.S.
Abstract
Few methods use molecular dynamics simulations based on atomically detailed force fields to study the protein-ligand docking process because they are considered too time demanding despite their accuracy. In this paper we present a docking algorithm based on molecular dynamics simulations which has a highly flexible computational granularity. We compare the accuracy and the time required with well-known, commonly used docking methods like AutoDock, DOCK, FlexX, ICM, and GOLD. We show that our algorithm is accurate, fast and, because of its flexibility, applicable even to loosely coupled distributed systems like desktop grids for docking.
Keywords: Force field based methods, docking accuracy, desktop grid computing.
1 Introduction
A vast number of the essential roles that proteins play require small molecules to bind to specific spots in the protein structure. For instance, the small molecules can act as switches to turn on or off a protein function, or are the substrates for the particular chemical reaction that a protein enzyme catalyzes. Obtaining the atomic level details of the protein-ligand interactions is a valuable tool in the development of novel pharmaceuticals. Conventional experimental techniques for obtaining detailed structural information about protein-ligand complexes are time and resource intensive. As a result, much research effort has been focused on computational methods for the prediction of this difficult-to-obtain structural information. In general, this process is called docking.
Current docking algorithms typically use a fast, simplified scoring function to direct the conformational search and select the best structures. However, recent work has demonstrated that there are significant inaccuracies associated with these algorithms [1]. Furthermore, there are indications that inaccuracies can be reduced by using algorithms that use more sophisticated physics. For example, CDOCKER, a docking algorithm based on molecular dynamics (MD) and a conventional molecular mechanics force field, indeed provides better accuracy than other methods. However, it is still among the more compute-intensive methods.
Our aim is to adapt the CDOCKER method to improve its performance without sacrificing the accuracy. In particular, we would like to take advantage of advances in computer technologies and the establishment of new distributed architectures, such as desktop grids. Desktop grids provide a viable and inexpensive solution to the hitherto uncompetitive computational cost and time for the force field methods. New algorithms with finer computational granularity need to be developed, especially algorithms more suitable for the highly volatile nodes of desktop grids. In this paper we present an algorithm for the docking process based on MD simulations as in [2], but characterized by a higher flexibility that makes it adaptable to any computing platform, even to very challenging desktop grids.
Docking many ligands to the same protein followed by scoring them for their relative strength of interaction has been proposed as a procedure to identify candidates for drug development. Screening large databases of compounds in this manner can potentially provide an alternative to conventional high-throughput screening, but it is not cost-effective unless the docking algorithm is fast and accurate. Most of the well-known, commonly used docking methods that are not based on MD were compared and analyzed in detail in [3]. We validate the accuracy of our algorithm by applying the tests defined in [3] to our method, and compare to the published results for the following other methods: AutoDock [4], DOCK [5], FlexX [6], ICM [7], and GOLD [8]. We show that our algorithm is indeed more accurate than all other methods except ICM. We reach a docking success rate of over 70%, confirming the accuracy reported in [2]. The time required for running a complete docking attempt is longer but comparable with the time of the other methods. Our algorithm has a fine computational granularity and is trivially parallel: each docking attempt is decomposable into independent sub-jobs. This flexibility makes our accurate docking simulations fast when there are many independent compute nodes, and thus applicable to a wide range of platforms from traditional supercomputers to loosely coupled distributed systems like desktop grids.
In Section 2 we present our docking protocol as well as some well-known and commonly used docking algorithms that we will compare with our method. In Section 3 we define the metrics of accuracy and time that we use to validate our method, while in Section 4 we compare our method with the other docking algorithms based on those metrics. Finally, in Section 5, we discuss the applicability of our docking protocol to desktop grids, and future work coming out of this research.
2 The MD-based Docking Algorithm
2.1 The CHARMM Scientific Computation Code
We use CHARMM to perform MD simulations and investigate the protein-ligand docking process. CHARMM is a program for simulating biologically relevant macromolecules (proteins, DNA, RNA) and complexes thereof [9]. It allows the investigation of the structure and dynamics of large molecules in solvent or crystals. CHARMM can be used to calculate free energy differences upon mutations or ligand binding [10]. One of the most common applications of CHARMM is MD, in which the Newtonian equations of motion are discretized and solved numerically by an integration procedure such as the Verlet algorithm. The force on the atoms is the negative gradient of the CHARMM potential energy function [11].
2.2 Modeling Protein-Ligand Interactions
Advances in energy calculation techniques make it viable to use a grid-based representation of the protein-ligand potential interactions to calculate our scoring function. A grid potential allows us to represent a rigid protein binding site as a potential field magnitude at each grid point. Protein interactions with the ligand in the binding site are interpolated from the interaction strength of the grid points near each atom of the ligand, rather than from computing the interactions of all ligand atoms with all protein atoms individually, resulting in orders of magnitude fewer floating-point computations than in the traditional molecular mechanics method.
In a preliminary phase of the docking simulations, we calculate three-dimensional grid maps for each of the 20 potential atom types composing the ligands under investigation. Each grid map consists of a three-dimensional lattice of regularly spaced points surrounding and centered on the active site of a protein. Each point within the grid map stores the potential energy of a 'probe' atom due to its interaction with the macromolecule. For example, in a carbon grid map, the value associated with a grid point represents the potential energy of a carbon atom at that location due to its interactions with all atoms of the protein receptor. We have chosen a grid spacing of 1 Å based on previous work that showed no significant differences in docking accuracy for grid spacings between 0.25 Å and 1 Å [2].
To facilitate the penetration of small ligands into the protein sites and allow larger configurational changes, van der Waals (vdW) and electrostatic potentials with soft core repulsions [12] were utilized instead of the traditional potentials. A soft core repulsion reduces the potential barrier at vanishing interatomic distances to a finite limit. In this case, ligands can pass between conformational minima with a relatively small potential barrier that would normally be infinite and impassable with an unmodified potential.
2.3 MD Docking Protocol
As in most of the existing methods, we model the protein-ligand complex as composed of a rigid protein structure and a flexible ligand. A flexible ligand has three translational degrees of freedom, three rotational degrees of freedom and one dihedral rotation for each rotatable bond. The docking search is computed over a 6 + n dimensional space, where n is the number of rotatable bonds in the ligand.
Figure 1 shows the MD-based algorithm used for our docking simulations. One loop constitutes a docking trial. Given a protein and a ligand to dock into its binding site (a so-called protein-ligand complex), a docking attempt consists of a sequence of independent trials. For each trial, a random configuration for the ligand is generated by running 1000 steps of MD at the constant temperature of 1000K in vacuum, starting from a reasonable structure with random initial velocities on each ligand atom. We have analyzed the distribution of torsional angles generated by this method, and found that they indeed vary randomly over the physically reasonable range for each rotatable bond. We are confident that our initial configurations randomly sample the available configurational space of the ligand. Starting from the new ligand configuration, a set of 10 different orientations is randomly generated and docked into the receptor, that is, moved into the center of the grid. Once the randomized ligand has been docked into the active protein site, we run an MD simulation consisting of a heating phase from 300K to 700K, followed by a cooling phase back to 300K. Finally, we refine the simulation result by running a short energy minimization. In the end, we use the energy of binding as the scoring function to rank the docked ligands and return the lowest energy structure as the solution to the docking trial. Twenty trials were run for each complex to ascertain the optimal number of trials that should constitute an attempt at docking.
2.4 The Other Docking Methods
Figure 1. Our MD-based protein-ligand docking algorithm.
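The loop structure summarized in Figure 1 and Section 2.3 can be sketched as follows. This is only an illustration of the control flow (attempt, trials, ten orientations per trial, lowest energy wins), not the CHARMM-based implementation: `toy_energy`, `randomize_conformation` and `anneal_and_minimize` are hypothetical stand-ins, and the "annealing" here is a greedy random search on a toy score rather than MD heating and cooling.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class Pose:
    torsions: list      # one dihedral angle per rotatable bond (radians)
    orientation: list   # three rigid-body rotation angles (radians)
    energy: float       # toy score; lower is better


def toy_energy(torsions, orientation):
    # Hypothetical smooth score with its minimum at all-zero angles.
    return sum(1.0 - math.cos(a) for a in torsions + orientation)


def randomize_conformation(n_rotatable, rng):
    # Stand-in for the 1000 steps of MD at 1000 K in vacuum.
    return [rng.uniform(-math.pi, math.pi) for _ in range(n_rotatable)]


def anneal_and_minimize(torsions, orientation, rng, steps=200):
    # Stand-in for heating (300 K -> 700 K), cooling (-> 300 K) and the
    # final minimization: random perturbations, kept only if they lower
    # the toy score.
    best = Pose(list(torsions), list(orientation),
                toy_energy(torsions, orientation))
    for _ in range(steps):
        t = [a + rng.gauss(0.0, 0.2) for a in best.torsions]
        o = [a + rng.gauss(0.0, 0.2) for a in best.orientation]
        e = toy_energy(t, o)
        if e < best.energy:
            best = Pose(t, o, e)
    return best


def docking_trial(n_rotatable, rng, n_orientations=10):
    # One loop of Figure 1: randomize the ligand, dock 10 random
    # orientations, return the lowest-energy structure.
    torsions = randomize_conformation(n_rotatable, rng)
    poses = []
    for _ in range(n_orientations):
        orientation = [rng.uniform(-math.pi, math.pi) for _ in range(3)]
        poses.append(anneal_and_minimize(torsions, orientation, rng))
    return min(poses, key=lambda p: p.energy)


def docking_attempt(n_rotatable, n_trials, seed=0):
    # An attempt is a set of independent trials; keep the best one.
    rng = random.Random(seed)
    trials = [docking_trial(n_rotatable, rng) for _ in range(n_trials)]
    return min(trials, key=lambda p: p.energy)
```

Because the trials share no state other than the random number generator, each call to `docking_trial` could equally be dispatched to a separate compute node.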
Common search techniques for predicting binding affinities and geometries are based on genetic algorithms, chemistry, geometry of atoms, Monte Carlo or MD. Selection of the best docked structures is performed using scoring functions belonging to three different categories: explicit force field scoring functions (as in our case), empirical scoring functions, or knowledge-based scoring functions. AutoDock, DOCK, FlexX, ICM and GOLD are well-known, commonly used programs which use a variety of search methods and scoring functions to address the study of protein-ligand docking.
AutoDock [4] uses the Lamarckian genetic algorithm (LGA) by alternating local search with selection and crossover. The ligands are ranked using an energy-based scoring function and, to speed up the score calculation, a grid-based protein-ligand interaction is used. GOLD [8], like AutoDock, deploys a genetic algorithm and uses a scoring function which is the sum of energy terms, some of which reflect the short-range vdW interaction between protein and ligand as well as the ligand internal energy. The search in DOCK [5] is driven by the geometry of the ligand in the active site. Different scoring functions can be employed: (1) geometric alignment and shape constraints, (2) the electrostatic potential of the protein-ligand complex using the program DELPHI, or (3) the energy of the protein-ligand complex under the AMBER force field. FlexX [6] is also driven by the geometry of the ligand in the active site, like DOCK. In FlexX, the scoring uses a variation of the Böhm scoring function with terms for several kinds of interactions and penalty functions for the deviations from ideal interaction geometries. ICM [7] uses a Monte Carlo minimization on the internal coordinates to find the global minimum of the scoring function. The scoring function used to rank placements of ligands relative to one another takes into account the force field energy of the ligand and the protein-ligand interaction energy.
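As an illustration of the grid-based scoring with soft core repulsion described in Section 2.2, the sketch below precomputes one grid map for a single probe atom type and then scores a ligand by trilinear interpolation. Several things are assumptions made for the example: the soft core form (a Lennard-Jones term with r^6 shifted by a constant, which is not necessarily the form used in [12]), the parameter values, the use of trilinear interpolation, and all function names. The method described above uses one map per atom type (20 in total) and also includes electrostatics.

```python
import math


def soft_core_lj(r, epsilon=0.1, sigma=3.5, delta=2.0):
    # Lennard-Jones with r**6 shifted by delta**6, so the energy stays
    # finite as r -> 0 (illustrative form and parameters).
    s6 = sigma ** 6 / (r ** 6 + delta ** 6)
    return 4.0 * epsilon * (s6 * s6 - s6)


def build_grid(protein_atoms, origin, n, spacing=1.0):
    # grid[i][j][k] = energy of a probe atom placed at lattice point
    # (i, j, k), summed over all (rigid) protein atoms.
    grid = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                p = (origin[0] + i * spacing,
                     origin[1] + j * spacing,
                     origin[2] + k * spacing)
                grid[i][j][k] = sum(soft_core_lj(math.dist(p, a))
                                    for a in protein_atoms)
    return grid


def interpolate(grid, origin, spacing, point):
    # Trilinear interpolation from the 8 lattice points around `point`.
    # Points outside the lattice are extrapolated from the edge cell.
    idx, frac = [], []
    for c, o in zip(point, origin):
        u = (c - o) / spacing
        i = min(max(int(math.floor(u)), 0), len(grid) - 2)
        idx.append(i)
        frac.append(u - i)
    (i, j, k), (fx, fy, fz) = idx, frac
    e = 0.0
    for di, wx in ((0, 1.0 - fx), (1, fx)):
        for dj, wy in ((0, 1.0 - fy), (1, fy)):
            for dk, wz in ((0, 1.0 - fz), (1, fz)):
                e += wx * wy * wz * grid[i + di][j + dj][k + dk]
    return e


def ligand_energy(grid, origin, spacing, ligand_atoms):
    # Cost per ligand atom is 8 lookups, independent of protein size.
    return sum(interpolate(grid, origin, spacing, a) for a in ligand_atoms)
```

The grid is built once per protein and atom type; every subsequent energy evaluation during the MD run then costs a fixed number of lookups per ligand atom instead of a sum over all protein atoms.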
3 Metrics
3.1 Accuracy
The accuracy of any given docking attempt is measured by the root-mean-square-deviation (RMSD) of all non-hydrogen ligand atoms between the lowest-energy structure from the docking attempt and the ligand's position in the crystal structure. For many of the ligands studied here, a dihedral rotation can result in a ligand conformation that is geometrically and chemically indistinguishable, but with a different RMSD relative to the experimentally determined structure. That is, the RMSD between a docking attempt and the crystal structure using a one-to-one mapping of atoms may or may not accurately measure the quality of the docking attempt. Consequently, we have exhaustively calculated the RMSD of all degenerate conformers related by the rotation of all symmetry-conferring dihedral angles. The lowest RMSD obtained from this search is guaranteed to be the correct RMSD for the structure.
Reference [3] provides an additional measure describing the frequencies with which high-quality docking solutions are found. For many docking attempts, the docking accuracy (DA) can be defined as follows:
$DA = f_{RMSD \le 2} + 0.5\,(f_{RMSD \le 3} - f_{RMSD \le 2})$    (1)
where $f_{RMSD \le a}$ is the fraction of docking attempts that produce structures with an RMSD relative to the experimental structure of at most a Angstroms.

Protein                 Protein-Ligand Complex PDB Entry
Trypsin                 3ptb(3), 1tng(2), 1tnj(3), 1tnk(4), 1tni(5), 1tpp(7), 1pph(11)
Cytochrome P450cam      1phf(1), 1phg(5), 2cpp(3)
Neuraminidase           1nsc(12), 1nsd(11), 1nnb(11)
Carboxypeptidase        1cbx(5), 3cpa(8), 6cpa(16)
L-Arabinose             1abe(4), 1abf(5), 5abp(6)
ε-Thrombin              1etr(15), 1ets(13), 1ett(11)
Thermolysin             3tmn(10), 5tln(14), 6tmn(20)
Penicillopepsin         1apt(30), 1apu(29)
Intestinal FABP         2ifb(15)
Carbonic Anhydrase II   1cil(6), 1okl(5), 1cnx(13)
Table 1. Data set of the 31 protein-ligand complexes used for our experiments. The number of rotatable bonds for each ligand is reported beside the complex name.

3.2 Computational Time
In order to compare the performance of our docking algorithm with the other methods and study its applicability to different compute platforms, we look at the CPU time required for completing a docking attempt on a single node. In the case of the docking methods reported in [3], the length of a docking attempt was controlled by the default or recommended parameter settings of the specific docking algorithm. For our MD-based method, we consider the time to complete a set of docking trials and report CPU time for sets of 1, 10 and 20 trials.
4 Simulation Results
4.1 Testbed Characterization
All the docking simulations for the methods presented in Section 2.4 were performed on an SGI R10000 equipped with a single 195 MHz IP2 processor and 128MB memory. We use the same machine for the measurement and comparison of the time required for completing a single attempt. For the investigation of the accuracy and, in particular, for the investigation of the optimal number of trials per attempt to reach an acceptable DA, we run our simulations on a cluster of 64 dual-processor nodes at the San Diego Supercomputer Center (SDSC) at UCSD equipped with 930MHz Pentium III processors.
4.2 Characterization of our Docking Simulations
We run our docking simulations on a data set of 31 protein-ligand complexes, all of the complexes used in [3] that are present in the Ligand Protein DataBase (LPDB) [13]. The criteria for choosing the protein-ligand complexes in [3] are that the proteins under investigation have at least two entries with different ligands in the PDB (with the exception of the Intestinal FABP), and that no protein-ligand covalent bonds are present. Table 1 shows the list of the ten proteins and their ligands. The ligands have different structures and numbers of rotatable bonds, ranging from 1 to 30. The number of ligand rotatable bonds is reported next to each complex in the table.
We consider four different cases, each with a different number of MD steps for the heating and cooling phases. Table 2 shows the four cases and the associated number of 1 fsec MD steps. Figure 2 shows the DA of our MD-based method with different numbers of trials per attempt and different lengths for the MD simulation (each case is reported in Table 2). In the figure we label each attempt with Ti, where i is the number of independent trials per protein-ligand docking attempt (i ranges from one to twenty). By looking at the data reported in Figure 2, we conclude that we need about 10 trials per attempt to reach a docking accuracy of 70%.

Case               Heating Phase (# MD steps)   Cooling Phase (# MD steps)
Case A - 1K2.5K    1000                         2500
Case B - 2K5K      2000                         5000
Case C - 4K10K     4000                         10000
Case D - 8K20K     8000                         20000
Table 2. The four different MD simulations, each with a different number of MD steps for the heating and cooling phases.

Figure 2. Docking Accuracy (DA) for different numbers of trials per docking attempt and with different numbers of MD steps per simulation. [Plot: x-axis, number of trials; one curve each for Case A: 1K2.5K, Case B: 2K5K, Case C: 4K10K, Case D: 8K20K.]

Figure 3 shows the average time in seconds per trial for the different numbers of MD steps per simulation reported in Table 2. As expected, the increase of the number of MD steps during the heating and cooling phases causes an almost linear increase of the simulation time.

Figure 3. Average time in seconds per trial and with different numbers of MD steps per simulation. [Plot: A: Case 1K2.5K, B: Case 2K5K, C: Case 4K10K, D: Case 8K20K.]

We have run several experiments with 10 and 20 trials and have confirmed that the results shown in Figure 2 and Figure 3 are repeatable (data not shown). For our comparison in the rest of the paper we use Case B as a reference case, for which we run 2000 MD steps during the heating phase and 5000 MD steps during the cooling phase. Each MD step consists of a 1 fsec time step. The DA, RMSD and time values for the other methods reported in Section 2.4 and used in our comparisons in the rest of this paper are from the previous work of our group [3].
4.3 Comparison of the Docking Accuracy (DA)
Figure 4 compares the DA of the well-known methods with the DA of our MD-based method for Case B, in which each MD simulation consists of 2000 heating MD steps and 5000 cooling MD steps. By looking at the data reported in Figure 4, we observe that our method provides better DA than all the other methods, except ICM. ICM employs an algorithm which improves convergence by using an analytical gradient minimizer and running multiple Monte Carlo minimizations from several starting configurations. We plan to make a more detailed study of MD and Monte Carlo simulations for the docking process in the near future.
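The DA values compared here follow the definition of Section 3.1: the fraction of attempts within 2 Å counts fully, and the additional fraction between 2 Å and 3 Å counts half. A minimal sketch of that computation is given below; the function name and the sample RMSD values are hypothetical and are not results from this study.

```python
def docking_accuracy(rmsds):
    """Docking accuracy (DA) from the best RMSD of each docking attempt.

    DA = f(RMSD <= 2) + 0.5 * (f(RMSD <= 3) - f(RMSD <= 2)),
    where f(...) is the fraction of attempts meeting the threshold
    (RMSD values in Angstroms).
    """
    n = len(rmsds)
    f2 = sum(r <= 2.0 for r in rmsds) / n
    f3 = sum(r <= 3.0 for r in rmsds) / n
    return f2 + 0.5 * (f3 - f2)


# Hypothetical RMSDs for four complexes: two within 2 A, one between
# 2 A and 3 A, one above 3 A.
example = [0.5, 1.8, 2.4, 3.6]
print(docking_accuracy(example))
```

For the sample list, two of four attempts fall within 2 Å and one more within 3 Å, so DA = 0.5 + 0.5 × 0.25 = 0.625.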
Table 3. Comparison of best RMSD for different docking methods. The best RMSD is the RMSD of the predicted ligand from the X-ray structure. For each protein-ligand complex, the best RMSD found is reported in bold. [Table: one row per protein-ligand complex of Table 1; columns: complex, number of rotatable bonds, AutoDock, DOCK, FlexX, ICM, GOLD, T10, T20.]
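The RMSD values in Table 3 rely on the symmetry correction described in Section 3.1: when a dihedral rotation yields a chemically indistinguishable conformer, the lowest RMSD over all equivalent conformers is taken. The sketch below expresses that search as a minimum over atom-index permutations. Representing the symmetry-related conformers as permutations is an assumption made for the example, and the function names and coordinates are hypothetical.

```python
import math


def rmsd(a, b):
    # Root-mean-square deviation between two equally long lists of
    # (x, y, z) heavy-atom coordinates, matched index by index.
    return math.sqrt(sum(math.dist(p, q) ** 2 for p, q in zip(a, b)) / len(a))


def symmetry_corrected_rmsd(predicted, crystal, equivalent_mappings):
    # equivalent_mappings: list of permutations; mapping[i] is the index
    # of the predicted atom that is compared with crystal atom i.
    identity = list(range(len(crystal)))
    candidates = [identity] + list(equivalent_mappings)
    return min(
        rmsd([predicted[m[i]] for i in range(len(crystal))], crystal)
        for m in candidates
    )


# Hypothetical three-atom fragment whose two terminal atoms are
# interchangeable by a 180-degree dihedral rotation.
crystal = [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0), (1.0, -1.0, 0.0)]
predicted = [(0.0, 0.0, 0.0), (1.0, -1.0, 0.0), (1.0, 1.0, 0.0)]
swap_terminal = [0, 2, 1]

print(rmsd(predicted, crystal))
print(symmetry_corrected_rmsd(predicted, crystal, [swap_terminal]))
```

With the one-to-one mapping, the swapped terminal atoms give a large RMSD even though the two structures are chemically identical; with the permutation included, the RMSD drops to zero.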
Figure 4. Comparison of docking accuracy (DA). The docking accuracy is the weighted sum of the fractions of docked attempts with acceptable accuracy (lower or equal to 2 Å and 3 Å). [Plot: axis labels AutoDock, DOCK, FlexX, ICM, GOLD.]

4.4 Comparison of RMSD for the Different Docking Methods
The RMSDs reported in Table 3 are the root mean square deviations of the heavy atoms of the predicted ligands from the corresponding ligands in their published complex crystal structures. For our MD-based docking, we present results of attempts with different numbers of trials: T10 with 10 trials per attempt and T20 with 20 trials per attempt. In general, we observe that for both T10 and T20 we get, on the average, lower RMSD than the other methods.
4.5 Comparison of Simulation Time for the Different Docking Methods
The main question we want to address in Table 4 is whether the high level of accuracy is also supported by competitive execution time when compared with the execution times of the other docking methods. Table 4 shows the average CPU time to complete a protein-ligand docking for the ten proteins in Table 1 and for the different methods under investigation. For our MD-based docking method, we consider the average time of a single trial as well as the time for an attempt of 10 and 20 trials. Again we consider Case B in Table 2 as a reference case. We observe that an attempt of 10 trials is completed in less than one hour even for complex protein-ligand docking with a large number of rotatable degrees of freedom. In addition, each trial of each attempt is independent and, therefore, the 10 trials can run at the same time on different processors in parallel. If enough processors are available, the time for completing a protein-ligand docking becomes the time for a single trial, making our algorithm highly competitive with the other methods.

[Table: one row per protein of Table 1; columns: AutoDock, DOCK, FlexX, ICM, GOLD, T1, T10, T20. The T10 and T20 columns read:]
Protein                 T10    T20
Trypsin                 805    1610
Cytochrome P450cam      846    1693
Neuraminidase           1110   2220
Carboxypeptidase        1156   2313
L-Arabinose             766    1533
ε-Thrombin              2036   4073
Thermolysin             1483   2966
Penicillopepsin         2760   5520
Intestinal FABP         1450   2900
Carbonic Anhydrase II   1070   2140
Table 4. Comparison of average simulation time for different proteins and different docking methods.

5 Computational Platforms for our MD-based Docking
MD simulations are time-consuming but are also accurate general techniques for the study of protein-ligand docking. The time needed by MD-based algorithms to screen large sets of ligands (of the order of 10,000 molecules) makes this approach prohibitive even on expensive supercomputers. On non-dedicated systems, even the docking of a single protein-ligand complex might result in a time-to-solution on the order of hours due to contention for computing resources. The motivation to port existing applications to more cost-effective distributed systems like desktop grids is not strong for such applications unless more time-effective algorithms are designed and implemented. The need for new algorithms that are more flexible and suitable for desktop grids, but still accurate, is the motivation behind our search for the docking algorithm presented in this paper.
Docking attempts of our MD-based algorithm consist of sequences of independent trials. We have observed and measured that attempts for even complex ligands with a large number of rotatable bonds are characterized by short simulation times, much shorter than 1 hour. By decomposing each attempt into sets of independent trials, we can make the computational granularity of the algorithm even finer. Using available desktop PCs simultaneously to process each trial proportionally decreases the time to solution. Long computation tasks, which are more likely to be interrupted by annoyed desktop users, should also be avoided. Our result shows that we can ensure the time to solution to be equal to the time for a single trial when a large number of desktop PCs is available. An acceptable accuracy can be ensured by sending out more trials than are needed for the desired accuracy, and using the first trials to complete. Therefore, we conclude that our docking algorithm is well-suited for Intranet desktop grid platforms (e.g., Entropia DCGrid [14], EnFuzion [15]) as well as Internet platforms (e.g., XtremWeb [16], BOINC [17]). The combination of our algorithm with such platforms, which might allow us to perform fast and accurate screening of very large ligand databases, is currently under our development and investigation.
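The strategy of sending out more trials than needed and using the first ones to complete can be sketched as follows. This is a toy model under stated assumptions: a local thread pool stands in for the volatile desktop nodes, `run_trial` is a hypothetical placeholder that sleeps for a random time and returns a made-up energy, and none of the desktop grid platforms named above is actually used.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_trial(trial_id):
    # Hypothetical stand-in for one independent docking trial on a
    # desktop node with unpredictable turnaround time.
    rng = random.Random(trial_id)
    time.sleep(rng.uniform(0.01, 0.05))
    return trial_id, rng.uniform(-50.0, -10.0)  # (trial id, toy energy)


def attempt_with_redundancy(needed=10, dispatched=15):
    # Dispatch more trials than needed and keep the first `needed`
    # results, so slow or interrupted nodes do not delay the attempt.
    results = []
    pool = ThreadPoolExecutor(max_workers=dispatched)
    futures = [pool.submit(run_trial, i) for i in range(dispatched)]
    for fut in as_completed(futures):
        results.append(fut.result())
        if len(results) == needed:
            break
    for fut in futures:
        fut.cancel()            # no effect on trials already running
    pool.shutdown(wait=False)   # do not wait for the stragglers
    best = min(results, key=lambda r: r[1])
    return best, results
```

With one worker per dispatched trial, the time to solution approaches the time of a single trial, and the attempt finishes as soon as the required number of trials has returned.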
6 Conclusion
In this paper we present an MD and detailed force field protein-ligand docking algorithm based on a grid representation of the protein-ligand interactions and soft-core potential. We show that our docking method provides better docking accuracy than most of the other well-known and commonly used docking techniques, displaying a successful docking rate of 70%. Based on our time comparisons, we claim that the computational time is no longer a justified reason to avoid using detailed force field based docking techniques. Even for complex ligands, the completion time for a protein-ligand docking attempt of 10 trials is modest (less than one hour on a 930MHz processor for ligands with large numbers of rotatable bonds). Desktop grid platforms are well-suited for our accurate, fine-grained parallel algorithm for which each docking trial is short and independent.

Acknowledgments
Financial support from the National Institutes of Health Grant, GM37554, is greatly appreciated. Financial support through the NSF sponsored Center for Theoretical Biological Physics (grant# PHY-0216576 and 0225630) and the LJIS Fund is also acknowledged. Supported in part by the National Science Foundation under awards NSF EIA-99-75020 Grads and NSF Cooperative Agreement ANI-0225642 (OptIPuter), NSF CCR-0331645 (VGrADS), NSF ACI-0305390, and NSF Research Infrastructure Grant EIA-0303622. Support from Hewlett-Packard, BigBangwidth, Microsoft, and Intel is also gratefully acknowledged. We wish to thank the San Diego Supercomputer Center at UCSD for giving us access to the Meteor cluster.

References
[1] P. Ferrara, H. Gohlke, D. Price, G. Klebe, and C.L. Brooks III. Assessing Scoring Functions for Protein-Ligand Interactions. J. Med. Chem., 2003. Submitted.
[2] G. Wu, D.H. Robertson, C.L. Brooks III, and M. Vieth. Detailed Analysis of Grid-Based Molecular Docking: A Case Study of CDOCKER - A CHARMm-Based MD Docking Algorithm. J. Comput. Chemistry, 24:1549-1562, 2003.
[3] B.D. Bursulaya, M. Totrov, R. Abagyan, and C.L. Brooks III. Comparative Study of Several Algorithms for Flexible Ligand Docking. J. Comp. Aided Molecular Design, 2003. In press.
[4] G.M. Morris, D.S. Goodsell, R.S. Halliday, R. Huey, W.E. Hart, R.K. Belew, and A.J. Olson. Automated Docking Using a Lamarckian Genetic Algorithm and an Empirical Binding Free Energy Function. J. Comp. Chem., 19:1639-1662, 1998.
[5] T.J.A. Ewing and I.D. Kuntz. Critical Evaluation of Search Algorithms for Automated Molecular Docking and Database Screening. J. Comp. Chem., 18:1175-1189, 1997.
[6] M. Rarey, B. Kramer, T. Lengauer, and G.A. Klebe. A Fast Flexible Docking Method using an Incremental Construction Algorithm. J. Mol. Biol., 261:470-489, 1996.
[7] R. Abagyan, M. Totrov, and D. Kuznetsov. ICM - A New Method for Protein Modeling and Design: Applications to Docking and Structure Prediction from the Distorted Native Conformation. J. Comp. Chem., 15:488-506, 1994.
[8] G. Jones, P. Willett, R.C. Glen, A.R. Leach, and R. Taylor. Development and Validation of a Genetic Algorithm for Flexible Docking. J. Mol. Biol., 267:727-748, 1997.
[9] B.R. Brooks, R.E. Bruccoleri, B.D. Olafson, D.J. States, S. Swaminathan, and M. Karplus. CHARMM: A Program for Macromolecular Energy, Minimization, and Dynamics Calculations. J. Comput. Chem., 4:187-217, 1983.
[10] X. Kong and C.L. Brooks. lambda-dynamics: A New Approach to Free Energy Calculations. J. Chem. Phys., 105:2414-2423, 1996.
[11] A.D. MacKerell Jr. et al. All-atom Empirical Potential for Molecular Modeling and Dynamics Studies of Proteins. J. Phys. Chem. B, 102:3586-3616, 1998.
[12] M. Vieth, J.D. Hirst, A. Kolinski, and C.L. Brooks III. Assessing Energy Functions for Flexible Docking. J. Comp. Chemistry, 19(14):1612-1622, 1998.
[13] C.L. Brooks III et al. LPDB: Ligand Protein DataBase. lpdb.scripps.edu/.
[14] Entropia. www.entropia.com.
[15] EnFuzion. www.axceleon.com.
[16] G. Fedak, C. Germain, V. Neri, and F. Cappello. XtremWeb: A Generic Global Computing System. In Proc. of CCGRID 2001, Workshop on Global Computing on Personal Devices, May 2001.
[17] BOINC. boinc.berkeley.edu.
