
Research Article

iMedPub Journals 2018


Chemical informatics
www.imedpub.com ISSN 2470-6973 Vol. 4 No. 1:1

DOI: 10.21767/2470-6973.100026

Every Jack has His Jill: Finding a Target for Your Combinatorial Library

Inna Slynko1,2, Jan KF Dreher1 and Andreas H Göller1*

1 Bayer Pharma AG, Medicinal Chemistry - Computational Chemistry, Wuppertal, Germany
2 Grünenthal GmbH, Computational Chemistry, Aachen, Germany

*Corresponding author: Andreas H Göller
andreas.goeller@bayer.com
Tel: +49202365442
Bayer Pharma AG, Medicinal Chemistry - Computational Chemistry, Wuppertal, Germany.

Abstract
Pharmaceutical companies regularly run campaigns to evolve their proprietary chemical libraries, which are among their most valuable assets. The ultimate goal of those library expansions is to address novel chemical space with maximal fit to pharmaceutically relevant targets, which goes beyond just applying property or drug-likeness filters. In this work we present a structured and highly automated process to identify putative biological targets starting from any chemistry-driven virtual or existing compound library. Multiple ligand similarity searches are performed in ChEMBL ligand space, linking library compounds to targets from the ChEMBL database. The results are presented to the computational chemist in a highly intuitive and interactive manner. For a set of targets selected by a scientist, holo crystal structures are automatically retrieved and prepared for docking. The co-crystallized ligand, ChEMBL compounds and combinatorial library are then docked by an automatic procedure. The scientist finally is provided with a holistic picture of library-target fit hypotheses to draw his conclusions about relevant targets, library adjustments, library re-designs and ideas for completely new virtual libraries.

Keywords: Library design; Target fishing; Automated workflow

Received: December 21, 2017; Accepted: December 26, 2017; Published: January 01, 2018

Citation: Slynko I, Dreher JKF, Göller AH (2018) Every Jack has His Jill: Finding a Target for Your Combinatorial Library. Chem Inform Vol. 4 No. 1:1.

Introduction
The chemical library is among the biggest research assets
of any pharmaceutical company. Such screening libraries typically comprise between one and five million compounds [1]. Whether the full library or only subsets are tested in HTS campaigns, and how such subsets are composed, depends on target areas, assay designs and the company’s strategy. HTS and especially in vitro and in vivo assays of individual compounds are costly in terms of substance consumption. Therefore, all libraries bleed out. Instead of resynthesizing old compounds, companies set up campaigns to evolve the libraries into new chemical space following one of three strategies, namely, buying from chemical catalogs, buying readily available proprietary compounds or designing novel proprietary chemistry. The typical design concept for novel libraries is to create structurally diverse compounds with Lipinski drug-like [2] or lead-like [3,4] properties.

Since chemical space is almost infinite, with approximately 10^60 compounds with a molecular weight lower than 500 Da [5], and currently only about 10 to 20 million compounds relevant to drug discovery are covered by commercial sources and proprietary repositories, the question arises: which of the numerous imaginable libraries are relevant and which are not?

One way to address the question of target relevance is to start from known chemical matter and to apply core modifications like changes of ring size or type, or shifting nitrogen and functional groups. Alternatively, one can design libraries in a purely chemistry-driven way, based on attractive chemical scaffolds, synthesis routes or concepts like escaping from flatland [6], giving diversity and serendipity a chance. Combined with IP space analysis, both routes can yield viable libraries.

We were now interested in whether it would be possible to find the right target or target family for a subset of our internal library designs, which were originally driven by feasible chemistry and attractive novelty, or otherwise, whether it would be possible to derive a rationale for how to modify such a library design in order to tailor the respective library to a specific target or target family. We expect that a library designed with a target family in mind has a higher chance of hitting the relevant chemical space, especially since there are many indications for the existence of privileged scaffolds [7].

© Under License of Creative Commons Attribution 3.0 License | This article is available in: http://cheminformatics.imedpub.com/

We also know that computational methods, especially high-throughput methods like structure- or ligand-based virtual screening or target-family likeness filters, are far from perfect and will at best provide certain enrichments. Therefore, we decided to combine computational methods, which provide us with a high degree of automation and throughput, with optimum use of expert knowledge and guidance. Nevertheless, we have to stress that starting libraries, as well as libraries designed with the help of the process described in this article, have to be strictly novel, which requires the intervention of an expert and cannot be automated.

Hence the question arises: how to find the matching target for the library, or at least for some library compounds. In recent years many researchers have looked into this topic, mostly from a different perspective, namely, how to control target selectivity of a lead compound and avoid adverse effects [8-10], how to identify hidden opportunities in drug repurposing projects [11-13], or how to support the difficult but promising design of multitarget drugs [14-16]. Although the rationale for target fishing presented here is a different one, additional information on potential off-target activity or selectivity of compounds from a starting library is a welcome side-product.

Computational target prediction methods published to date [13,17] can be classified as ligand-based, network-based, side-effect-based, or protein-structure-based depending on the data used [18]. Ligand-based methods connect similarity measures with binding profiles for similar compounds in order to predict potential targets. Network-based methods incorporate the knowledge about ligand and target interactions, which are then represented as networks. Side-effect-based approaches utilize the information about off-target activities of similar drugs. Potential targets can also be predicted by protein structure-based methods including docking, protein-ligand interactions or protein binding site comparisons, but this is a tedious manual procedure solely based on profound expert knowledge.

Quite new is the inverse approach: to create ligand bioactivity fingerprints encoding the hit status of compounds from HTS campaigns [19,20]. In combination with conventional ligand fingerprints, these allow the identification of chemically similar ligands that should have similar bioactivity profiles.

Ligand-based methods are fast and easy to use, but they are limited to search spaces of highly similar compounds. To a certain extent, they are able to extrapolate into new chemical space via scaffold hopping.

Docking, on the other hand, is dependent on the availability of protein crystal structures. For about half of the targets relevant to pharmaceutical research there are no crystal structures available. Docking, in principle, can identify new chemical matter, but it is challenging with respect to protein pre-processing and ligand ranking [18].

Pharmacophore methods, finally, are somewhere in between. To some extent they can extrapolate by scaffold or substituent hopping. On the other hand, pharmacophore methods often provide the user with an overwhelming manifold of hypotheses that without detailed SAR knowledge cannot be separated into meaningful and chance models.

Weighing the pros and cons of the former concepts, we decided on a hybrid approach. We filter down the published - highly incomplete and sparsely populated - pharmacological universe by fast ligand-based methods to a manageable subset. We then process a user-selected subset of the ligand hit sets related to specific targets by docking. Our approach is automated as far as possible for efficient identification of potential biological targets with co-crystal structures. The general process starts with multiple automated ligand-based similarity searches in the ChEMBL [21] database, which contains chemical structures of small molecules with their associated biological test results and targets. Subsequently, grouping of hits based on biological target, extraction of structures from the Protein Data Bank [22] via the accession codes and automated docking simulations are performed.

The approach is novel in the way multiple computational methods are combined in an efficient process, providing the computational chemist with a holistic picture of potential hits based on the available knowledge. It is implemented in a way to automate the tedious manual work, to provide an expert with the capability to interact with results and to allow him to concentrate on decision-making.

Despite a high degree of automation of this process, the crucial step will always be the final one, where the real value is generated by the modeling expert, who will make decisions based on visual inspection and his experience in order to adjust the combinatorial library to selected target(s) by adding, replacing or removing chemical substituents, or exchanging a scaffold. As a result, one or more novel targeted libraries can be designed.

With our approach we will lose all those targets on which our library would show some activity but for which the published ligands are too dissimilar in 2D metrics. Some of those targets could be “rescued” by direct docking into the complete crystallized target space, but even then we would still miss some targets due to the shortcomings of rigid receptor docking.

We do not aim for the identification of a complete targetome for our library, but for the identification of targets that fit into the pathways of our medical indications. We will therefore not aim for the highest-ranked target, but for the one best fitting our project portfolio.

It is also important to understand that we do not describe a process of automated ligand- and target-based virtual screening. Instead, the similarity searches are applied as a coarse filter to identify targets from which the expert selects targets of interest. Docking is applied to confirm target fit based on pose consistency between cocrystallized ligand, ChEMBL hits and docked library compounds and has to be seen as a sharper filter to finally identify the most appropriate target for our library. In this paper we present a concept and a first implementation of the process that can be easily adjusted to individual needs, like adding a corporate database of chemical structures and biological data, extending the range of similarity search methods, exchanging protein preparation and docking method or adding automated


pharmacophore modeling. Though implemented in a commercial software solution, the described protocol can also be realized using other tools and software.

Methods and Process Description

The basic concept of our target-fishing approach relies on the “similarity principle” [23], according to which similar molecules exert similar biological activities. Therefore, a combinatorial library as a whole, its subsets or individual compounds that are similar to known actives should be able to point at targets of interest. Promising targets, which were identified indirectly using ligand similarity, are then selected for further investigation via automated docking. Conceptually, this resembles the process of experimental target validation using chemical probes.

The automated protocol, constructed and executed using the workflow software Pipeline Pilot [24], can be summarized in four steps, namely database preparation, similarity search, analysis and docking, as shown schematically in Figure 1. In a fifth step, the computational chemist visually inspects the results and draws informed decisions.

Database preparation

From the broad range of data from the scientific literature, including biological activities for drug-like bioactive compounds, as available in the public database ChEMBL [25], information about chemical structures, identifiers, assays and targets is extracted and saved into the appropriate file formats for the similarity searches in step 2. (The data in this work were based on ChEMBL version 14 (release from July 2012) comprising almost 14 million experimental results for about 1.9 million compounds, whereas the current release 23 from May 2017 contains around 2.1 million compounds.) The database structure of ChEMBL consists of about 50 tables, which are linked by primary keys and contain information about compound, source, drug properties, experimental data, target, mechanism of binding, etc. In order to access the most important entity types from the database, SQL queries were constructed and implemented in Pipeline Pilot to extract the data about compounds, targets, assays and activities, together with adjustable filters for parameters like organism, activity type, activity threshold and confidence score.

Further investigation of the ChEMBL database revealed that there are more than 3000 different activity types measured in hundreds
of different units. The most frequently represented activity types, which were used in our study, are potency, EC50, IC50, inhibition and Ki. Moreover, grouping of compounds by organism revealed 1621 species on which they were tested. We therefore implemented a number of default filters for the most represented activity types (IC50, EC50, Ki, Kd), units (M, nM, µM, mM) and organisms (human, mouse and rat) as well as for the activity threshold (10 µM). Those filters can be easily set via Pipeline Pilot protocol checkboxes and variables.

To ensure as far as possible that targets are assigned to the correct assays, only records with a ChEMBL confidence score higher than 7 were selected. The confidence score is assigned during the manual curation process by the data extractors and reflects assay-target relationships. It ranges from 0 to 9, where 0 means uncurated data and 9 corresponds to a high degree of confidence.

The application of the above-mentioned filters reduced the number of ChEMBL entries from 12.3 to 3.8 million, which represent 764,419 unique registered molecules. The compounds and the information about targets and assays were saved into separate files, and all additional information was joined to the compounds after the similarity search. We apply a predefined hierarchical file structure for the purposes of documentation and to facilitate further re-analyses and follow-up studies. Finally, the extracted ChEMBL data were converted into the appropriate structure formats required for the chosen similarity methods as described in the next step.
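As an illustration of the default filters described above, the following minimal Python sketch applies them to activity records held as plain dictionaries. It is not the Pipeline Pilot/SQL implementation used in this work: the field names (`type`, `unit`, `value`, `organism`, `confidence`), the short organism labels and the reading of the 10 µM threshold as "keep values at or below 10 µM" are assumptions made for the example.

```python
from typing import Dict, Iterable, List

# Conversion factors to micromolar for the default units named in the text.
TO_MICROMOLAR = {"M": 1e6, "mM": 1e3, "µM": 1.0, "nM": 1e-3}
ACTIVITY_TYPES = {"IC50", "EC50", "Ki", "Kd"}
ORGANISMS = {"human", "mouse", "rat"}  # hypothetical labels, not ChEMBL's own


def passes_default_filters(rec: Dict, threshold_um: float = 10.0,
                           confidence_above: int = 7) -> bool:
    """Return True if a record survives all default filters."""
    if rec["type"] not in ACTIVITY_TYPES:
        return False
    if rec["unit"] not in TO_MICROMOLAR:
        return False
    if rec["organism"] not in ORGANISMS:
        return False
    # "confidence score higher than 7" means only scores 8 and 9 pass.
    if rec["confidence"] <= confidence_above:
        return False
    # Assumption: the threshold keeps activities at or below 10 µM.
    return rec["value"] * TO_MICROMOLAR[rec["unit"]] <= threshold_um


def filter_records(records: Iterable[Dict]) -> List[Dict]:
    """Keep only the records that pass the default filters."""
    return [r for r in records if passes_default_filters(r)]
```

Keeping each criterion as a separate early return mirrors the idea of independent, switchable checkboxes; in a real setting each check would be made optional.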
Similarity search
For each compound of a combinatorial library ligand-based virtual
screens against database compounds are performed. Multiple
methodologies are applied to make maximum use of different
similarity measures. Final hit lists are combined by MAX-rank voting as described by Baber et al. [26] and Whittle et al. [27].

Figure 1 Schematic visualization of the five workflow steps starting from database preparation and ending with docking results and visual interpretation. Here PP is Pipeline Pilot and SQL is Structured Query Language.
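The combination of hit lists by MAX-rank voting can be illustrated with a short Python sketch. It assumes one common reading of the MAX rule, namely that each compound keeps the best (numerically smallest) rank it achieves in any of the individual searches; the function name, the input layout and the alphabetical tie-breaking are hypothetical and not taken from the cited references.

```python
from typing import Dict, List


def max_rank_fusion(ranked_lists: Dict[str, List[str]]) -> List[str]:
    """Fuse per-method hit lists; each compound keeps its best rank.

    ranked_lists maps a method name to its hits, ordered best first.
    """
    best: Dict[str, int] = {}
    for hits in ranked_lists.values():
        for rank, compound in enumerate(hits, start=1):
            if compound not in best or rank < best[compound]:
                best[compound] = rank
    # Sort by best rank; ties are broken by compound id (an assumption)
    # so that the output is deterministic.
    return sorted(best, key=lambda c: (best[c], c))
```

With this rule a compound that is ranked highly by only one method still reaches the top of the fused list, which fits the stated aim of making maximum use of complementary similarity measures.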
In this work we implemented three approaches, namely (i) atom-based circular fingerprints ECFP4, (ii) the non-linear Feature Tree descriptor FTrees and (iii) DBTOP topomer search similarities. Each of them represents structural and pharmacophoric features in a different and complementary way.

The extended connectivity fingerprints ECFP4 describe the presence or absence of particular overlapping substructures [28]. The number 4 in the name corresponds to the effective diameter of the largest feature, thus the largest possible fragment has a width of 4 bonds. The Tanimoto coefficient is used as the similarity metric for scoring.

DBTOP from Certara is a 3D similarity search where molecular structures are compared as sets of fragments (so-called topomers), which are characterized by CoMFA-like steric shape and pharmacophoric features [29]. One single rule-based conformation is generated for each fragment and oriented by its open valence bond, while the rest is oriented again using a rule-based scheme. Aligned fragments are then compared by their fields until the minimum topomeric difference between two molecules is identified.

The BioSolveIT FTrees method calculates the feature tree descriptor, which represents hydrophobic fragments and functional groups of the molecule and the way these groups are linked together [30]. The descriptors of two molecules are then compared to each other.

ECFP4 and FTrees are available as Pipeline Pilot components, whereas DBTOP was run from the command line using the Pipeline Pilot “Run on Server” component. Since we aim for target fishing and idea generation, we accept low overall ligand similarities and therefore limit the hit lists of the individual searches by the maximum number of hits and not by similarity thresholds.

The implemented Pipeline Pilot protocol allows a user to select similarity search methods via checkboxes and to set individual parameters for the similarity threshold or the number of top hits to save. It automatically combines the results of the similarity searches and reports hits, their similarity scores as well as targets, activity and assay data.

Analysis and selection

Hits are grouped based on the targets against which they show activity. The results are presented as a Pipeline Pilot HTML report comprising an interactive bar chart, representing top targets and numbers of hits per target (Figure 2a).

The ranking order implemented here is disputable, since currently targets are sorted by the number of hits identified, which yields a certain bias towards targets with higher numbers of congeneric compounds reported. Since the rank score bears a certain risk of missing interesting targets with small hit clusters, the user is able to set a threshold for the number of targets retrieved. Up to now, for each input library we were able to identify a set of interesting targets. Nevertheless, alternative scoring schemes taking into account, for instance, overall numbers of compounds tested, activity ranges, and numbers of congeneric series will be evaluated.

For a convenient overview the bar chart is equipped with tooltips and hyperlinks, showing the full target name and hit counts for the different similarity measures. Since we are solely interested in targets with crystal structures, information about protein structure availability is also retrieved from the RCSB Protein Data Bank [22] and summarized in the table next to the bar chart, see Figure 2a.

Furthermore, a click on any bar of the chart executes a Pipeline Pilot sub-protocol, which provides a second HTML report (Figure 2b) containing a table and an attached structure grid view with detailed information about the hits, e.g., chemical structure, activity data, assay results or species on which they were tested. Moreover, the table area and the grid view are cross-linked and possess tooltips containing chemical structure and detailed assay information. This gives the user a quick overview of a certain target and its compounds and assists with further target selection. The desired targets can be preselected for docking in the next step using checkboxes.

Figure 2 Example of Pipeline Pilot protocol results: a) HTML report with cross-linked bar chart and 20 top-ranked targets derived from the combination of three similarity search methods (DBTOP, ECFP4 and FTrees) using the designed oxadiazoles library as a starting point, see Results Section for more details; b) example of an HTML report with cross-linked table and grid view of the hits for one specific target, here for HTH-type transcriptional regulator EthR.

Docking

Automated docking of library compounds, ChEMBL hits and cocrystallized ligand into the selected targets is performed. All available PDB structures for user-selected targets are downloaded by the workflow, i.e., often multiple crystal structures per target. For instance, the number of structures deposited in the PDB for cyclin-dependent kinase 2 is more than 300. This poses the question of how to prioritize the crystal structures for docking in an automated way. One quality criterion for a crystal structure, which can be easily accessed, is its resolution. On the other hand, docking may still not be successful when it is done into a wrong protein conformation. Since residues of an apo-structure (without bound ligand) may occupy parts of the binding pocket, we decided to limit our docking to holo-structures (ligand-bound). Furthermore, the presence of a ligand simplifies automated grid generation. Thus, the top N holo-structures with the best resolution are selected for each target, where N is a number specified by the user. In case of multiple chains, chain A is always saved for each structure in order to simplify structural alignment. Alternative selection schemes could include target selection by ligand similarity or pocket shape diversity.

Ligand preparation was done in two steps. First, protonation states at pH 7.4 for the co-crystallized ligand, ChEMBL hits and library compounds were calculated using the pKa module co-developed by Bayer and Simulations Plus [31] and implemented as the Pipeline Pilot component “ADMET Predictor” [32]. Second, ring conformers, tautomers and stereoisomers were generated using the Schrödinger LigPrep utility, release 9.8.

An automatic docking procedure was applied using the Schrödinger script XGlide.py (version 3.7; v45017). The script performs automatic protein alignment and preparation, grid generation, re-docking of crystal structure ligands as well as docking of other compounds (here, library compounds and ChEMBL hits). For each selected target a separate directory is created containing subdirectories for crystal structures, prepared ligands and docking results. The script is executed from the command line using the Pipeline Pilot component “Run on Server”. The following docking parameters were applied: protein alignment, preparation and grid generation were turned on; ligand preparation was set to false; Glide standard precision (SP) was selected as the scoring function. The results for each target were saved as pose viewer files, which at the end are copied into one folder for the analysis.

Inspection

The final step in the process is by intention not automatic, and probably can never be. The computational chemist loads docking poses for targets of interest for visualization and analysis. In the first step he inspects the quality of re-docking of co-crystallized ligands and identifies commonalities and differences in the binding modes to individual crystal structures of each target. In the second step, he inspects the docking of ChEMBL hits to verify the interaction hot spots. Third, he analyzes the library compounds with good and bad docking scores and judges the plausibility of the binding modes obtained. Finally, he will consider either biological testing of library compounds on targets of interest, or modifying the library proposals in order to optimize their interactions with a certain target, or generation of a completely new library proposal.
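The structure-selection rule from the Docking step (holo-structures only, best resolution first, top N per target) can be sketched in a few lines of Python. The `Structure` record, its field names and the tie-breaking by identifier are hypothetical and serve only to illustrate the rule; downloading entries and extracting chain A are not covered.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Structure:
    pdb_id: str        # identifier of the crystal structure
    resolution: float  # resolution in Å; lower values are better
    has_ligand: bool   # True for holo-structures (ligand-bound)


def select_holo_structures(structures: List[Structure], n: int) -> List[Structure]:
    """Return the n holo-structures with the best (lowest) resolution."""
    holo = [s for s in structures if s.has_ligand]
    # Ties in resolution are broken by identifier (an assumption) so the
    # selection is reproducible between runs.
    return sorted(holo, key=lambda s: (s.resolution, s.pdb_id))[:n]
```

Alternative schemes mentioned in the text, such as selection by ligand similarity or pocket shape diversity, would replace only the sort key in this sketch.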


Results searches are performed to identify a shortlist of targets for


selection by the expert, not to identify the rank 1 targets.
Validation of similarity search process
Process application examples
Of the three purely automated technical steps, namely database
preparation, similarity search and docking including grid The process described in Methods and Process Description was
preparation, the most critical one for the overall performance developed to identify potential targets for existing chemistry-
is the identification of targets via the similarity searches. We driven combinatorial library proposals and to modify the proposals
therefore preformed a retrospective study in ChEMBL to test for in a way that they can directly contribute to early projects at Bayer
the performance of finding targets via searches with libraries Pharmaceuticals Global Drug Discovery. The process is applied to
known to be active on those targets. in-house libraries that are proprietary and cannot be disclosed
here. Therefore, we had to design a proof of concept case study
For this, we extracted ChEMBL data for compounds tested on for this publication. The downside of this approach is that we are
all species with reported IC50, EC50, Ki, Kd and activity units of not able to present experimental data for our prospective library
nM or µM. No activity threshold filter was set. The applied proposals (the starting library or the derivatives for the targets we
filters reduced the number of ChEMBL entries to 851,915 which hit). As a starting point we chose a publication from the Journal
constitute 327,520 unique molecules and 17568 different DOC_ of Medicinal Chemistry from 2012 which describes structure-
IDs [Fussnote einfügen: DOC_ID, TARGET_ID, MOLECULE_ID all based drug design for a series of potent 1,2,4-oxadiazoles, which
have the same identifier name CHEMBL_ID in different tables of target M. tuberculosis transcriptional repressor EthR (see Figure
the ChEML database]. From those, 633 sets based on identical 3a for examples) [33]. We designed a combinatorial library, that
DOC_ID were derived containing between 100 and 150 molecules is similar but distinct to the published compounds from ChEMBL,
each, representing our chemical libraries. This is justified by the with the aim to demonstrate that the developed methodology
fact that compounds from one publication normally more or less is able (i) to recover the compounds from the publication and to
represents a congeneric series. The 633 sets are connected to show that EthR protein can be identified among the top targets,
264 different TARGET_IDs. We finally selected 22 DOC_ID sets (ii) to identify potential new targets for our example library, (iii)
which share their TARGET ID with 5 to 7 other documents (the to provide examples of target-fishing-based library modifications
distribution runs between 1 and 13 different documents per and (iv) to provide examples of the short-comings of such a fully
TARGET_ID). automatic approach and to highlight the importance of expert
This setup allows us to perform - using the compounds from one interaction.
document - a “library-based” similarity search. By those similarity In particular, we introduced three changes to our library with
searches we should then be able to re-find the target the search respect to the library from the publication. First, we modified the
library is known to be active on, only based on similarity of the piperidine ring to a cyclo-hexyl, i.e., shifted the nitrogen by one
library compounds to the compounds in other documents on the position. Second, we replaced the aliphatic lipophilic side chain
target. by various R2 groups of different size, polarity and charge state,
Detailed results are provided in Table S1 in ESI. The median connected via nitrogen or amide bonds. Third, we introduced
numbers of documents identified are 45 for the combined search alternative lipophilic R1 groups at the only point of variation
and 10, 43, and 7 for ECFP-4, DBTOP and Ftrees, respectively.1 from the published library. Core definitions and examples for the
publication and the library compounds are shown in Figures 3a
We are thus always able to identify the targets of the library and 3b, respectively.
compounds even though the median similarities to the ChEMBL
compounds are as expected quite low with 0.33 for ECFP-4, 139 Example of database preparation: Step 1 of the workflow is to
for DBTOP and 0.88 for Ftrees. With two exceptions all targets search for similar compounds and their associated targets using
were identified by all three methods. Tyrosine-protein_kinase_ the designed library as a reference. For now, ChEMBL data were
SYK (CHEMBL2599) was not found by ECFP-4 and Ftrees and extracted for compounds tested on all species with reported IC50,
Cytochrome_P450_2D6 (CHEMBL289) by ECFP-4. EC50, Ki, Kd and activity units of nM or µM. No activity threshold
filter was set. The applied filters reduced the number of ChEMBL
Thus, we are consistently able to identify the target we were entries to 851,915 which constitute 327,520 unique molecules.
looking for, but not always at rank 1. Nevertheless, mean ranks
of the test targets are 3.5 for ECFP-4, 8.7 for DBTOP, 5.8 for Ftrees Example of similarity search: In step 2, similarity searches are
and 1.4 for the consensus rank, which always ranks the search performed. We used all three currently implemented methods,
target rank 1 or 2. namely DBTOP, ECFP4 and FTrees as described in Methods. 400
highest rank hits were saved for each metric, and additionally a
The hit rate and especially the ranking of the search targets is consensus rank was calculated. The diagram in Figure 2a gives
even better than the expected outcome, i.e., that the similarity
the list of the top 20 targets, associated with the results of the
similarity searches, as interactive bar chart. Table S1 of Supporting
1
Information provides more detailed information about target ranks according to the three similarity search methods and the consensus rank; additionally, it lists the numbers of identified hits and PDB structures for each target. The targets in Table S1 are sorted by descending number of ligands identified by the ECFP4 similarity search. As mentioned earlier, the implemented ranking by number of hits per target may be biased towards targets with large congeneric series.

During the step-wise preparation of the library sets, only representative subsets were kept via first-occurrence filters. This resulted in data reduction, and therefore the final numbers of DOC-IDs per target were always lower than the numbers in the unfiltered dataset. These results in higher numbers of documents retrieved.

This article is available in: http://cheminformatics.imedpub.com/

Figure 3 Schematic representation of cores and example compounds for: a) M. tuberculosis transcriptional repressor EthR inhibitor series from the publication of Flipo et al. [33]; b) combinatorial oxadiazole library.

Example of analysis: Step 3 is the first of two expert intervention steps. Target selection could be done automatically based on the ranks, but manual selection allows the expert to concentrate on targets relevant in the context of a company’s research portfolio.

The transcriptional repressor EthR was ranked number 7 by the consensus score, which combines the results of the three similarity search methods. The scoring according to the ECFP4 method ranked EthR at position three. ECFP4 was able to identify all 33 compounds from the publication, whereas FTrees found only 2 and DBTOP none, underlining the necessity to apply multiple ligand-based search methods to obtain the complete picture. DBTOP is based on steric and pharmacophoric fields of the whole molecule and is therefore more susceptible to larger size differences between query and database molecules than FTrees, which abstracts the molecular fragments into pharmacophoric representations, or ECFP4 circular fingerprints, where the hits are dominated by occurrences of fragment features. Depending on the library, the contributions of the different methods will differ. Some in-house library screens, for instance, were dominated by DBTOP hits. It is a priori not obvious which similarity metric will dominate in the consensus hit list.
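The ranking by number of hits per target and its combination across methods can be illustrated with a minimal sketch. It assumes that the consensus hit list of a target is simply the union of the ligands found by the individual methods; the scheme actually implemented in the workflow is described in Methods and may differ. The target and ligand identifiers (`T1`, `L1`, ...) are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple


def rank_targets(hits: Dict[str, Set[str]]) -> List[Tuple[str, int]]:
    """Rank targets by descending number of hit ligands (ties: alphabetical)."""
    counts = ((target, len(ligands)) for target, ligands in hits.items())
    return sorted(counts, key=lambda item: (-item[1], item[0]))


def consensus_hits(per_method: Dict[str, Dict[str, Set[str]]]) -> Dict[str, Set[str]]:
    """Merge the hit lists of all methods; assumption: union of ligands per target."""
    merged: Dict[str, Set[str]] = defaultdict(set)
    for target_hits in per_method.values():
        for target, ligands in target_hits.items():
            merged[target] |= ligands
    return dict(merged)


# Hypothetical toy data: ligand IDs found per target by each method
per_method = {
    "ECFP4": {"T1": {"L1", "L2", "L3"}, "T2": {"L4"}},
    "FTrees": {"T1": {"L3"}, "T2": {"L4", "L5"}, "T3": {"L6"}},
    "DBTOP": {"T3": {"L6", "L7", "L8", "L9"}},
}

ecfp4_ranking = rank_targets(per_method["ECFP4"])
consensus_ranking = rank_targets(consensus_hits(per_method))
print(ecfp4_ranking)
print(consensus_ranking)
```

In this toy example the target dominated by DBTOP hits (`T3`) does not appear in the ECFP4 ranking at all but leads the consensus ranking, which mirrors the observation that the dominating similarity metric is not obvious a priori. A count-based ranking like this one also inherits the bias towards large congeneric series noted in the text.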
© Under License of Creative Commons Attribution 3.0 License

Figure 4 shows example hits identified by different similarity search methods to underline this assumption. It is worth noting that the similarity scores (see Table S2 of Supporting Information) are, as expected, quite low, indicating that chemical modification of the library compounds guided by the final docking step might be needed.

Figure 4 An example of two ChEMBL hits obtained by similarity search for the designed oxadiazole library using different similarity methods - compound 7 (inhibitor of anandamide aminohydrolase) was identified by DBTOP similarity and compound 8 (inhibitor of cytochrome P450) by FTrees and ECFP4.

As expected, the numbers of hits for the different search methods differ. In addition, the numbers of PDB structures retrieved also differ. For instance, 33 PDB structures of 11-beta-hydroxysteroid dehydrogenase 1 are found using only ECFP4 (see Table S1), whereas the combination of three similarity methods retrieved 38 PDB entries. The reason is that all ECFP4 hits are annotated with UniProt [34] identifier P28845 (human), whereas the combination of ECFP4 and FTrees resulted in hits that were tested on human and mouse 11-beta-hydroxysteroid dehydrogenase 1 (UniProt identifiers P28845 and P50172, respectively). While the human sequence shares 79% identity with the mouse orthologue, there is a high level of conservation of amino acids in the binding site. All ECFP4 hits share the same oxadiazole motif, while FTrees identified two additional motifs (Figure 5). Again, this emphasizes that it is advantageous to employ multiple ligand similarity metrics.

Figure 5 Exemplary ECFP4 hit 9 for human target 11-beta-hydroxysteroid dehydrogenase 1 and structurally different FTrees hits 10 and 11 for mouse protein.

Our proof-of-concept target EthR is ranked seventh by consensus score and third by ECFP4 similarity search. In the following we analyze the two top-ranked targets in more detail (see also Table S1), together with our target of interest, EthR (which would resemble the real-life situation with some targets in the list not being relevant for the current portfolio).

Example of docking: For step four we selected the two top-ranked targets for docking, namely the top-ranked metabotropic glutamate receptor 5 and the second-ranked receptor smoothened homolog, together with our proof-of-concept target, the HTH-type transcriptional regulator EthR, which is ranked seventh. The docking of our library compounds, ChEMBL hits and co-crystallized ligands was performed using the fully automated XGlide procedure as described in Methods. A maximum of two crystal structures per target were retrieved automatically. We had to extend the set by one more structure in the case of EthR, as described in the following.

Our decision objective for target fit is correct re-docking of the co-crystallized ligand, consistent docking of the ChEMBL hits and finally consistent docking of the library compounds or similarly decorated subsets thereof. We provide docking scores as a means of further confirmation of consistent placement, but not as a filter or design criterion per se. High docking scores are a strong hint for important interactions with the matched target, whereas low scores are not always correlated with weak binding interactions.

Helix-Turn-Helix-type (HTH-type) transcriptional regulator EthR

Currently there are 23 protein structure entries in the RCSB Protein Data Bank for UniProt accession code P9WMC1 (Mycobacterium tuberculosis). Since the number of structures for docking is a compromise between expected information gain and effort, two structures for docking were automatically selected from the 17 holo-structures available, based on crystal structure resolution. By default, we process two different crystal structures, since modelling experience tells that using multiple target structures for rigid docking reduces the risk of missing important target information. We later added one additional structure, namely 3O8H, due to its different pocket shape and ligand-binding mode.

The hits found by ECFP4 are both agonists and antagonists with best EC50 of 60 nM and IC50 of 400 nM, respectively, i.e., highly active compounds.

3G1M: An example where library fits well into the target: Docking of the library compounds into the first crystal structure 3G1M with a resolution of 1.7 Å yields high docking scores and poses comparable to the co-crystallized ligand (IC50 of 500 nM, retrieved from PDB Bind [35]). An additional hydrogen bond to Asn176 can be observed between EthR and some of the library compounds containing a tertiary amine or amide linker attached to the oxadiazole-cyclohexane core (an example can be seen in Figure 6). In contrast, the co-crystallized ligand, which has an oxadiazole-piperidine scaffold, is missing a hydrogen bond donor at this position. Moreover, the analysis of the binding pocket around the ligand can provide further suggestions for compound modifications, e.g., for extended interactions into the hydrophobic pocket formed by Met102, Val152 and Leu90.

Figure 6 HTH-type transcriptional regulator EthR (3G1M, light grey representation) with co-crystallized ligand (cyan) and docking solution for one library compound (magenta, glide SP score=-12.40).

3Q0W: differences in protein conformation and incomplete binding site setup: In contrast, docking into the second EthR structure (co-crystallized ligand has Ki of 400 nM [35]) led to
low-scored poses for our library compounds. It turned out that a co-crystallized glycerol molecule, which had not been removed by the automated protein preparation, was situated deep in the binding site, establishing a hydrogen bond to Asn176 and blocking ligand entry.

After its removal, docking of all compounds was possible. Nevertheless, the poses are still quite inconsistent. The amide moiety for about half of the poses is located deep in the pocket and makes hydrogen bonds to Asn176 and Asn179, analogously to the 3G1M dockings shown in Figure 6, and for the other half it points out of the pocket. Such differences can be explained by conformational flexibility of the protein, which can be seen in a comparison of the two EthR crystal structures (PDB codes 3G1M and 3Q0W; the superimposition is shown in Figure S1, see Supporting Information). Small but distinct differences can be observed at the loop region (residues Asn93-Asp98), where the flip of Pro94 is accompanied by a narrowing of the entry channel, which sterically hinders the placement of substituents towards this loop in 3G1M.

3O8H: Alternate binding mode: Our library was intentionally designed to be chemically similar to the EthR inhibitor BDM41906 [33] (PDB ID: 3SFI). 3G1M and 3SFI have the same overall shape, and the library compounds dock consistently into both pockets (results are not shown).
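The automatic choice of EthR structures described in this section (holo-structures only, picked by crystal structure resolution, at most two per target) can be sketched as follows. This is an illustration under stated assumptions, not the XGlide implementation: the `CrystalStructure` record and the `select_structures` helper are hypothetical, and except for 3G1M (1.7 Å, as given in the text) the entries and their resolutions are invented placeholders.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CrystalStructure:
    pdb_id: str
    resolution: float  # in Å; lower values mean better resolved structures
    has_ligand: bool   # True for holo-structures


def select_structures(entries: List[CrystalStructure],
                      max_structures: int = 2) -> List[CrystalStructure]:
    """Keep holo-structures only and return the best-resolved ones."""
    holo = [entry for entry in entries if entry.has_ligand]
    # Sort by resolution; break ties by PDB ID so the result is reproducible
    holo.sort(key=lambda entry: (entry.resolution, entry.pdb_id))
    return holo[:max_structures]


entries = [
    CrystalStructure("3G1M", 1.7, True),     # resolution as stated in the text
    CrystalStructure("HOLO_A", 1.9, True),   # hypothetical entry
    CrystalStructure("HOLO_B", 2.4, True),   # hypothetical entry
    CrystalStructure("APO_C", 1.5, False),   # hypothetical apo structure, ignored
]

selected = select_structures(entries)
print([structure.pdb_id for structure in selected])
```

A purely resolution-based criterion such as this one carries no information about pocket shape, which is consistent with the need to add 3O8H manually because of its different pocket shape and ligand-binding mode.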


Nevertheless, closer inspection of EthR structures revealed a second set of crystal structures with a considerably larger binding pocket. This pocket enlargement is mainly caused by the flip of the side chains of Thr121, Gln125, Trp138 and Phe184 (see Table S3 for a comparison of available EthR crystal structures).

Figure 7a shows the alignment of 3G1M and 3O8H along with interaction volumes generated by SiteMap [36]. As expected, cross-docking of the 3O8H ligand (IC50=580 nM) into the rigid 3G1M receptor, without taking into account any induced fit effect, yields a completely different and wrong binding mode, where the aromatic sulfonamide is pointing out of the pocket (see Figure 7b).

Figure 7 a) Alignment of crystal structures 3G1M (green residues) and 3O8H (cyan residues), protein ribbons are depicted in light grey. Amino acids with different side chain orientations responsible for change in pocket shape are shown as sticks. SiteMap [36] generated surfaces are shown in blue mesh for 3G1M and magenta mesh for 3O8H; b) Overlay of the crystallized 3G1M ligand (green), the crystallized 3O8H ligand (cyan) and the docking pose of the 3O8H ligand in 3G1M (magenta).

About two thirds of the library members dock consistently to BDM41906. About one third, due to the pronounced pocket differences, dock inconsistently. Library members from both sets ignore the additional cavity available in 3O8H.

In summary, we were in fact able to identify EthR as a potentially interesting target based on ligand similarity and docking results for our designed oxadiazole library. We have also demonstrated that the further optimization strategy largely depends on the choice of EthR crystal structure, since the pocket residues are subject to conformational changes. Based on the docking results from both pocket shapes, we gained worthwhile additional information about flexible and rigid subpockets and key interaction features. If this were our library extension campaign, we would now, based on the target information, slightly optimize the decoration of the initial library and additionally design a second library that targets the deep cavity available in 3O8H. We would cross-check the design for IP space and, if necessary, iteratively adjust it to create novelty.

Metabotropic glutamate receptor 5

The highest ranked target according to ECFP4, the metabotropic glutamate receptor 5, is a class C G-protein-coupled receptor responding to the neurotransmitter glutamate. There is only one holo structure (PDB ID 4OO9) identified in PDB for the transmembrane ligand-binding domain, since earlier structural
studies had been restricted to the amino-terminal extracellular domain, providing little understanding of the membrane-spanning signal transduction domain. 4OO9 is co-crystallized in complex with the negative allosteric modulator mavoglurant.

The similarity searches for the library compounds identified in total 160 agonists and antagonists of the metabotropic glutamate receptor 5 using consensus scoring, with best affinity values of EC50=5 nM, IC50=130 nM, and Ki=150 nM. The ECFP4 method ranked this target at the top position, while FTrees ranked it at position three, with 149 and 25 inhibitors being identified, respectively. There were no metabotropic glutamate receptor 5 inhibitors among the top 25 targets identified by the DBTOP method. The hits represent different structural clusters such as piperidine-amides, piperidine-sulfonamides and spiro-hexyl-4,5-dihydrooxazoles (see Figure 8 for examples).

4OO9: Failure of the automatic procedure: All steps of the automatic workflow technically proceeded well and compounds were successfully docked. However, a closer look at the crystal structure 4OO9 revealed that during automated protein preparation and docking, the docking grid was positioned around a co-crystallized small organic molecule coming from the experimental conditions, namely oleic acid, and not around the allosteric modulator mavoglurant [37]. Thus, the docking was performed into the wrong pocket (see Figure S2 of Supporting Information).
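One conceivable safeguard against centring the docking grid on a crystallization additive is to exclude known additives before choosing the ligand. The following is a minimal sketch and not part of the described workflow: the `ADDITIVES` list, the `pick_grid_ligand` helper and the heavy-atom threshold are assumptions made for illustration, the component identifiers `GOL` (glycerol) and `OLA` (oleic acid) should be verified against the PDB chemical component dictionary, and `LIG` is a placeholder name for the allosteric modulator.

```python
from typing import Dict, Optional

# Residue names treated as crystallization additives rather than ligands.
# Assumption: a short illustrative list; a production list would be much longer.
ADDITIVES = {"GOL", "OLA", "EDO", "PEG", "SO4"}


def pick_grid_ligand(heavy_atoms_by_resname: Dict[str, int],
                     min_heavy_atoms: int = 10) -> Optional[str]:
    """Return the residue name to centre the docking grid on, or None.

    Additives and very small molecules are skipped; among the remaining
    candidates the one with the most heavy atoms is chosen.
    """
    candidates = {
        name: count
        for name, count in heavy_atoms_by_resname.items()
        if name not in ADDITIVES and count >= min_heavy_atoms
    }
    if not candidates:
        return None  # nothing suitable found: flag the structure for manual setup
    return max(candidates, key=lambda name: (candidates[name], name))


# Hypothetical hetero groups of a structure resembling the 4OO9 case
hetero_groups = {"OLA": 20, "LIG": 23, "GOL": 6}
print(pick_grid_ligand(hetero_groups))
```

Returning `None` instead of guessing keeps the expert in the loop for structures in which no plausible ligand is found, in line with the manual inspection the procedure requires anyway.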

Figure 8 Representative hits for metabotropic glutamate receptor 5.

Figure 9 Example docking solution of library compound (magenta ligand) using manual grid set up (glideSP=-10.33) into metabotropic
glutamate receptor 5 structure 4OO9 (protein is shown as light grey cartoon). The crystal structure ligand is shown as cyan sticks.


The example of 4OO9 shows that the fully automated docking procedure has its drawbacks. Protein preparation and docking setup require user inspection and in certain cases manual correction. The effort nevertheless is acceptable, since the expert should be knowledgeable of the target in order to understand and judge the observed ligand interactions. On the other hand, one could implement a mechanism to retrieve information about the actual ligand and its binding mode, and use it during the protein preparation step.

Docking into the manually prepared binding site reveals that our library compounds exhibit numerous interactions with the receptor similar to the crystal structure ligand, e.g., hydrogen bonds to Asn-747 and Ser-809, and extend their interactions deeper into the pocket lined by Arg-648 and Val-740 (see Figure 9), which can be further analyzed to guide possible compound modifications.

Smoothened homolog

The Smoothened (SMO) receptor is a key signal transducer in the Hedgehog (Hh) signalling pathway. SMO is classified as a class F (frizzled) G-protein-coupled receptor (GPCR). It contains the conserved seven-transmembrane helical fold common to the class A GPCRs and an unusually complex arrangement of long extracellular loops stabilized by four disulphide bonds.

The similarity search for the library compounds identified overall 111 SMO inhibitors with a best IC50 of 16 nM, using consensus scoring. 103 hits arose from the ECFP4 search and 20 hits from FTrees. All hits are piperidine-amides or piperidine-ureas, but again FTrees was able to detect more diverse compounds.

4JKV: Optimizing the library for the target: The 2.5 Å resolution crystal structure of the human SMO receptor contains the transmembrane domain together with antagonist LY2940680 15 (see Figures 10 and 11), which binds the extracellular end of the seven-transmembrane-helix bundle via extensive contacts to the loops. Redocking reproduced the binding mode with an RMSD of 0.52 Å.

Docking into Smoothened homolog resulted in many well-scored solutions for the ChEMBL hits and library compounds with consistent docking poses (see Figure 11 for examples).

Instead of the conserved hydrogen bond between the carbonyl group of the ligand and Asn219, which is observed for most of the ChEMBL inhibitors and LY2940680, the library compounds form one or two (e.g., ligands with a positively charged aliphatic ring like compound 16, see Figures 10 and 11) additional hydrogen bonds with the backbone carbonyl of Tyr394.

On the downside, most of the library compounds do not pi-stack to Phe484. Exceptions are, for instance, sulfonamides like 17 (see Figure 11). One of the oxadiazole nitrogens of the library compounds usually participates in hydrogen bonding to Arg400, analogous to the phthalazine nitrogens in the crystal structure [38], but none of the library compounds is able to fill the hydrophobic pocket occupied by the phthalazine core.

Figure 10 SMO inhibitor crystal structure ligand LY2940680 15 (PDB code: 4JKV), two representative library compounds 16 and 17 and the modified hybrid compound 18.

Figure 11 Docking into the smoothened (SMO) receptor (4JKV: protein is shown as light grey cartoon). a) redocked LY2940680 15 (cyan sticks) and docking poses of two library compounds 16 and 17 (green and magenta); b) LY2940680 15 (cyan) and the docking pose of modified hybrid 18 (green sticks), the phthalazine cores of both compounds are well-aligned.

The observations about key interactions of LY2940680 15 and the ChEMBL compounds provide us with ideas for possible library modifications in order to target the SMO binding pocket optimally. Figure 11b shows an example of hybrid compound 18, which has the oxadiazole replaced by a phthalazine core while keeping the larger p-cyanophenyl and the positively charged pyrrolidino-amide. The Glide SP docking score for 18 (-12.39) is virtually identical to that of 15 (-12.66). Alternate hetero-bicycle replacements
also fit well. Aromatic headgroups linked via sulfonamide or amide like in compound 17 (glideSP=-11.47), on the other hand, are able to participate in pi-stacking with Phe484. Overall, from the point of view of target interactions, there are a number of options, with similar but also better Lipinski properties than LY2940680, which has a molecular weight of 512 Da and an AlogP of 4.56.

Conclusion

When planning for the extension of a compound library, one is confronted with a universe of synthesis options. The only limitations, thus, are one’s own creativity, lab and budget resources in order to transform the ideas into chemical libraries. Therefore, a rational concept to explore the options and pick the libraries with a certain probability to hit biological targets is desirable. Expert knowledge can guide the planning procedure. However, in such a case the library design can be limited to the person’s experience around the projects he has worked on. Metrics like ligand efficiency allow one to stay within an attractive property profile range but do not assist the selection of compounds amenable to target families of interest. Although drug-likeness or target class-likeness scores take into account overall similarity to known drugs or actives, substructure or global pharmacophore features are quite rough estimates of target family fit.

In this paper we therefore describe the productive implementation of a concept aiming, on one hand, to identify putative targets for a chemistry-driven library proposal and, on the other hand, to identify options for compound modifications in order to create new libraries better fitting to certain targets. This article reports on a designed virtual combinatorial library and hits identified by the workflow, as well as on library modification ideas without the desirable proof of synthesis and experimental testing.

The real in-house examples cannot be disclosed here, and the examples shown will not trigger any synthesis and testing at Bayer. Currently we are still not able to provide significant statistics about success rates of the described workflow due to its novelty and the long turn-around times for the process of library design, out-sourced chemical realization, registration and testing.

Our workflow aims to automate all tedious, time-consuming technical steps and allows the expert to concentrate on rational design. We always start with a chemistry-driven library carefully checked for novelty and end up with a proposal that again is checked for novelty as a part of the design procedure.

The workflow is divided into a set of protocols implemented in Pipeline Pilot that control the crucial steps, run automatically and require minimum user interaction; it is used productively at Bayer Drug Discovery. The implementation described in this paper compares a virtual compound library to the chemical space represented in ChEMBL by multiple ligand-based similarity metrics, retrieves ligand and target information and presents the results in an intuitive representation to an expert, who then decides whether to proceed with ligand design for targets of interest. The protocol automatically retrieves PDB structures and sets up docking runs for the co-crystallized ligand, the ChEMBL compounds and library structures. The most time-consuming step is, by design, the final one, i.e., the visual inspection by a computational chemist, who can further trigger library modifications or re-design. The system is easy to use and it is highly productive. The workflow is modular and can easily be extended to alternate database sources, similarity metrics, hit prioritization algorithms, or docking protocols. Due to its accessibility, the ChEMBL database was chosen as a source of biological and chemical information. However, incorporation of alternate data sources is straightforward.

As expected, the first part of the process, which is strictly ligand-based, is highly reliable and fast. The use of multiple similarity metrics is advantageous since various approaches represent chemical similarity differently, and the consensus-based assessment serves as a good basis for more detailed analysis of hits and their corresponding targets and for inspiring creativity in library design. The platform is open for the incorporation of alternate methods, e.g., shape-based screening or pharmacophore fingerprints. The current target ranking, which is based on the number of hits identified, is not optimal, since it favors large congeneric series. Thus, further modifications to the protocols are ongoing work.

It stands to reason that common challenges of structure-based drug design are especially relevant for an automated procedure, and special attention should be given to them. One of the issues concerns the assignment of ligand protonation states, which nevertheless can be reliably estimated by modern pKa predictor software. Correct stereochemistry is more problematic. There are cases where the stereochemistry of a PDB ligand is ambiguous; the stereochemistry of ChEMBL compounds is not always explicitly defined, and library compounds may exist as racemates or with unknown configuration depending on the synthesis route. The best compromise here is to enumerate relevant stereoisomers and to let the binding pocket decide. Finally, we are faced with incorrect bond orders in a PDB ligand and unknown tautomer forms of ChEMBL or library compounds. Tautomerism is an issue still lacking a sound solution. It is dealt with by rule-based generation and docking of sets of tautomers.

Another issue concerns the protein preparation step. As we have shown in this paper, automatic preparation will sometimes detect a wrong binding site or not remove small co-crystallized substances. In the current XGlide implementation, all crystal waters are removed. Therefore, a fraction of automatically created results has to be discarded and manually re-processed. Even if there is still room for improvement, we consider that currently XGlide is one of the best solutions for automatic preparation, protein alignment and docking.

The third step of docking and scoring also has its limitations, which are well described in the literature. A consistent binding mode of library compounds is a necessary but not sufficient condition for a compound binding to a target, especially since docking scores are often misleading. These issues, together with limitations of previous steps, e.g., complete removal of waters that sometimes provide important contacts to a target, are the main pitfalls of the automatic procedure. Thus, close visual inspection by an expert who is knowledgeable of a target will finally allow the relevance of results to be judged.
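The compromise of enumerating stereoisomers for compounds with undefined configuration, mentioned in the discussion of ligand preparation, amounts to a simple combinatorial expansion. The sketch below is purely illustrative: it works on abstract stereocenter labels rather than real molecules, ignores symmetry (e.g., meso forms), and the `enumerate_stereoisomers` helper and the center names are hypothetical. In practice a cheminformatics toolkit would perform this step on the actual structures.

```python
from itertools import product
from typing import Dict, List, Optional


def enumerate_stereoisomers(centers: Dict[str, Optional[str]]) -> List[Dict[str, str]]:
    """Expand undefined stereocenters into all R/S combinations.

    `centers` maps a stereocenter label to 'R', 'S' or None (undefined).
    Defined centers are kept as they are; n undefined centers give 2**n isomers.
    """
    undefined = [label for label, config in centers.items() if config is None]
    isomers = []
    for assignment in product("RS", repeat=len(undefined)):
        # Start from the defined centers, then fill in one R/S combination
        isomer = {label: config for label, config in centers.items()
                  if config is not None}
        isomer.update(zip(undefined, assignment))
        isomers.append(isomer)
    return isomers


# Hypothetical compound: one defined and two undefined stereocenters
compound = {"C3": "R", "C7": None, "C9": None}
isomers = enumerate_stereoisomers(compound)
for isomer in isomers:
    print(isomer)
```

Each enumerated isomer would then be docked separately, so that the binding pocket decides which configuration is preferred; the price is docking effort that grows exponentially with the number of undefined centers.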

One could argue whether it is worth using an automatic process with its numerous limitations. In our opinion, the advantages by far outweigh the risks. The process allows an expert to set up a query very easily and to spend his time on analysis and re-design of the library. Walking through a full process takes about ten to thirty minutes for set-up, about 4-8 h for data extraction and similarity search, and about 2 to 3 h for docking per crystal structure if parallelized (depending on the number of compounds to be docked).

A final word to the expected output

With all known algorithmic and data quality limitations, a final library and its assignment to a target will always be “only” an educated guess for a library with significantly enhanced chance to hit a target. Nevertheless, we feel that it is worth the effort.
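The timings quoted above for a full run can be combined into a rough turnaround estimate. The following sketch assumes that the crystal structures are docked one after another, which is an assumption made here for illustration and not stated in the text; the `turnaround_hours` helper is hypothetical.

```python
from typing import Tuple


def turnaround_hours(n_structures: int) -> Tuple[float, float]:
    """Return (low, high) estimates in hours for one workflow run.

    Uses the ranges quoted in the text: 10-30 min set-up, 4-8 h data
    extraction and similarity search, 2-3 h docking per crystal structure.
    Assumption: structures are docked sequentially.
    """
    setup = (10 / 60, 30 / 60)
    search = (4.0, 8.0)
    docking_per_structure = (2.0, 3.0)
    low = setup[0] + search[0] + n_structures * docking_per_structure[0]
    high = setup[1] + search[1] + n_structures * docking_per_structure[1]
    return low, high


# Default case of two crystal structures for a single target
low, high = turnaround_hours(2)
print(f"{low:.1f} to {high:.1f} h")
```

For the default of two crystal structures per target this gives roughly 8 to 15 hours, i.e., a run fits into about a working day or an overnight job.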

References

1 Schamberger J, Grimm M, Steinmeyer A, Hillisch A (2011) Bigger Data, Collaborative Tools and the Future of Predictive Drug Discovery. Drug Discov Today 16: 636-641.
2 Lipinski CA, Lombardo F, Dominy BW, Feeney PJ (2001) Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings. Adv Drug Deliv Rev 46: 3-26.
3 Hann MM (2011) Molecular obesity, potency and other addictions in drug discovery. Med Chem Comm 2: 349-355.
4 Lobell M, Hendrix M, Hinzen B, Keldenich J, Meier H, et al. (2006) In silico ADMET traffic lights as a tool for the prioritization of HTS hits. Chem Med Chem 1: 1229-1236.
5 Bohacek RS, McMartin C, Guida WC (1996) The art and practice of structure-based drug design: A molecular modeling perspective. Med Res Rev 16: 3-50.
6 Lovering F, Bikker J, Humblet C (2009) Escape from Flatland: Increasing Saturation as an Approach to Improving Clinical Success. J Med Chem 52: 6752-6756.
7 Welsch ME, Snyder SA, Stockwell BR (2010) Privileged scaffolds for library design and drug discovery. Curr Opin Chem Biol 14: 347-361.
8 Khanna K (2012) Drug discovery in pharmaceutical industry: productivity challenges and trends. Drug Discov Today 17: 1088-1102.
9 Azzaoui K, Hamon J, Faller B, Whitebread S, Jacoby E, et al. (2007) Modeling promiscuity based on in vitro safety pharmacology profiling data. Chem Med Chem 2: 874-880.
10 Huggins DJ, Sherman W, Tidor B (2012) Rational Approaches to Improving Selectivity in Drug Design. J Med Chem 55: 1424-1444.
11 Ashburn TT, Thor KB (2004) Drug repositioning: identifying and developing new uses for existing drugs. Nat Rev Drug Discov 3: 673-683.
12 Liu Z, Fang H, Reagan K, Xu X, Mendrick DL (2013) In silico drug repositioning: what we need to know. Drug Discov Today 18: 110-115.
13 Ekins S, Williams AJ, Krasowski MD, Freundlich JS (2011) In silico repositioning of approved drugs for rare and neglected diseases. Drug Discov Today 16: 298-310.
14 Roth BL, Sheffler DJ, Kroeze WK (2004) Magic shotguns versus magic bullets: selectively non-selective drugs for mood disorders and schizophrenia. Nat Rev Drug Discov 3: 353-359.
15 Medina-Franco JL, Giulianotti MA, Welmaker GS, Houghten RA (2013) Shifting from the single to the multitarget paradigm in drug discovery. Drug Discov Today 18: 495-501.
16 Bottegoni G, Favia AD, Recanatini M, Cavalli A (2012) The role of fragment-based and computational methods in polypharmacology. Drug Discov Today 17: 23-34.
17 Jenwitheesuk E, Horst JA, Rivas KL, Van Voorhis WC, Samudrala R (2008) Novel paradigms for drug discovery: computational multitarget screening. Trends Pharmacol Sci 29: 62-71.
18 Schomburg KT, Bietz S, Briem H, Henzler AM, Urbaczek S, et al. (2014) Facing the challenges of structure-based target prediction by inverse virtual screening. J Chem Inf Model 54: 1676-1686.
19 Petrone PM, Simms B, Nigsch F, Lounkine E, Kutchukian P, et al. (2012) Rethinking molecular similarity: comparing compounds on the basis of biological activity. ACS Chem Biol 7: 1399-1409.
20 Riniker S, Wang Y, Jenkins JL, Landrum GA (2014) Using information from historical high-throughput screens to predict active compounds. J Chem Inf Model 54: 1880-1891.
21 ChEMBL (2016) Available from: https://www.ebi.ac.uk/chembl/ (Accessed on: January 05, 2018).
22 Protein Data Bank (2016) A Structural View of Biology. Available from: http://www.rcsb.org/pdb/home/home.do (Accessed on: January 05, 2018).
23 Johnson M, Maggiora GM (1990) Concepts and Applications of Molecular Similarity. John Wiley and Sons, New York, USA.
24 Pipeline Pilot, Accelrys Software Inc. (2013) BIOVIA Pipeline Pilot.
25 Gaulton A, Bellis LJ, Bento AP, Chambers J, Davies M, et al. (2012) ChEMBL: a large-scale bioactivity database for drug discovery. Nucleic Acids Res 40: D1100-D1107.
26 Baber JC, Shirley WA, Gao Y, Feher M (2006) The Use of Consensus Scoring in Ligand-Based Virtual Screening. J Chem Inf Model 46: 277-288.
27 Whittle M, Gillet VJ, Willett P, Loesel J (2006) Analysis of Data Fusion Methods in Virtual Screening: Theoretical Model. J Chem Inf Model 46: 2193-2205.
28 Rogers D, Hahn M (2010) Extended-connectivity fingerprints. J Chem Inf Model 50: 742-754.
29 Cramer RD, Jilek RJ, Andrews KM (2002) Topomer similarity searching of conventional structure databases. J Mol Graph Model 20: 447-462.
30 Rarey M, Dixon JS (1998) Feature trees: A new molecular similarity measure based on tree matching. J Comput Aided Mol Des 12: 471-490.
31 Fraczkiewicz R, Lobell M, Göller AH, Krenz U, Schoenneis R, et al. (2015) Best of both worlds: combining pharma data and state of the art modeling technology to improve in Silico pKa prediction. J Chem Inf Model 55: 389-397.
32 ADMET Predictor (2014) Simulations Plus, Lancaster, CA, 93534, USA.
33 Flipo M, Desroses M, Lecat-Guillet N, Villemagne B, Blondiaux N, et al. (2012) Ethionamide boosters. 2. Combining bioisosteric replacement and structure-based drug design to solve pharmacokinetic issues in a series of potent 1,2,4-oxadiazole EthR inhibitors. J Med Chem 55: 68-83.
34 UniProt (2016) Available from: http://www.uniprot.org (Accessed on: January 05, 2018).
35 PDB Bind (2014) Available from: http://www.pdbbind-cn.org (Accessed on: January 05, 2018).
36 Halgren TA (2009) Identifying and characterizing binding sites and assessing druggability. J Chem Inf Model 49: 377-389.
37 Doré AS, Okrasa K, Patel JC, Serrano-Vega M, Bennett K, et al. (2014) Structure of class C GPCR metabotropic glutamate receptor 5 transmembrane domain. Nature 511: 557-562.
38 Wang C, Wu H, Katritch V, Han GW, Huang XP, et al. (2013) Structure of the human smoothened receptor bound to an antitumour agent. Nature 497: 338-343.
