[Frontiers in Bioscience 16, 2289-2306, June 1, 2011]

Protein-Ligand docking

Giovanni Bottegoni

Department of Drug Discovery and Development, Istituto Italiano di Tecnologia, via Morego n.30 Genova, 16163, Italy

TABLE OF CONTENTS

1. Abstract
2. Introduction
3. Protein-ligand docking flowchart
3.1. Receptor structure selection
3.2. Binding pocket representation
3.3. Binding pocket composition
3.4. Ligand conformational search
3.4.1. Deterministic algorithms
3.4.2. Stochastic algorithms
3.4.3. Simulation methods
3.5. Receptor flexibility
3.5.1. Indirect methods
3.5.2. Local variants generation
3.5.3. Multiple receptor conformations docking
3.6. Scoring
3.6.1. Force-field-based scoring functions
3.6.2. Empirical scoring functions
3.6.3. Knowledge-based scoring functions
3.6.4. Consensus scoring
4. Compare docking protocols
5. Future perspectives
6. Conclusions
7. References

1. ABSTRACT

Ligand-docking is an established computational technique universally applied in structure-based drug design. Since the first attempts carried out in the early '80s to predict the three-dimensional conformation of a protein-ligand bound complex, this methodology has evolved constantly and it is presently implemented in many different ways. The present study aims at explaining the standard protein-ligand docking protocol, together with its main advantages and drawbacks. Milestone reports and future directions are reported and discussed as well.

2. INTRODUCTION

Since the beginning of the 20th Century, recognition at the molecular level has been considered a fundamental step in all biologically relevant processes. Emil Fischer was the first to describe enzyme-substrate interactions using the 'lock and key' metaphor (1). A few years later, Paul Ehrlich went further, stating that "corpora non agunt nisi fixata" (drugs will not work unless bound). Ehrlich was the first to openly challenge the idea that "corpora non agunt nisi soluta" (drugs will not work unless in solution), which dated back to the Middle Ages (2). Since then, the cornerstone of modern medicinal chemistry has been the assumption that complementarity between a small organic molecule and its biological counterpart could explain the potency and the specificity of a drug. However, for many years, knowledge of molecular interactions was limited, difficult to exploit, and played only a marginal role in the quest for new candidates. Drug discovery projects focused on lead compounds selected as a result of random screenings or because of their striking resemblance to natural binders. Synthetic campaigns were carried out according to a simple protocol: i) a series of compounds was synthesized introducing a variety of substituents and decorations on a lead; ii) the obtained compounds were tested to estimate a measure of activity; iii) analyzing the experimental results, some simple structure-activity relationships could be gathered; and iv) these relationships were used to guide the next round of synthesis. The cycle was iterated several times until a promising drug candidate was isolated or the entire project was dropped because of a lack of interesting results. This strategy, which relied largely on synthetic preferences and chemical intuition, was very inefficient and almost impossible to optimize further.
A rational approach to drug design began in the 1970s, when: i) advances in molecular biology moved the focus of early experimental tests from cell lines and animals to purified proteins, ii) workstations of unprecedented computational power and storage capability came onto the market, and iii) an ever-increasing number of experimentally solved high resolution protein structures became publicly available (3). Since then, Computer-Assisted Drug Design (CADD) has become a key part of almost every drug discovery program. This is mainly because it is fairly accurate, constantly improving, and, at the same time, faster and much cheaper than in vitro experimental setups like combinatorial synthesis and high-throughput screening (4). CADD can be divided into two main branches, depending on whether the coordinates of the receptor are available or not (5). In the latter case, the so-called Ligand-Based Drug Design (LBDD) methods build predictive models by analyzing the chemical and pharmacological features of molecules of known activity. Putative drug candidates that, according to the model, fit the proposed profile of activity are retrieved by mining databases of compounds or synthesizing new molecules from scratch (6-7). When the receptor's three-dimensional (3D) structure is available and can be used to predict ligand-receptor interactions, a Structure-Based Drug Design (SBDD) approach becomes feasible. The general idea behind this approach is that all the information necessary for building a tightly interacting ligand is already contained in the 3D structure of the target. SBDD methods can be further classified into three different groups: manual structure matching, de novo design, and molecular docking. The first group is only interesting from a historical perspective: those early attempts were mostly based on interactive exercises carried out on graphical workstations. 
Despite a limited number of successful applications, they were considered too time-consuming and too dependent on the user's instinct to be of truly practical use (8). De novo design methods are based on the assumption that novel highly potent molecules can be produced by growing them directly in the receptor binding site. Molecular fragments are first positioned independently and then joined to form bigger molecules according to the principle of local optimization. Implementations of this basic strategy differ in the way the building fragments are defined and linked, in the way the fitness of the created molecules is evaluated, and in the strategy adopted to efficiently browse the chemical space to avoid a combinatorial explosion. De novo design methods have been thoroughly discussed and compared in several recent reviews, to which the interested reader is referred for further details (9-10). The third SBDD approach is molecular docking, which attempts to predict the structure of the intermolecular complex of a given ligand at the receptor binding site by generating and evaluating several conformational variants.

The first automatic docking algorithm was reported in 1982 by Kuntz and coworkers (11). In that pioneering implementation, the docking problem was simply addressed in terms of shape-matching between rigid bodies with no energy evaluation involved. Since then, many other docking programs have been published to provide more accurate predictions of the bound complexes. The representation of the molecules improved, taking full advantage of the ever-increasing computational power available. The typical docking protocol evolved from rigid-body simulations to include full ligand flexibility and, more recently, partial receptor plasticity. Another fundamental advance was the introduction of scoring schemes that went beyond simple shape complementarity to rank the solutions. Some important milestones in the field are summarized in Table 1.

It is important to note that the general definition of molecular docking covers a heterogeneous ensemble of approaches that vary significantly depending on the chemical and biochemical nature of the ligands and receptors. Researchers have used molecular docking methods to predict complexes formed between proteins (12-13), proteins and nucleic acids (14), and nucleic acids and small molecules (15). In this chapter, I will only discuss docking protocols used to predict the binding mode of a drug-like compound at the binding site of a protein, because of their relevance in SBDD. The chapter is organized around the general outline of a typical docking exercise. A detailed description of several conformational-searching approaches and scoring schemes used in well-known docking programs will be provided, highlighting their strengths, weaknesses, and open issues. Finally, I will discuss the future directions of the technique together with applications reported in the literature.

3. PROTEIN-LIGAND DOCKING FLOWCHART

The basic idea of predicting the bound pose of a small organic ligand at the target binding site has been used in many different ways. However, we can sketch a common outline of the procedure (see Figure 1). In docking, several problems are addressed sequentially, with each step introducing a new layer of complexity (16-17).

3.1. Receptor structure selection

The predictive power of a docking procedure depends strongly on the quality of the receptor model which, in turn, is affected by the accuracy of the atomic coordinates. Information about a protein's 3D shape comes mainly from experimental structures solved by X-ray diffraction or NMR spectroscopy (18). Currently, over 63,000 X-ray structures are publicly available in the Protein Data Bank (PDB) (19). Crystallographic coordinates within the threshold of 2.5 Å of nominal resolution are usually considered very faithful representations of protein conformations. However, this assumption is not entirely safe, since even high resolution structures can have specific regions where the atomic fit into the electron density map is rather poor. For this reason, the choice of the receptor structure should not be based solely on the resolution but complemented with other metrics such as the Rfree, the diffraction-component precision index, and the B-factors (20-22). NMR spectroscopy returns low resolution structures and can only be applied to comparatively small proteins. However, NMR spectroscopy, working in solution, can provide a more natural model of the receptor's native state. In some specific cases, NMR conformers can be used as an intuitive representation of protein flexibility (23-24). When an experimentally solved structure is not available, the receptor can be obtained by comparative modeling if sufficient sequence similarity exists (25-26). Successful docking experiments on homology models have been carried out on many different proteins including, but not limited to, protein kinases, hormone receptors, and G-protein coupled receptors (27-29).

3.2. Binding pocket representation

There are three different ways to translate atomic coordinates into a representation of the receptor (30). The most intuitive approach is to express the system in a fully atomistic fashion that explicitly accounts for all the atoms of the ligand and the exposed atoms of the receptor. This approach strongly relies on molecular mechanics force fields to describe atomic radii and charges. Despite being very accurate, an all-atom system is very computationally demanding. This is because the number of interactions to be calculated scales as O(N²), where N is the total number of atoms. At present, all-atom representations are only used during the final rescoring steps to increase the overall accuracy of the procedure. Thanks to the work of Lee, Richards, and Connolly (31-32), a system can also be represented in terms of interacting molecular surfaces. The solvent-excluded surface is obtained by rolling a spherical probe, which represents a water molecule, on the exposed atoms and then merging the regions of the van der Waals spheres that come in contact with the probe. When flexibility is involved, surface-matching approaches are considered quite impractical and they have been almost completely abandoned except in rigid protein-protein docking. For a detailed description of Connolly surface implementation in ligand docking, see the review by Halperin and colleagues (33). The method most commonly used to describe the receptor is through a set of pre-computed potential grids, according to the methodology outlined by Goodford in 1985 (34). Storing pre-calculated potential energies arising from interactions between a chemical probe and the receptor, these regularly spaced 3D lattices allow a rapid evaluation of ligand-bound conformations. A basic receptor description can be obtained with just two lattices accounting for van der Waals and electrostatic potentials. However, depending on the specific implementation, contributions from other probes can be mapped as well.
Three examples of receptor representations are reported in Figure 2.
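The grid idea can be illustrated with a minimal sketch. It is not the implementation of any specific docking program: the function names, the bare Coulomb-like probe term (arbitrary units, no dielectric model), and the trilinear interpolation scheme are all assumptions made for illustration. The costly receptor sum is carried out once per lattice point, after which each ligand atom is scored by a cheap lookup.

```python
import numpy as np

def build_grid(receptor_xyz, receptor_q, origin, spacing, shape, min_dist=0.5):
    """Precompute a Coulomb-like potential for a unit probe charge on a regular lattice."""
    receptor_xyz = np.asarray(receptor_xyz, dtype=float)
    receptor_q = np.asarray(receptor_q, dtype=float)
    origin = np.asarray(origin, dtype=float)
    axes = [origin[i] + spacing * np.arange(shape[i]) for i in range(3)]
    gx, gy, gz = np.meshgrid(*axes, indexing="ij")
    points = np.stack([gx, gy, gz], axis=-1)                 # (nx, ny, nz, 3)
    # Distance from every lattice point to every receptor atom.
    d = np.linalg.norm(points[..., None, :] - receptor_xyz, axis=-1)
    d = np.maximum(d, min_dist)                              # avoid singularities
    return (receptor_q / d).sum(axis=-1)                     # (nx, ny, nz)

def interpolate(grid, origin, spacing, xyz):
    """Trilinear interpolation of the stored potential at an arbitrary point."""
    f = (np.asarray(xyz, dtype=float) - np.asarray(origin, dtype=float)) / spacing
    i0 = np.clip(np.floor(f).astype(int), 0, np.array(grid.shape) - 2)
    t = f - i0                                               # fractional offsets
    value = 0.0
    for dx in (0, 1):
        for dy in (0, 1):
            for dz in (0, 1):
                w = ((t[0] if dx else 1 - t[0]) *
                     (t[1] if dy else 1 - t[1]) *
                     (t[2] if dz else 1 - t[2]))
                value += w * grid[i0[0] + dx, i0[1] + dy, i0[2] + dz]
    return value

def score_ligand(grid, origin, spacing, ligand_xyz, ligand_q):
    """Sum of charge-weighted grid lookups over the ligand atoms."""
    return sum(q * interpolate(grid, origin, spacing, p)
               for p, q in zip(ligand_xyz, ligand_q))
```

Once the lattice is built, the cost of scoring a pose depends only on the number of ligand atoms, not on the size of the receptor. A van der Waals lattice could be stored and queried in the same way.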

3.3. Binding pocket composition

A standard docking procedure does not attempt to consider the whole receptor molecule but rather focuses on a very specific region, the so-called ligand-binding pocket. The binding region has to be defined in terms of both shape and composition. The most straightforward way to define the pocket shape is to select the region immediately surrounding a known ligand co-crystallized in complex with the receptor. Several algorithms have been reported that attempt to predict the pocket location if no holo structure is available, or if the aim of the study is to explore new (e.g. allosteric) spots. The predictive approaches are based on one of the following methods (or a combination of several): analysis of the protein surface and structure, energy profiling, prior knowledge of the substrate, or evolutionary conservation and sequence analysis (35). On average, the predictions are fairly accurate and they all agree that the largest detectable cavity usually corresponds to the binding site. The pocket definition greatly affects the quality of the docking results. If the pocket is too small or shifted from the real location, the accuracy of the docking prediction will be rather poor. If the pocket is too large, the success rate decreases according to its size (21-36). Once the boundaries of the pocket are established, it is important to define the binding site's composition. Usually, in protein crystal structures, the coordinates of hydrogen atoms cannot be solved. They are added to receptor models by purposely developed routines; the polar hydrogen atoms' orientation and the histidines' tautomerization states should be optimized to reflect the best hydrogen-bonding pattern. Furthermore, other elements should undergo energy optimization, including amide groups of glutamine and asparagine side chains whose exact orientation is quite hard to gather from diffraction data, side chains from regions that poorly fit into the electron density map, and atoms with high B-factors.
Cofactors and metal ions are considered to be part of the receptor cavity and should be included in the definition of the pocket. The role of water molecules is more controversial: some authors suggest that there is no real need for explicit water molecules since their presence can be approximated by cavities in a high distance-dependent dielectric constant (37). Other authors recently reported significant improvements in the quality of results if explicit water molecules are included in the binding site (38-39). A good rule of thumb is to include in the site definition only water molecules that bridge the receptor and a co-crystallized ligand, establishing specific interactions with two non-water molecules.
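The rule of thumb for water molecules can be sketched as a simple geometric filter. This is an illustrative assumption rather than a published protocol: the function name and the 3.5 Å contact cutoff are hypothetical, and a real setup would restrict the test to polar atoms and check hydrogen-bond geometry.

```python
import numpy as np

def bridging_waters(water_xyz, receptor_xyz, ligand_xyz, cutoff=3.5):
    """Return indices of water oxygens within `cutoff` of both receptor and ligand atoms."""
    water = np.asarray(water_xyz, dtype=float)
    receptor = np.asarray(receptor_xyz, dtype=float)
    ligand = np.asarray(ligand_xyz, dtype=float)

    def near(a, b):
        # True for each row of `a` that has at least one atom of `b` within the cutoff.
        d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
        return (d <= cutoff).any(axis=1)

    keep = near(water, receptor) & near(water, ligand)
    return np.flatnonzero(keep)
```

Only the waters returned by the filter would be retained in the site definition; all the others would be deleted before docking.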

3.4. Ligand conformational search

During the ligand conformational search, a searching algorithm generates a set of conformational variants of the ligand at the receptor-binding site. The earliest implementations considered the ligand as a rigid body and only sampled its roto-translational degrees of freedom (8). This approach had limited applicability because the conformation that a ligand adopts when bound at the binding site, although generally quite close to one, does not always correspond to an energy minimum sampled in solution (40). However, if ligand flexibility is considered, an accurate sampling of the conformational space quickly becomes too computationally demanding (41). This is because the number of possible conformations grows exponentially with the number of rotatable bonds. For this reason, flexible ligand docking protocols adopt different strategies to reduce the exponential dependency of the computational time on the size of the system. Sampling techniques are usually grouped into three main categories: deterministic, stochastic, and simulative methods (16-30-42). Herein follows a detailed discussion of a selection of historically and educationally relevant algorithms. The reader interested in a comprehensive list of reported docking protocols is referred to the meticulous research of Moitessier and colleagues (43).
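A short calculation makes the size of the problem concrete. The sketch below assumes, purely for illustration, that each rotatable bond is sampled on a uniform grid of torsion values (30° by default) and that roto-translational degrees of freedom are ignored.

```python
def n_conformations(n_rotatable_bonds, step_degrees=30):
    """Number of torsional states in an exhaustive systematic search."""
    if 360 % step_degrees != 0:
        raise ValueError("step_degrees must divide 360 evenly")
    states_per_bond = 360 // step_degrees
    return states_per_bond ** n_rotatable_bonds

print(n_conformations(5))    # 12**5  = 248832
print(n_conformations(10))   # 12**10 = 61917364224
```

Doubling the number of rotatable bonds from five to ten squares the number of states to be evaluated, which is why exhaustive enumeration is rarely attempted for drug-sized ligands.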

3.4.1. Deterministic algorithms

In a deterministic approach, the conformational sampling follows a series of steps that will always lead to identical results, if starting from the same state of the system. An exhaustive systematic search is the most basic and intuitive form of deterministic algorithm but, as previously explained, it faces the problem of a combinatorial explosion even when dealing with systems of relatively small size. Deterministic algorithms use heuristics and termination criteria to reduce the size of the conformational space. For example, in the incremental construction algorithm, a ligand is docked at the receptor-binding site in three steps: i) the ligand is divided into a rigid core and flexible fragments, ii) the rigid core is docked at the binding site, and iii) the reconstruction is completed sequentially by adding flexible parts. A well-known incremental construction method is the 'anchor and grow' searching strategy used in DOCK since version 4.0 (44). The ligand is split into fragments concentrically arranged in layers around a rigid anchor; each fragment corresponds to the atoms affected by the torsion of a rotatable bond. First, the anchor is docked using a geometric matching approach. Then, a layer of fragments is added, exploring the associated torsions, optimizing the generated partial poses, and pruning the less energetically favorable conformations. The reconstruction iterates expansion, optimization, and pruning steps for every layer. In order to escape local minima, the pruning strategy is tuned to preserve the diversity of the poses. This strategy is reported to be both accurate and computationally efficient. FlexX uses another incremental construction protocol (45). The rigid core, called the base fragment, is placed at the binding site, evaluating chemical interactions such as hydrogen bonds, salt bridges, and, partly, hydrophobic contributions. The flexible parts of the ligand are added, exploring several preferential values for each torsional angle. 
Structures that present internal clashes or overlaps with the receptor are eliminated while the remaining poses are subjected to a complete linkage hierarchical-clustering process to eliminate redundancy. The best solutions from each cluster are used to iterate the procedure. In Hammerhead (41), ligand fragments are docked into the binding site and those achieving the highest scores are used as 'heads' to guide the positioning of the rest of the molecule (the 'tail'). Newly generated poses are optimized by energy minimization. Recently, the original strategy used in Hammerhead has been revised, expanded, and included in Surflex (46-47).
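The expand-and-prune logic shared by incremental construction methods can be sketched on a toy system. The example below is not the algorithm of DOCK, FlexX, or Hammerhead: it grows a two-dimensional chain from a fixed anchor, the function name and the scoring callback are hypothetical, and the optimization and diversity-preserving steps described above are omitted.

```python
import math

def grow_poses(anchor, n_fragments, angles, score_fn, keep=5, bond=1.5):
    """Grow a 2D chain from `anchor`, keeping the `keep` best partial poses per layer."""
    # A partial pose is (list of atom positions, current heading in radians).
    poses = [([anchor], 0.0)]
    for _ in range(n_fragments):
        expanded = []
        for atoms, heading in poses:
            x, y = atoms[-1]
            for a in angles:                       # explore the allowed torsion-like states
                h = heading + a
                new_atom = (x + bond * math.cos(h), y + bond * math.sin(h))
                expanded.append((atoms + [new_atom], h))
        # Prune: lower scores are better, only the top `keep` survive.
        expanded.sort(key=lambda pose: score_fn(pose[0]))
        poses = expanded[:keep]
    return poses
```

With k allowed states per fragment and n fragments, the number of evaluations grows as roughly keep × k × n instead of k to the power n. The price is that a partial pose discarded early cannot be recovered, which is why the diversity of the retained poses matters.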

3.4.2. Stochastic algorithms

Stochastic algorithms address the ligand conformational sampling as an optimization problem, introducing probabilistic elements like random perturbations on selected parameters (48). Stochastic algorithms for docking can be divided into two groups: genetic algorithms and Monte Carlo implementations.

Genetic algorithms (GA) attempt to find the ligand pose that best fits at the receptor binding site by borrowing strategies (and vocabulary) from evolutionary biology and population dynamics. An initial population of conformations is created randomly, encoding the variables representing each degree of freedom in data structures called chromosomes. Each individual is evaluated according to its fitness for an objective function: a larger fitness corresponds to a greater chance of transmitting its genetic inheritance to the next generation. The better fitting offspring replace the least fit members of the previous generation. To avoid a premature convergence that might trap the system in a local minimum, variations at the chromosome level are introduced randomly in the population by genetic operators such as mutation and crossing-over (49). The Lamarckian GA used in AutoDock is a global evolutionary optimizer equipped with two-point crossing-over and point mutation operators along with a local search feature (50). Before reproduction, each individual undergoes an energy minimization step. Changes introduced locally by minimization are coded back into the chromosome and transmitted to the next generations. This GA was nicknamed 'Lamarckian' after the French biologist Jean Baptiste de Lamarck who introduced the idea (now replaced by Mendelian genetics) that inheritance of acquired traits improves the adaptation of a species to its habitat. GOLD is another docking software whose sampling engine relies on a GA (51). In this case, the evolutionary process does not take place in a single large population. GOLD simulates a distributed environment where multiple subpopulations are handled simultaneously. This scheme, known as the 'Islands model', assumes that each population breeds and evolves separately. However, individual exchanges from one island to another do happen. 
Migration, a third genetic operator that complements mutation and crossing-over, controls the exchange rate. Population diversity is also preserved by the concept of 'nicheing': two or more individuals share the same niche if the distances between their chromosomes lie within a given threshold. When new individuals join a population, either by breeding or by migration, they replace the least fit individual in their niche rather than in the entire population. The success rate of GAs strongly depends on the quality of the fitting function and the fine-tuning of several parameters (the size of the initial population, crossing-over and mutation rates, number of generations, etc.). If the overall setup creates an adequate evolutionary pressure, successive generations will likely provide at least one individual which represents an optimal bound conformation.
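A Lamarckian GA can be sketched on a toy objective function. The code below is a simplified illustration and not the AutoDock implementation: the population handling, the crude random-move local search, the selection scheme, and all parameter values are assumptions. It shows the defining step, in which locally optimized genes are written back into the chromosome before breeding.

```python
import random

def local_search(genes, energy, step=0.1, iters=20, rng=random):
    """Crude local minimiser: accept small random moves that lower the energy."""
    best, best_e = list(genes), energy(genes)
    for _ in range(iters):
        trial = [g + rng.uniform(-step, step) for g in best]
        e = energy(trial)
        if e < best_e:
            best, best_e = trial, e
    return best, best_e

def lamarckian_ga(energy, n_genes, pop_size=30, generations=40,
                  mutation_rate=0.1, bounds=(-3.14, 3.14), seed=0):
    """Minimise `energy` over real-valued chromosomes of length `n_genes`."""
    rng = random.Random(seed)
    lo, hi = bounds
    pop = [[rng.uniform(lo, hi) for _ in range(n_genes)] for _ in range(pop_size)]
    best, best_e = None, float("inf")
    for _ in range(generations):
        # Lamarckian step: the minimised genes replace the original ones.
        scored = [local_search(ind, energy, rng=rng) for ind in pop]
        scored.sort(key=lambda pair: pair[1])
        if scored[0][1] < best_e:
            best, best_e = list(scored[0][0]), scored[0][1]
        # The fitter half survives and breeds; the rest is replaced by offspring.
        parents = [genes for genes, _ in scored[: pop_size // 2]]
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.sample(parents, 2)
            i, j = sorted(rng.sample(range(n_genes + 1), 2))   # two-point crossover
            child = a[:i] + b[i:j] + a[j:]
            child = [rng.uniform(lo, hi) if rng.random() < mutation_rate else g
                     for g in child]                           # point mutation
            children.append(child)
        pop = parents + children
    return best, best_e
```

In a docking setting the genes would encode translation, orientation, and torsion angles, and `energy` would be the scoring function. Here any callable that maps a list of floats to a number can be used.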

Monte Carlo (MC) implementations apply to the docking problem the general idea of importance sampling, namely the Monte-Carlo-based algorithm conceived by Metropolis and colleagues (52). A random conformation of the ligand is docked at the binding site and, after minimization, the energy score is evaluated. Then, a random change in one or more variables is introduced. The new conformation is minimized and scored again. If the estimated energy is lower than the previous one, the new pose is automatically accepted. If the energy is higher, the Metropolis criterion, a probabilistic test based on a temperature-dependent exponential Boltzmann function, is applied:

$$P = \exp\left(-\frac{\Delta E}{k_B T}\right) \qquad \text{(Eq. 1)}$$

where $\Delta E$ is the energy difference, $k_B$ is the Boltzmann constant, and $T$ is the temperature of the system. If a randomly generated number between 0 and 1 is lower than the Boltzmann factor, the test is passed and the new conformation is accepted. Otherwise, the new conformation is rejected. This process is iterated until the requested number of cycles is performed. The Internal Coordinates Mechanics (ICM) (53-54) adopts a biased probability stochastic optimizer as a docking engine. The Cartesian coordinates of the system are translated into internal coordinates. Roto-translational variables are sampled, simulating a pseudo-Brownian motion. The sampling of the torsional degrees of freedom is biased toward high probability regions according to a Gaussian distribution. A history term keeps track of the visited regions of the conformational space to help the system escape minima already explored, driving it toward new ones. Before acceptance or rejection according to the Metropolis criterion, a conjugate-gradient minimization is applied to the generated poses. Glide (Grid-based LIgand Docking with Energetics) (55) is based on a funnel-shaped docking strategy that combines elements from database filtering, systematic search and incremental construction. The rigid core of a ligand is docked at the receptor binding site where the pre-calculated conformations of each rotatable group are evaluated. After a grid-based energy evaluation, MC plays an important role in further optimizing the top scoring conformations to properly orient the more flexible parts. Other docking tools that use MC searches include ProDock (56) and MCDock (57).
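The acceptance test and the iteration over cycles can be sketched as follows. This is a generic Metropolis search on a toy energy function, not the engine of ICM, Glide, or any other program: the function names, the step size, and the energy units (kcal/mol) are assumptions, and the minimization applied after each random change is omitted for brevity.

```python
import math
import random

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)

def metropolis_accept(delta_e, temperature, rng=random):
    """Accept downhill moves always, uphill moves with probability exp(-dE / (kB*T))."""
    if delta_e <= 0:
        return True
    return rng.random() < math.exp(-delta_e / (K_B * temperature))

def monte_carlo_search(energy, start, n_cycles=2000, step=0.3,
                       temperature=300.0, seed=0):
    """Random-walk search over a list of variables; returns the best state visited."""
    rng = random.Random(seed)
    current, current_e = list(start), energy(start)
    best, best_e = list(current), current_e
    for _ in range(n_cycles):
        trial = list(current)
        i = rng.randrange(len(trial))            # perturb one randomly chosen variable
        trial[i] += rng.uniform(-step, step)
        trial_e = energy(trial)
        if metropolis_accept(trial_e - current_e, temperature, rng):
            current, current_e = trial, trial_e
            if current_e < best_e:
                best, best_e = list(current), current_e
    return best, best_e
```

Raising the temperature makes uphill moves more likely to be accepted, which helps the search leave local minima at the cost of slower convergence.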

3.4.3. Simulation methods

Molecular Dynamics (MD) simulations have been used as docking tools to only a limited extent. In the docking framework, the main limitations of MD are the inability to cross high energy barriers and the amount of calculation time required to perform the simulations (30). Moreover, the final quality of the results is strongly affected by the initial conformation of the system (58). Different solutions have been proposed to more accurately and efficiently explore the energy surface. In MD Docking, receptor, ligand, and solvent are treated at different temperatures, coupling separate regions of the system with different thermal baths (59). Other authors have tried multicanonical MD simulations where the sampling is performed in an artificially flat energy distribution (60). Enhanced sampling methods have also been applied to the docking problem. For example, the protocol proposed by Gervasio et al., the so-called metadynamics protocol, explores the properties of multidimensional free energy surfaces of complex many-body systems using coarse-grained non-Markovian dynamics in the space defined by a few collective coordinates (61). A history-dependent potential term fills the minima in the free energy surface, allowing its efficient exploration and accurate determination as a function of the collective coordinates. Metadynamics is able not only to reproduce a docked pose but also to mimic a ligand exiting or entering a target active site (62).
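The role of the history-dependent term can be illustrated with a deliberately simplified caricature. The sketch below is not metadynamics as used in MD codes: the dynamics are replaced by a deterministic greedy walker on a one-dimensional collective coordinate, the free energy is a hypothetical double well, and all parameter values are assumptions. It only shows how Gaussian hills deposited at visited positions fill a minimum until the walker crosses the barrier.

```python
import math

def double_well(s):
    """Toy free energy: minima at s = -1 and s = +1, barrier of height 1 at s = 0."""
    return (s * s - 1.0) ** 2

def bias(s, centers, height, width):
    """History-dependent potential: sum of Gaussians centred on visited positions."""
    return sum(height * math.exp(-((s - c) ** 2) / (2.0 * width ** 2))
               for c in centers)

def biased_walk(free_energy, s0, n_steps=2000, deposit_every=10,
                height=0.5, width=0.2, step=0.05):
    """Greedy downhill walker on free_energy + bias; returns (trajectory, hill centres)."""
    s, centers, trajectory = s0, [], [s0]
    for i in range(1, n_steps + 1):
        if i % deposit_every == 0:
            centers.append(s)                 # deposit a hill where the walker sits
        candidates = (s, s - step, s + step)
        s = min(candidates,
                key=lambda x: free_energy(x) + bias(x, centers, height, width))
        trajectory.append(s)
    return trajectory, centers
```

Without the bias (`height=0.0`) the walker started at s = -1 never leaves the left well. With the bias switched on, the accumulated hills progressively cancel the underlying free energy and push the walker over the barrier.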

3.5. Receptor flexibility

The first model used to explain protein-ligand binding described the event as an interaction between two rigid bodies. The 'lock and key' idea was then replaced by the Induced Fit theory proposed by Koshland: after binding, the ligand modifies the binding pocket to increase its fitness (63). In other words, after binding, the receptor is forced to adopt a conformation which would not exist without the ligand. The Induced Fit paradigm was, in turn, recently superseded by the Conformational Ensemble model (64-65). In this view, proteins naturally exist as an ensemble of interconverting states. The native conformation of a protein is actually an average state resulting from a thermodynamic equilibrium of conformers. When a ligand preferentially binds and stabilizes a receptor variant far from the native state, it triggers a population shift. What is perceived as a local rearrangement of the binding pocket is actually a change in the thermodynamic equilibrium of the whole system. Until recently, almost every docking simulation froze the receptor conformation. Now, a greater understanding of protein-ligand-binding dynamics has led to a gradual introduction of the receptor degrees of freedom in standard docking procedures. The biased strategy used to validate docking protocols probably helped minimize the role of receptor flexibility: when a new tool was proposed, it was usually tested using a re-docking exercise carried out on a set of co-crystals. In re-docking, a ligand is extracted from a holo structure and docked back in the cognate binding pocket. Since the receptor structure is perfectly adapted to accommodate the ligand, the results were quite accurate but, in retrospect, definitely inflated. The importance of protein flexibility became clear when ligands were docked at non-native binding sites (cross docking). In this more realistic representation of a real life scenario, the accuracy of standard docking programs dropped from over 90% to less than 50% (36-66). 
Figure 3 reports an example of how receptor flexibility can affect a cross docking attempt. Ideally, since even small changes in the binding pocket can considerably affect the final results, receptor and ligand should be sampled simultaneously in a global energy optimization attempt. This simulation has been described as 'solving the problem of protein folding by adding a ligand'. Because of the very high number of degrees of freedom involved, it would be exceedingly long and would deal with a free energy surface so rugged that convergent results would probably not be obtained (67). The pragmatic strategies adopted to study receptor plasticity can be divided into three main groups: indirect methods, local-variants-based protocols, and multiple receptor conformations docking (MRC).

3.5.1. Indirect methods

In indirect methods, receptor flexibility is accounted for implicitly by allowing a partial overlap between receptor and ligand atoms and smoothing the high energetic penalties thus generated. The idea of a 'soft' docking algorithm was first proposed by Jiang and Kim (68) and then exploited in several other accounts (69). Soft docking is a computationally efficient and straightforward way of implementing receptor plasticity if the rearrangements to be modelled are local and small.
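The effect of softening can be seen by comparing a standard 12-6 Lennard-Jones term with a soft-core variant. The soft-core functional form and the parameter values below are assumptions chosen for illustration and are not the scheme of Jiang and Kim or of any specific docking program.

```python
def lennard_jones(r, epsilon=0.2, sigma=3.5):
    """Standard 12-6 potential: diverges as r approaches zero."""
    x = (sigma / r) ** 6
    return 4.0 * epsilon * (x * x - x)

def soft_lennard_jones(r, epsilon=0.2, sigma=3.5, alpha=0.5):
    """Soft-core variant: finite at r = 0, reduces to 12-6 when alpha = 0."""
    x = sigma ** 6 / (alpha * sigma ** 6 + r ** 6)
    return 4.0 * epsilon * (x * x - x)

for r in (2.5, 3.5, 10.0):
    print(r, lennard_jones(r), soft_lennard_jones(r))
```

At a separation well inside the contact distance the soft term gives a mild penalty where the standard term gives a prohibitive one, so a pose with a small clash is not discarded outright. At long range the two curves are nearly identical.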

3.5.2. Local variants generation

The second group includes those methods that, during ligand sampling, introduce a concurrent exploration of some local degrees of freedom of the receptor. These strategies usually deal with torsional angles, which are less likely to dramatically alter the energy profile of the receptor, rather than with planar geometry variables. The torsional search has been reported to be more efficient and prone to converge if carried out in internal coordinates rather than in Cartesian space (70). Local searches can be limited to hydrogen atoms and lone pairs (51) or extended to side chains (71). In the latter case, the side chain flexibility is not modelled continuously but by means of rotamer libraries where the most energetically stable conformers of each amino acid are collected. The discrete nature of these libraries is a good compromise between computational efficiency and accurate results. Several authors have used a two-stage setup where rotamer evaluation coupled with tabu search techniques is followed by a local energy optimization to allow torsional values not originally included in the libraries (72-73). Meiler and Baker adapted the ROSETTADOCK (74) protein-protein docking algorithm to ligand docking (75). Again, side chains are explicitly sampled during a Monte Carlo optimization of the ligand. The novelty here is that the collected rotamer libraries are affected by the conformation of the protein backbone.

3.5.3. Multiple receptor conformations docking

Domain motions, extended loop transitions, and all the rearrangements at the backbone level are far beyond the capabilities of methods that model the binding pocket flexibility on the fly. When direct modelling of local conformational variants is not enough, a multiple receptor conformations (MRC) docking strategy can be attempted. In its most basic form, MRC is just a standard docking approach systematically iterated over an ensemble of receptor conformations. Each conformation is used in an independent simulation and the results are merged together during an additional post-processing step. Members of the ensemble can be collected from experimental structures, generated by computational means, or both (18). Experimental holo structures can be considered reliable representations of those receptor conformational space regions that promote the binding event. Conversely, in-silico-generated conformations can produce unprecedented rearrangements of the binding pocket and, therefore, enhance the possibility of discovering truly novel ligands. Barril and Morley's groundbreaking study of MRC docking (76) suggests that, although using a limited ensemble of selected conformations generally improves the quality of results, an indiscriminate inclusion of a large number of receptor variants in the simulation does not improve the overall performance and may actually be deleterious. In this regard, several strategies have been recommended to select, in advance, a subset of conformations that will most likely provide the best results when combined in an MRC protocol. MRC calculations are time-consuming, since the calculation requirements scale linearly with the number of structures, and they entail a high level of user intervention during post-processing. In order to overcome these two main drawbacks, automatic MRC approaches have been reported, mainly as extensions and adaptations of standard docking engines. 
In 1997, Knegtel and coworkers (77) proposed an MRC protocol based on DOCK3.5 (78). Separate complements of grids, each one describing a single receptor conformation, were merged into an average model. Huang and Zou (79) describe another MRC approach developed by adapting DOCK (version 4.0) (80). The weighted average method was also applied to several customized versions of AutoDock (81). In particular, the idea of average grids was further improved by introducing Boltzmann weights based on energetic differences (66). In FlexE (82-83), which was developed starting from FlexX, a united protein description is provided: after superimposition, the regions of the receptor conformers that display structural variations are combinatorially merged to generate new states which, in turn, are later used alongside the original structures. MRC studies using Glide (84) and ICM (85-86) were also reported. In particular, the Four-Dimensional docking algorithm considers multiple receptor conformers as an extra dimension of the ligand search space. The procedure combines the accuracy of an MRC implementation with the speed of a single conformer docking. Recently, several algorithms purposely devised for MRC calculations were reported. FITTED (87) is based on a genetic algorithm whose operators can describe receptor flexibility either jumping among different protein conformations from the set (semi-flexible run) or rearranging side chains and backbone variables independently. FLIPDock (88) simultaneously codes ligand and receptor motions in a high level data structure, the Flexibility Tree, originally developed to describe conformational subspaces of macromolecules and later adapted to the docking problem. One of the most interesting features of the Flexibility Tree is that experimental evidence and biological expertise can be included quite easily in the flowchart. 
In gapped models (36,89), an ensemble of receptor variants is generated by deleting different parts of the receptor, typically converting one or more binding pocket side chains into alanine. Empty spaces allow an initial positioning of the ligand, avoiding severe steric clashes. The receptor is then returned to the ungapped state and the complex undergoes geometrical optimization before rescoring. Gapped models combine features from MRC and local variant optimization techniques.
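The idea of merging per-conformation grids with Boltzmann weights, mentioned above for the AutoDock-based protocols, can be illustrated with a minimal sketch. It assumes that each receptor conformation contributes one interaction energy per grid point and that the weights are computed point by point; the function name, the flat-list grid layout, and the value of kT are assumptions of this example, not details taken from the cited implementations.

```python
import math

KT_KCAL_PER_MOL = 0.593  # roughly RT at 298 K; an assumption of this sketch


def boltzmann_weighted_grid(grids, kt=KT_KCAL_PER_MOL):
    """Merge per-conformation energy grids into one grid, point by point.

    grids: list of equally sized lists of energies (kcal/mol),
           one list per receptor conformation (hypothetical layout).
    """
    n_points = len(grids[0])
    if any(len(g) != n_points for g in grids):
        raise ValueError("all grids must have the same number of points")

    combined = []
    for p in range(n_points):
        energies = [g[p] for g in grids]
        e_min = min(energies)  # shift energies to keep exp() well behaved
        weights = [math.exp(-(e - e_min) / kt) for e in energies]
        z = sum(weights)
        combined.append(sum(w * e for w, e in zip(weights, energies)) / z)
    return combined


# Two conformations, two grid points: a clash (5.0) in the first conformation
# is almost entirely discounted in favor of the second one (-3.0).
merged = boltzmann_weighted_grid([[-1.0, 5.0], [-1.0, -3.0]])
```

Compared with a plain arithmetic mean, which would give 1.0 at the second grid point, the weighted value stays close to the favorable energy, so a single clashing conformer does not mask a pocket that is open in another conformer.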

The Relaxed Complex Method (RCM) developed by McCammon's group is a good example of how an advanced MRC protocol can contribute to the success of rational drug design efforts (90). In RCM, MRC docking is complemented with other computational techniques in an advanced protocol that provides a reliable prediction of a ligand-binding mode. First, the receptor flexibility is explored in a fully atomistic fashion using long MD simulations. Plain MD or enhanced sampling strategies can be used to sample the receptor conformational space. Several snapshots are extracted from the trajectory and used as receptor conformers. In the first applications of RCM, snapshots were extracted at equal time intervals, while later implementations rely strongly on advanced cluster analysis algorithms to eliminate conformational redundancy and to reduce the computational burden (91). The docking step is accomplished using the AutoDock Lamarckian GA, which takes full advantage of the improved desolvation term introduced in version 4.0 (92). The most promising poses are rescored with a customized implementation of MM-PBSA, an end-point free energy assessment approach (93-94). In standard MM-PBSA, an MD simulation of the bound complex is performed to calculate, according to molecular mechanics (MM), the contribution of direct ligand-receptor interactions. The solvation energy is decomposed into electrostatic and non-polar components: the electrostatic contribution is obtained by solving the Poisson-Boltzmann (PB) equation in a continuum solvent model, while the non-polar effect is estimated according to the surface area (SA) accessible to the solvent (95). In RCM, MM-PBSA is modified to include the unfavorable entropic contribution to the binding event due to the loss of roto-translational and conformational entropy (96). 
The reference states for the unbound protein and ligand are extracted from the docked complex trajectory: what may appear as an oversimplification significantly reduces computational requirements, decreases convergence issues, and introduces only negligible variations in the final outcome. RCM has successfully helped in the search for novel inhibitors of HIV Integrase (97), Kinetoplastid RNA Editing Ligase 1 (KREL1) of T. brucei (91), AChBP (98), and MMP-2 (99).
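The end-point arithmetic behind this rescoring step can be sketched as follows. The example assumes that, for each snapshot of the complex trajectory, the sum of the MM, PB, and SA terms has already been computed for the complex and for the receptor and ligand extracted from that same snapshot. The dictionary keys, the function name, and the numbers are hypothetical, and the entropic penalty is passed in as a single precomputed value.

```python
from statistics import mean


def mmpbsa_single_trajectory(snapshots, minus_t_delta_s=0.0):
    """Average end-point binding free energy estimate (kcal/mol).

    snapshots: list of dicts with keys 'complex', 'receptor', 'ligand';
               each value is E_MM + G_PB + G_SA for that species, with the
               receptor and ligand taken from the complex snapshot itself.
    minus_t_delta_s: unfavorable entropic term (-T*dS), assumed precomputed.
    """
    per_snapshot = [
        s["complex"] - s["receptor"] - s["ligand"] for s in snapshots
    ]
    return mean(per_snapshot) + minus_t_delta_s


# Illustrative numbers only, not taken from any published calculation.
frames = [
    {"complex": -150.0, "receptor": -100.0, "ligand": -20.0},
    {"complex": -154.0, "receptor": -101.0, "ligand": -21.0},
]
estimate = mmpbsa_single_trajectory(frames, minus_t_delta_s=12.0)
```

Because the unbound states come from the same frames as the complex, the intramolecular terms of receptor and ligand cancel exactly within each snapshot, which is one reason this simplification tends to converge faster than running three separate simulations.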

3.6. Scoring

In the last stage of a docking protocol, the poses retrieved during sampling need to be evaluated in terms of interaction energy with the receptor. The quantitative estimate of the binding affinity is usually reported in terms of the Gibbs free energy difference (ΔGbind) between receptor (R) and ligand (L) in their unbound state and the complex (RL) formed upon binding. From the statistical thermodynamics point of view, the theoretical frame for evaluating the free energy of binding is well established (100-101). Receptor-ligand association is usually regarded as an event that combines enthalpic and entropic effects. There is an electrostatic component that accounts for basic interactions, such as H-bond formation and the Coulombic attraction/repulsion among charges, as well as for higher-order contributions such as dipole-dipole interactions. Shape complementarity is also accounted for by van der Waals interactions. In physiological conditions, the binding event takes place in solvent and, for this reason, the contributions of the solvation and desolvation of hydrophobic surfaces must be considered too. When bound, receptor and ligand can only adopt a narrower range of conformations as compared to the unbound states, which decreases the entropy of the system. Finally, the ligand could adopt strained conformations, directly increasing the system's potential energy. All these contributions are expressed by ΔGbind which, in turn, is related to the equilibrium binding constant Keq according to the following equation:

$$\Delta G_{bind} = -RT \ln K_{eq} \qquad \text{(Eq. 2)}$$

where R is the gas constant and T is the absolute temperature of the system.
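As a worked illustration of the relation between ΔGbind and Keq, the short sketch below converts a dissociation constant into a binding free energy. It assumes a 1 M standard state, takes Keq as the association constant (the reciprocal of Kd), and uses 298.15 K as a default temperature; the function names are choices made for this example.

```python
import math

R_KCAL_PER_MOL_K = 1.987204e-3  # gas constant in kcal/(mol*K)


def delta_g_from_keq(k_eq, temperature=298.15):
    """Binding free energy (kcal/mol) from an association constant (1 M standard state)."""
    return -R_KCAL_PER_MOL_K * temperature * math.log(k_eq)


def delta_g_from_kd(kd_molar, temperature=298.15):
    """Same quantity, starting from a dissociation constant in mol/L."""
    return delta_g_from_keq(1.0 / kd_molar, temperature)


dg_nanomolar = delta_g_from_kd(1e-9)   # about -12.3 kcal/mol
dg_tenfold_weaker = delta_g_from_kd(1e-8)
```

At room temperature each tenfold change in affinity corresponds to roughly 1.4 kcal/mol, which gives a sense of how small the energy differences are that a scoring function is asked to resolve.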

ΔGbind can be determined from first principles only for very simple systems (such as an ideal gas), which allow the solution of the system configuration integral. In real systems, the computational determination of ΔGbind can be carried out at different levels of approximation, with more accurate methods also being more demanding in terms of CPU time. A simple classification can be attempted according to the number of states of the system that the method considers in the calculation (102). In path methods, the free energy difference is calculated by considering the initial and the final states together with several unphysical intermediates. The path on the energy surface connecting the unbound to the bound state is too long to be computed in a single step, and such a simulation would hardly converge. However, if the overall path is split into smaller steps, as in the free energy perturbation (103) or the thermodynamic integration (104) techniques, the local energy differences can be calculated more practically and the total ΔGbind can be obtained by summation. Other path methods currently used in binding free energy estimation are computational alchemy (105) and metadynamics (106). These methods are very accurate, providing ΔGbind within the range of the experimental error (1-2 kcal/mol), but their computational cost hampers their use in standard docking protocols. In end-point methods, only the initial and the final states of the system are considered (93,107). The practical issue that has to be addressed here is that, in explicit solvent models, the free energy difference due to complex formation represents just a small fraction of the global energy difference between the two states, which is dominated by solvent contributions. For this reason, many end-point methods resort to implicit solvent descriptions to highlight solute contributions to ΔGbind. These strategies are much faster than path-based methods but still provide very accurate predictions. 
Several end-point methods have been successfully applied in docking protocols and one example will be discussed in greater detail later in this chapter.
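The stepwise summation used by path methods can be sketched with the exponential-averaging (Zwanzig) estimator that underlies free energy perturbation. The example assumes that, for each small step along the path, a list of sampled energy differences between neighboring states is already available; the function names and the value of kT are assumptions, and production protocols typically use more robust estimators.

```python
import math

KT_KCAL_PER_MOL = 0.593  # roughly RT at 298 K; an assumption of this sketch


def window_free_energy(delta_u_samples, kt=KT_KCAL_PER_MOL):
    """Free energy change for one step: dG = -kT ln < exp(-dU / kT) >."""
    shift = min(delta_u_samples)  # factor out the largest exponential
    avg = sum(
        math.exp(-(du - shift) / kt) for du in delta_u_samples
    ) / len(delta_u_samples)
    return shift - kt * math.log(avg)


def total_free_energy(windows, kt=KT_KCAL_PER_MOL):
    """Sum the per-step free energy changes along the whole path."""
    return sum(window_free_energy(w, kt) for w in windows)


# Two hypothetical steps with (unrealistically) constant energy differences.
path_total = total_free_energy([[0.5, 0.5, 0.5], [-0.2, -0.2]])
```

When the energy differences within a step fluctuate, the exponential average is dominated by the lowest values, so the estimate falls below the arithmetic mean of the samples. Keeping each step small limits those fluctuations, which is why splitting the path helps convergence.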

In the vast majority of docking approaches, ΔGbind is estimated using a simple scoring function that only considers the bound state (108). It has been noted that, for strong binders, this assumption can still provide an adequate description of the system because the bound conformation provides the main contribution to the partition function. Scoring functions are not expected to provide an accurate estimate of the binding free energy, but to identify those ligand poses that most closely resemble experimental binding modes. To date, over 35 different scoring functions have been reported (43). They introduce different approximations and linearly combine different terms (which are generally assumed to be independent) but they all share the main feature of being fast: they can provide an almost instantaneous, if somewhat rough, estimation of the binding free energy. Scoring functions are also used during the sampling step, whenever an energy evaluation is necessary (application of the Metropolis criterion, estimation of the fitness function in GA, comparative evaluation of conformers in incremental construction, etc.). So, technically, the final scoring step should be more properly referred to as re-scoring, since contributions to the free energy of binding ignored during sampling for the sake of speed can be included at this stage or an altogether different scoring function can be used. Several authors have classified the scoring functions into three main categories: force-field-based, empirical, and knowledge-based (16,30,109).

3.6.1. Force-field-based scoring functions

The bonded and non-bonded interactions among the atoms in the system are modeled according to the rules of molecular mechanics. A master equation provides the overall energy of the system, expressing different contributions with additive terms. The terms expressing the energy strain that the ligand displays in its bound conformation are described by a harmonic potential: the energy contribution of covalent bond stretching, valence angle bending, and torsions varies according to the deviation from a reference value. A Coulomb electrostatic potential describes the interactions between charges, while van der Waals attractive and repulsive energies are expressed by the Lennard-Jones potential. Equilibrium values for these terms are derived from force fields originally developed for molecular dynamics calculations. Since functional forms differ in just minor details among different force fields, only the equation of the AMBER (110) force field is reported as a general example:

Eq.3
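For orientation, the functional form commonly quoted for the AMBER force field can be written as below. The symbol names follow common usage and are assumptions of this sketch, so they may differ from the notation of the original Eq. 3. Note that in this form the torsional term is a periodic cosine series rather than a strictly harmonic function.

```latex
E_{total} = \sum_{bonds} K_r \,(r - r_{eq})^2
          + \sum_{angles} K_\theta \,(\theta - \theta_{eq})^2
          + \sum_{dihedrals} \frac{V_n}{2}\,\bigl[1 + \cos(n\phi - \gamma)\bigr]
          + \sum_{i<j} \left[ \frac{A_{ij}}{R_{ij}^{12}} - \frac{B_{ij}}{R_{ij}^{6}}
          + \frac{q_i q_j}{\varepsilon R_{ij}} \right]
```

Here r, θ, and φ are bond lengths, valence angles, and dihedral angles; the subscript eq marks reference values; A and B are Lennard-Jones coefficients; q are atomic partial charges; and ε is the dielectric constant.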

Due to the nature of the resulting energy landscape, a minimization step is required before the final energy evaluation (109). The main limitation of force-field-based scoring functions is that contributions to binding such as desolvation effects and configurational entropy loss are either completely overlooked or introduced in the final score by heuristics. Several authors suggest that the accuracy of force-field-based scoring functions can be increased by tuning the Lennard-Jones potential with different exponents (111). In fact, in its standard 6-12 implementation, this term is extremely sensitive to even small deviations in atomic coordinates and can produce a large amount of noise in intermolecular energy calculations. Another way to improve accuracy is to work on the electrostatic potential. Force fields that explicitly account for polarization effects on the atomic charges, as well as distance-dependent dielectric constants to model solvation, have recently been adopted in docking protocols (112). D-Score (111), G-Score (111), and the scoring function used in AutoDock (92) are all examples of force-field-based scoring functions.
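The sensitivity of the standard 6-12 form, and the effect of softer exponents, can be checked with a few lines of code. The sketch below uses a generalized n-m form normalized so that every exponent pair has the same well depth at the same equilibrium distance. The parameter values (r0 = 3.8 Å, ε = 0.2 kcal/mol) and the 8-4 alternative are illustrative assumptions, not values from the cited work.

```python
def lennard_jones(r, r0=3.8, eps=0.2, n=12, m=6):
    """Generalized n-m Lennard-Jones energy with minimum -eps at r = r0."""
    x = r0 / r
    return eps / (n - m) * (m * x**n - n * x**m)


# At the equilibrium distance both forms give the same well depth.
at_minimum_12_6 = lennard_jones(3.8)
at_minimum_8_4 = lennard_jones(3.8, n=8, m=4)

# At 80% of the equilibrium distance the 12-6 wall is several times steeper.
clash_12_6 = lennard_jones(3.04)
clash_8_4 = lennard_jones(3.04, n=8, m=4)
```

A displacement of well under 1 Å turns a favorable contact into a penalty several times larger than the well depth in the 12-6 form, which is the source of the noise mentioned above; the softer form tolerates the same displacement with a much smaller penalty.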

3.6.2. Empirical scoring functions

Empirical scoring functions are superficially similar to force-field-based implementations as they are built on the idea that a binding score can be described through a linear summation of independent terms. Empirical terms may vary among scoring functions but they are usually simpler than their force field counterparts, can be calculated very easily, and are statistically weighted before summation. The weights are determined by regression analysis from a training set of known ligand-receptor co-crystals whose binding free energy was experimentally determined. What is not clearly established, and is usually regarded as the main limitation of this kind of scoring method, is their predictive power when confronted with putative ligand-receptor complexes that are radically different from those used in the training set. Several empirical scoring functions have been reported, some developed and used in tight combination with a specific docking software (GlideScore in Glide (55), or the scoring functions used in ICM (113) and FlexX (45)), others provided as standalone routines, like ChemScore (114), HINT (115), and VALIDATE (116). A very appealing feature of empirical schemes is the possibility of devising customized weights in order to tailor the function toward a specific protein class. F-Score (45), one of the scoring functions implemented in FlexX and largely based on the work of Böhm (117), represents a standard example of the empirical approach. The binding score is calculated by combining simple terms that account for hydrogen bonds (hbond), ionic interactions (ionic), the number of torsional angles (Nrot), and protein-ligand lipophilic contacts (lipo).

Eq.4
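As a sketch of the general shape of such a function, a Böhm-type sum with the terms named in the text can be written as below. The coefficient symbols are assumptions of this example and play the role of the regression weights; they are not necessarily the notation of the original Eq. 4.

```latex
\Delta G = \Delta G_{0}
         + \Delta G_{rot}\, N_{rot}
         + \Delta G_{hbond} \sum_{hbond} f(\Delta R, \Delta\alpha)
         + \Delta G_{ionic} \sum_{ionic} f(\Delta R, \Delta\alpha)
         + \Delta G_{aro} \sum_{aro} f(\Delta R, \Delta\alpha)
         + \Delta G_{lipo} \sum_{lipo} f^{*}(\Delta R)
```

Each sum runs over the interactions of the given type found in the pose, and f (or f* for lipophilic contacts, which depend on distance only) scales a contribution down as its geometry departs from the ideal one.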

The functional form f(ΔR,Δα) of each term penalizes deviations from ideal geometries. With respect to the previous work (118), the main novelty introduced in F-Score is the calculation of lipophilic interactions as a sum over all pairwise interatomic contacts. Moreover, the scheme introduces a specific term for aromatic contributions (aro). The K weights were calibrated by regression using a set of 45 experimentally determined binding affinities from protein-ligand co-crystals.
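A minimal sketch of how such a weighted sum might be evaluated is given below. The piecewise-linear penalty, the tolerance values, and every weight in `WEIGHTS` are illustrative placeholders chosen for this example; they are not the calibrated F-Score parameters.

```python
def ramp(deviation, ideal_tol, max_tol):
    """1.0 within ideal_tol, 0.0 beyond max_tol, linear in between."""
    d = abs(deviation)
    if d <= ideal_tol:
        return 1.0
    if d >= max_tol:
        return 0.0
    return 1.0 - (d - ideal_tol) / (max_tol - ideal_tol)


def geometry_penalty(delta_r, delta_alpha):
    """f(dR, dAlpha): product of a distance and an angle penalty (assumed form)."""
    return ramp(delta_r, 0.2, 0.6) * ramp(delta_alpha, 30.0, 80.0)


# Placeholder weights; a real function would obtain them by regression.
WEIGHTS = {"const": 5.0, "rot": 1.5, "hbond": -5.0,
           "ionic": -8.0, "aro": -1.0, "lipo": -0.3}


def empirical_score(n_rot, hbonds, ionic, aromatic, lipo_contacts, w=WEIGHTS):
    """Linear sum of weighted terms; interaction lists hold (dR, dAlpha) pairs."""
    score = w["const"] + w["rot"] * n_rot
    score += w["hbond"] * sum(geometry_penalty(dr, da) for dr, da in hbonds)
    score += w["ionic"] * sum(geometry_penalty(dr, da) for dr, da in ionic)
    score += w["aro"] * sum(geometry_penalty(dr, da) for dr, da in aromatic)
    score += w["lipo"] * sum(ramp(dr, 0.2, 0.6) for dr in lipo_contacts)
    return score


ideal = empirical_score(2, [(0.0, 0.0)], [], [], [])
stretched = empirical_score(2, [(0.4, 0.0)], [], [], [])
```

With these placeholders, stretching the single hydrogen bond by 0.4 Å halves its contribution, so the score worsens smoothly instead of switching the interaction off abruptly.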

3.6.3. Knowledge-based scoring functions

A third strategy that can be adopted to provide a quantitative assessment of the binding energy is related to the idea of Potential of Mean Force (PMF) (119). The PMF formalism was originally developed for the statistical mechanics of liquids and later adapted to proteins in folding-related studies (104). In proteins, an analysis of the frequencies of interatomic contacts is carried out on a training set of crystal structures: the most favorable interactions should be located at the maxima of the frequency distributions. The contribution to the binding free energy of each atom pair is calculated according to the collected statistics and the total binding score is generated through a summation over all the interactions. On average, knowledge-based scoring functions achieve satisfactory performance, are very fast, and are comparatively easy to implement. Unlike the results of regression-based methods, the results they provide are not greatly dependent on the nature of the training set. However, knowledge-based scoring functions can lack physical rigor. In systems of identical particles in thermodynamic equilibrium, an n-particle correlation can be translated into a potential that gives an average force over all the configurations of the system. Atoms in protein-ligand complexes, however, are not identical, and a set of crystal structures cannot be considered a system in equilibrium. In 2000, Gohlke and colleagues developed and validated DrugScore, a knowledge-based scoring function that increased the accuracy of FlexX by up to 75% on a test set of over 150 proteins (120). DrugScore's main equation relates two different kinds of contributions: distance-dependent pair potentials between atoms of the ligand and atoms of the protein, and a one-body potential term that accounts for the surfaces in the two molecules (SAS0) that become buried upon complex formation (SAS). The preference ΔW for a specific ligand pose is expressed as:

Eq.5
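A form consistent with this description, combining a pairwise sum with two one-body surface terms, is sketched below. The mixing weight γ and the exact index notation are assumptions of this example and may differ from the original Eq. 5.

```latex
\Delta W = \gamma \sum_{k_i} \sum_{l_j} \Delta W_{i,j}(r)
         + (1 - \gamma) \left[ \sum_{k_i} \Delta W_{i}(SAS, SAS_0)
         + \sum_{l_j} \Delta W_{j}(SAS, SAS_0) \right]
```

The first term sums distance-dependent pair preferences over ligand-protein atom pairs; the bracketed terms score each ligand and protein atom according to its solvent-accessible surface before (SAS0) and after (SAS) complex formation.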

considering ki atoms of the ligand and lj atoms of the protein.

The role of hydrogen atoms and the entropic contribution to the binding free energy are considered implicitly in this kind of calculation. Among the other knowledge-based scoring functions that have been reported, SMoG2001 (121) and M-Score (122) are worth mentioning.
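The conversion of contact statistics into a pair potential can be illustrated with the inverse Boltzmann relation, in which a distance bin that is populated more often than a reference distribution receives a negative (favorable) value. The sketch below works in units of kT and assumes that histogram counts per distance bin are already available; the function name, the choice of reference, and the handling of sparsely populated bins are assumptions of this example.

```python
import math


def pair_potential(pair_counts, reference_counts, min_count=1):
    """Return dW(r) = -ln(g_pair(r) / g_ref(r)) for each distance bin, in kT units.

    pair_counts:      observed contacts per bin for one atom-type pair
    reference_counts: contacts per bin for the reference state (all pairs)
    """
    pair_total = sum(pair_counts)
    ref_total = sum(reference_counts)
    potential = []
    for n_pair, n_ref in zip(pair_counts, reference_counts):
        if n_pair < min_count or n_ref < min_count:
            potential.append(0.0)  # too few observations: treat as neutral
            continue
        g_pair = n_pair / pair_total
        g_ref = n_ref / ref_total
        potential.append(-math.log(g_pair / g_ref))
    return potential


# Hypothetical counts in three distance bins.
w = pair_potential([10, 60, 30], [100, 100, 100])
```

In this toy histogram the middle bin is over-represented with respect to the reference and therefore becomes the minimum of the potential, in line with the idea that favorable interactions sit at the maxima of the frequency distributions.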

3.6.4. Consensus scoring

The consensus approach is a very straightforward attempt to overcome the limitations of the currently available scoring functions. Rather than relying on a single method, several scoring functions are combined to evaluate the generated poses (123-124). This approach is reported to significantly enhance the accuracy of the results. The consensus approach is unlikely to outperform the most accurate of the scoring functions used. However, in a real life scenario, it is not possible to know in advance which scoring scheme is going to perform better or worse than the others. By using several of them, the below-average performance of a particular scoring function is less likely to affect the overall quality of the results (125). Attempting a different strategy to address the scoring problem, we considered the ensemble of poses as an actual collection of observed data to be dealt with using a statistical approach (126-127). The application of advanced clustering techniques significantly improved the accuracy of the docking results, establishing a significant correlation between cluster population and the presence of a near-native pose.
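One simple way to combine scoring functions is to average the rank that each function assigns to every pose. The sketch below is only one of several possible consensus schemes; it assumes that lower scores are better for every function, and the function labels, pose identifiers, and scores are hypothetical.

```python
def consensus_rank(scores_by_function):
    """Rank poses by their mean rank across several scoring functions.

    scores_by_function: dict mapping a scoring-function name to a dict
                        {pose_id: score}, where lower scores are better.
    Returns a list of (mean_rank, pose_id), best pose first.
    """
    rank_sums = {}
    for scores in scores_by_function.values():
        ordered = sorted(scores, key=scores.get)
        for rank, pose in enumerate(ordered, start=1):
            rank_sums[pose] = rank_sums.get(pose, 0) + rank
    n_functions = len(scores_by_function)
    return sorted((total / n_functions, pose) for pose, total in rank_sums.items())


# Three hypothetical scoring functions that disagree on the best pose.
scores = {
    "force_field": {"A": -9.0, "B": -8.0, "C": -3.0},
    "empirical": {"A": -30.0, "B": -35.0, "C": -10.0},
    "knowledge": {"A": -120.0, "B": -100.0, "C": -150.0},
}
ranking = consensus_rank(scores)
```

Working with ranks rather than raw scores sidesteps the fact that different functions report values on unrelated scales, and no single function's outlier can dominate the outcome.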

4. COMPARING DOCKING PROTOCOLS

In 2010, over 70 docking engines are commercially or freely available to the scientific community, and new implementations are reported every month (43). Several groups (109,128,132) have attempted to compare the performances of different docking protocols, in terms of both sampling and scoring, to answer legitimate and compelling questions. Which docking tool works best? Do we really need so many implementations? Are new programs performing better than the old ones? Although no method appears to systematically outperform the others, researchers did identify combinations of sampling and scoring that perform better for a specific target or series of compounds. In this light, comparative reports can help select the docking approach most appropriate to the task at hand, and so interest in this kind of exercise remains high. However, setting up a truly fair comparison is far from simple for a number of reasons, discussed in two excellent reviews by Cole et al. (133) and, more recently, by Hawkins et al. (22). An indirect comparison based solely on the results obtained by each tool in the validation process would hardly provide robust conclusions: each tool was usually validated against a benchmark of specifically collected crystallographic complexes which only marginally and accidentally overlapped with those used for other programs. Furthermore, structures in each set were selected according to different criteria and the success rates were not estimated in a uniform way. In a direct comparison, different tools are tested against a purposely compiled set of co-crystals. Ideally, such a set should only include high quality structures to limit the chance of failure due to intrinsic biases of the crystals and not limitations of the docking approach. As mentioned, structures should also display a high level of diversity. Hartshorn and colleagues (21) reported specific guidelines for compiling a set of protein structures to validate and compare docking tools. 
According to these guidelines, they compiled a publicly available benchmark of 85 structures of pharmaceutical relevance. A fair comparison should also consider the user's knowledge of the docking protocols compared in the study as well as the user's familiarity with the systems included in the test set. Several studies reported a consistent improvement of docking accuracies when protocols were purposely adapted to the system, often deviating quite significantly from the set of default or suggested parameters (134). Similarly, the outcomes improved when the setup exploited expert knowledge of the system biology, e.g. by manually assigning the tautomeric states of histidines or by including crystallographic water molecules that greatly affect the binding event.
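Estimating success rates in a uniform way requires, at minimum, a fixed accuracy measure and a fixed threshold. The sketch below computes the heavy-atom RMSD between a docked pose and the crystallographic ligand and the fraction of complexes whose top-ranked pose falls within a cutoff. The 2.0 Å cutoff is a commonly used convention assumed here, not a value taken from the text, and the RMSD ignores molecular symmetry, which a real evaluation would need to handle.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally ordered lists of (x, y, z) coordinates."""
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    squared = sum(
        (a - b) ** 2
        for atom_a, atom_b in zip(coords_a, coords_b)
        for a, b in zip(atom_a, atom_b)
    )
    return math.sqrt(squared / len(coords_a))


def success_rate(top_pose_rmsds, cutoff=2.0):
    """Fraction of complexes whose top-ranked pose is within the cutoff."""
    hits = sum(1 for value in top_pose_rmsds if value <= cutoff)
    return hits / len(top_pose_rmsds)


# A pose shifted by 1 A along z with respect to the reference ligand.
shifted = rmsd([(0, 0, 0), (1, 0, 0)], [(0, 0, 1), (1, 0, 1)])
rate = success_rate([0.8, 1.9, 2.5, 6.0])
```

Reporting the cutoff, the atom selection, and whether only the top-ranked pose counts makes success rates from different studies far easier to compare.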

5. FUTURE PERSPECTIVES

Modern drug discovery is evolving and new approaches such as fragment-based drug design (135-136) and polypharmacology (137) are steadily becoming more and more popular. Can currently available docking schemes, developed to predict the binding mode of a lead-like molecule to the rigid binding site of a single target, efficiently assist these new paradigms? Preliminary evidence suggests that the available protocols will need significant updates and improvements but will not have to be rethought from scratch (138).

For example, during validation and fine tuning, traditional assessments of accuracy could be complemented by figures of merit based solely on ligand-receptor interactions, such as interaction fingerprints (139), which describe the behavior of fragments better than RMSD does. Scoring functions will have to be recalibrated to accurately predict the binding mode of weak binders, since fragments tend to display experimental Ki values in the high micromolar range (140). Likewise, the entropic contributions to the binding energy will have to be calculated much more accurately, since rough approximations proportional to the number of torsional degrees of freedom work acceptably for lead-like compounds but would fail dramatically in simulations involving low affinity fragments.
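An interaction-based figure of merit can be as simple as the Tanimoto similarity between the interaction fingerprint of a docked pose and that of the experimental binding mode. In the sketch below a fingerprint is represented as a set of (residue, interaction type) pairs; this representation and the residue labels are hypothetical choices made for the example.

```python
def tanimoto(fingerprint_a, fingerprint_b):
    """Tanimoto similarity between two sets of observed interactions."""
    a, b = set(fingerprint_a), set(fingerprint_b)
    if not a and not b:
        return 1.0  # two empty fingerprints are treated as identical
    return len(a & b) / len(a | b)


# Hypothetical fingerprints: the pose recovers two of three native contacts
# and adds one contact that is absent from the reference.
reference = {("ASP86", "hbond"), ("LEU83", "hbond"), ("PHE80", "lipo")}
pose = {("LEU83", "hbond"), ("PHE80", "lipo"), ("LYS33", "ionic")}
similarity = tanimoto(reference, pose)
```

Unlike RMSD, this measure does not depend on the number of ligand atoms, so a small fragment that reproduces the key contacts is rewarded even if its position is slightly shifted.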

Polypharmacology, namely the ability of a compound to modulate multiple targets at the same time, is emerging as the leading strategy for tackling complex pathologies, overcoming back-up and redundant mechanisms in a disease network (141). A truly efficient and systematic implementation of docking in multi-target strategies will require a simultaneous treatment of different binding sites, most likely exploiting procedures usually used in pharmacophore search and ligand-based strategies (142-143). The preliminary results obtained by docking simulations in the field of multi-target ligand development suggest that this technique will be a valuable tool in future research. Chronic Myelogenous Leukemia (CML), an aggressive neoplasm characterized by an unregulated overgrowth of the myeloid cells in the bone marrow, is a good example of a complex disease (144). CML is triggered by a reciprocal translocation between chromosomes 9 and 22, resulting in an aberrant chromosome known as the Philadelphia chromosome; the Philadelphia translocation creates Bcr-Abl, an oncogenic fusion gene whose product contains a constitutively active tyrosine kinase domain. The kinase activity deeply affects several cell cycle regulators, boosting the proliferation rate of myeloid cells. CML is currently treated with tyrosine kinase inhibitors such as Imatinib (145). However, the emergence of Imatinib-resistant tumor clones in patients treated with Bcr-Abl kinase inhibitors led to the development of novel molecules that could interact with several variants of Bcr-Abl or with Bcr-Abl and other targets (144). Ligand docking has been used to identify dual inhibitors of Bcr-Abl and Src, a member of a proto-oncogenic tyrosine kinase family that emerged as an ideal co-target, since it is overexpressed in leukemia cells and participates in CML development (146). Manetti et al. reported a lead discovery protocol built on a combination of molecular dynamics simulations and docking studies that led to the development of dual c-Src/Abl kinase inhibitors (147). The same enzyme combination was also targeted by a series of pyrazolopyrimidines designed using a consensus application of two different docking engines.

The application of ligand docking to multi-target ligand identification is not limited to molecules used to treat CML: in combination with pharmacophore matching, it was also a key step in identifying three dual inhibitors that target human leukotriene A4 hydrolase (LTA4H-h) and the human nonpancreatic secretory phospholipase A2 (hnps-PLA2) (148). Both enzymes are involved in arachidonic acid metabolism, and the concurrent inhibition of the same pathway at two different points is considered a promising strategy in treating inflammation. Jenwitheesuk and colleagues reported the use of docking in combination with other computational techniques to develop molecules active against the HIV-1 retrovirus and, at the same time, other pathogens responsible for opportunistic infections (149). Although the multi-target profile is limited here to very closely related enzymes or mutants of the same target, the reported studies clearly show that ligand docking can beneficially assist in the discovery of multi-target ligands.

6. CONCLUSIONS

Over the last 30 years, the reliability of docking protocols has improved constantly, to the point where the most recent implementations address quite efficiently some classic shortcomings of the technique. Several issues remain, primarily the accuracy of binding energy predictions, but important breakthroughs are expected thanks to the ever-increasing computational power of multicore CPUs (150). Presently, ligand docking plays an important role, especially in the hit-to-lead phase of drug discovery projects, where it helps rationalize SAR data and design novel decorations (151). Moreover, thanks to the variety of docking software available, personalized protocols can be devised for specific targets or specific ligand-target combinations. In summary, ligand docking is now a valuable part of almost every structure-based drug design study carried out in both academia and industry.

7. REFERENCES

1. E. Fischer: Einfluss der Configuration auf die Wirkung der Enzyme. Berichte der deutschen chemischen Gesellschaft, 27(3), 2985-2993 (1894)

2. J. Drews: Paul Ehrlich: Magister Mundi. Nat Rev Drug Discov, 3(9), 797-801 (2004)
doi:10.1038/nrd1498
PMid:15340389

3. H. Kubinyi: Strategies and recent technologies in drug discovery. Pharmazie, 50(10), 647-662 (1995)

4. A. C. Anderson: The process of structure-based drug design. Chem Biol, 10(9), 787-97 (2003)
doi:10.1016/j.chembiol.2003.09.002

5. W. L. Jorgensen: The many roles of computation in drug discovery. Science, 303(5665), 1813-8 (2004)
doi:10.1126/science.1096361
PMid:15031495

6. M. Bacilieri and S. Moro: Ligand-based drug design methodologies in drug discovery process: an overview. Curr Drug Discov Technol, 3(3), 155-65 (2006)
doi:10.2174/157016306780136781
PMid:17311561

7. A. Pozzan: Molecular descriptors and methods for ligand based virtual high throughput screening in drug discovery. Curr Pharm Des, 12(17), 2099-110 (2006)
doi:10.2174/138161206777585247
PMid:16796558

8. J. Blaney and J. Dixon: A good ligand is hard to find: Automated docking methods. Perspectives in Drug Discovery and Design, 1(2), 301-319 (1993)
doi:10.1007/BF02174531

9. W. L. Jorgensen: Efficient Drug Lead Discovery and Optimization. Accounts of Chemical Research, 42(6), 724-733 (2009)
doi:10.1021/ar800236t
PMid:19317443    PMCid:2727934

10. G. Schneider and U. Fechner: Computer-based de novo design of drug-like molecules. Nat Rev Drug Discov, 4(8), 649-63 (2005)
doi:10.1038/nrd1799
PMid:16056391

11. I. D. Kuntz, J. M. Blaney, S. J. Oatley, R. Langridge and T. E. Ferrin: A geometric approach to macromolecule-ligand interactions. J Mol Biol, 161(2), 269-88 (1982)
doi:10.1016/0022-2836(82)90153-X

12. J. J. Gray: High-resolution protein-protein docking. Current Opinion in Structural Biology, 16(2), 183-193 (2006)
doi:10.1016/j.sbi.2006.03.003
PMid:16546374

13. D. W. Ritchie: Recent progress and future directions in protein-protein docking. Curr Protein Pept Sci, 9(1), 1-15 (2008)
doi:10.2174/138920308783565741
PMid:18336319

14. E. Karaca, A. S. J. Melquiond, S. J. de Vries, P. L. Kastritis and A. M. J. J. Bonvin: Building macromolecular assemblies by information-driven docking: introducing the HADDOCK multi-body docking server. Molecular & Cellular Proteomics, - (2010)

15. S. Fulle and H. Gohlke: Molecular recognition of RNA: challenges for modelling interactions and plasticity. Journal of Molecular Recognition, 23(2), 220-231 (2010)
PMid:19941322

16. N. Brooijmans and I. D. Kuntz: Molecular recognition and docking algorithms. Annu Rev Biophys Biomol Struct, 32, 335-73 (2003)
doi:10.1146/annurev.biophys.32.110601.142532
PMid:12574069

17. A. R. Leach, B. K. Shoichet and C. E. Peishoff: Prediction of protein-ligand interactions. Docking and scoring: successes and gaps. J Med Chem, 49(20), 5851-5 (2006)
doi:10.1021/jm060999m
PMid:17004700

18. K. L. Damm and H. A. Carlson: Exploring experimental sources of multiple protein conformations in structure-based drug design. J Am Chem Soc, 129(26), 8225-35 (2007)
doi:10.1021/ja0709728
PMid:17555316

19. H. M. Berman, J. Westbrook, Z. Feng, G. Gilliland, T. N. Bhat, H. Weissig, I. N. Shindyalov and P. E. Bourne: The Protein Data Bank. Nucleic Acids Res, 28(1), 235-42 (2000)
doi:10.1093/nar/28.1.235
PMid:10592235    PMCid:102472

20. R. Abagyan and I. Kufareva: The flexible pocketome engine for structural chemogenomics. Methods Mol Biol, 575, 249-79 (2009)
doi:10.1007/978-1-60761-274-2_11
PMid:19727619

21. M. J. Hartshorn, M. L. Verdonk, G. Chessari, S. C. Brewerton, W. T. Mooij, P. N. Mortenson and C. W. Murray: Diverse, high-quality test set for the validation of protein-ligand docking performance. J Med Chem, 50(4), 726-41 (2007)
doi:10.1021/jm061277y
PMid:17300160

22. P. C. Hawkins, G. L. Warren, A. G. Skillman and A. Nicholls: How to do an evaluation: pitfalls and traps. J Comput Aided Mol Des, 22(3-4), 179-90 (2008)
doi:10.1007/s10822-007-9166-3
PMid:18217218    PMCid:2270916

23. S. Y. Huang and X. Zou: Efficient molecular docking of NMR structures: application to HIV-1 protease. Protein Sci, 16(1), 43-51 (2007)
doi:10.1110/ps.062501507
PMid:17123961    PMCid:2222846

24. R. M. Knegtel, I. D. Kuntz and C. M. Oshiro: Molecular docking to ensembles of protein structures. J Mol Biol, 266(2), 424-40 (1997)
doi:10.1006/jmbi.1996.0776
PMid:9047373

25. H. Fan, J. J. Irwin, B. M. Webb, G. Klebe, B. K. Shoichet and A. Sali: Molecular docking screens using comparative models of proteins. J Chem Inf Model, 49(11), 2512-27 (2009)
doi:10.1021/ci9003706
PMid:19845314    PMCid:2790034

26. S. L. McGovern and B. K. Shoichet: Information decay in molecular docking screens against holo, apo, and modeled conformations of enzymes. J Med Chem, 46(14), 2895-907 (2003)
doi:10.1021/jm0300330
PMid:12825931

27. N. Eswar, A. Sali, B. T. John and J. T. David: Comparative Modeling of Drug Target Proteins. In: Comprehensive Medicinal Chemistry II. Elsevier, Oxford (2007)
doi:10.1016/B0-08-045044-X/00251-0

28. P. Ferrara and E. Jacoby: Evaluation of the utility of homology models in high throughput docking. Journal of Molecular Modeling, 13(8), 897-905 (2007)
doi:10.1007/s00894-007-0207-6
PMid:17487515

29. M. Jacobson, A. Sali and A. M. Doherty: Comparative Protein Structure Modeling and its Applications to Drug Discovery. In: Annual Reports in Medicinal Chemistry. Academic Press, (2004)

30. D. B. Kitchen, H. Decornez, J. R. Furr and J. Bajorath: Docking and scoring in virtual screening for drug discovery: methods and applications. Nat Rev Drug Discov, 3(11), 935-49 (2004)
doi:10.1038/nrd1549
PMid:15520816

31. M. L. Connolly: Solvent-accessible surfaces of proteins and nucleic acids. Science, 221(4612), 709-13 (1983)
doi:10.1126/science.6879170
PMid:6879170

32. B. Lee and F. M. Richards: The interpretation of protein structures: estimation of static accessibility. J Mol Biol, 55(3), 379-400 (1971)
doi:10.1016/0022-2836(71)90324-X

33. I. Halperin, B. Ma, H. Wolfson and R. Nussinov: Principles of docking: An overview of search algorithms and a guide to scoring functions. Proteins, 47(4), 409-43 (2002)
doi:10.1002/prot.10115
PMid:12001221

34. P. J. Goodford: A computational procedure for determining energetically favorable binding sites on biologically important macromolecules. J Med Chem, 28(7), 849-57 (1985)
doi:10.1021/jm00145a002
PMid:3892003

35. A. T. Laurie and R. M. Jackson: Methods for the prediction of protein-ligand binding sites for structure-based drug design and virtual ligand screening. Curr Protein Pept Sci, 7(5), 395-406 (2006)
doi:10.2174/138920306778559386

36. G. Bottegoni, I. Kufareva, M. Totrov and R. Abagyan: A new method for ligand docking to flexible receptors by dual alanine scanning and refinement (SCARE). J Comput Aided Mol Des, 22(5), 311-25 (2008)
doi:10.1007/s10822-008-9188-5
PMid:18273556    PMCid:2641994

37. C. N. Cavasotto and R. A. Abagyan: Protein flexibility in ligand docking and virtual screening to protein kinases. J Mol Biol, 337(1), 209-25 (2004)
doi:10.1016/j.jmb.2004.01.003
PMid:15001363

38. M. L. Verdonk, G. Chessari, J. C. Cole, M. J. Hartshorn, C. W. Murray, J. W. Nissink, R. D. Taylor and R. Taylor: Modeling water molecules in protein-ligand docking using GOLD. J Med Chem, 48(20), 6504-15 (2005)
doi:10.1021/jm050543p
PMid:16190776

39. C. R. Corbeil and N. Moitessier: Docking ligands into flexible and solvated macromolecules. 3. Impact of input ligand conformation, protein flexibility, and water molecules on the accuracy of docking programs. J Chem Inf Model, 49(4), 997-1009 (2009)
doi:10.1021/ci8004176
PMid:19391631

40. E. Perola and P. S. Charifson: Conformational Analysis of Drug-Like Molecules Bound to Proteins: An Extensive Study of Ligand Reorganization upon Binding. Journal of Medicinal Chemistry, 47(10), 2499-2510 (2004)
doi:10.1021/jm030563w
PMid:15115393

41. W. Welch, J. Ruppert and A. N. Jain: Hammerhead: fast, fully automated docking of flexible ligands to protein binding sites. Chemistry & Biology, 3(6), 449-462 (1996)
doi:10.1016/S1074-5521(96)90093-9

42. R. D. Taylor, P. J. Jewsbury and J. W. Essex: A review of protein-small molecule docking methods. J Comput Aided Mol Des, 16(3), 151-66 (2002)
doi:10.1023/A:1020155510718
PMid:12363215

43. N. Moitessier, P. Englebienne, D. Lee, J. Lawandi and C. R. Corbeil: Towards the development of universal, fast and highly accurate docking/scoring methods: a long way to go. Br J Pharmacol, 153 Suppl 1, S7-26 (2008)
doi:10.1038/sj.bjp.0707515
PMid:18037925    PMCid:2268060

44. T. J. Ewing, S. Makino, A. G. Skillman and I. D. Kuntz: DOCK 4.0: search strategies for automated molecular docking of flexible molecule databases. J Comput Aided Mol Des, 15(5), 411-28 (2001)
doi:10.1023/A:1011115820450
PMid:11394736

45. M. Rarey, B. Kramer, T. Lengauer and G. Klebe: A fast flexible docking method using an incremental construction algorithm. J Mol Biol, 261(3), 470-89 (1996)
doi:10.1006/jmbi.1996.0477
PMid:8780787

46. A. N. Jain: Surflex-Dock 2.1: robust performance from ligand energetic modeling, ring flexibility, and knowledge-based search. J Comput Aided Mol Des, 21(5), 281-306 (2007)
doi:10.1007/s10822-007-9114-2
PMid:17387436

47. A. N. Jain: Surflex: fully automatic flexible molecular docking using a molecular similarity-based search engine. J Med Chem, 46(4), 499-511 (2003)
doi:10.1021/jm020406h
PMid:12570372

48. R. Abagyan and M. Totrov: High-throughput docking for lead generation. Curr Opin Chem Biol, 5(4), 375-82 (2001)
doi:10.1016/S1367-5931(00)00217-9

49. P. Willett: Genetic algorithms in molecular recognition and design. Trends Biotechnol, 13(12), 516-21 (1995)
doi:10.1016/S0167-7799(00)89015-0

50. G. M. Morris, D. S. Goodsell, R. S. Halliday, R. Huey, W. E. Hart, R. K. Belew and A. J. Olson: Automated docking using a Lamarckian genetic algorithm and an empirical binding free energy function. Journal of Computational Chemistry, 19(14), 1639-1662 (1998)
doi:10.1002/(SICI)1096-987X(19981115)19:14<1639::AID-JCC10>3.0.CO;2-B

51. G. Jones, P. Willett, R. C. Glen, A. R. Leach and R. Taylor: Development and validation of a genetic algorithm for flexible docking. Journal of Molecular Biology, 267(3), 727-748 (1997)
doi:10.1006/jmbi.1996.0897
PMid:9126849

52. N. Metropolis, A. W. Rosenbluth, M. N. Rosenbluth, A. H. Teller and E. Teller: Equation of State Calculations by Fast Computing Machines. The Journal of Chemical Physics, 21(6), 1087-1092 (1953)
doi:10.1063/1.1699114

53. R. Abagyan, M. Totrov and D. Kuznetsov: ICM - a New Method for Protein Modeling and Design - Applications to Docking and Structure Prediction from the Distorted Native Conformation. Journal of Computational Chemistry, 15(5), 488-506 (1994)
doi:10.1002/jcc.540150503

54. R. Abagyan and M. Totrov: Biased Probability Monte-Carlo Conformational Searches and Electrostatic Calculations for Peptides and Proteins. Journal of Molecular Biology, 235(3), 983-1002 (1994)
doi:10.1006/jmbi.1994.1052
PMid:8289329

55. R. A. Friesner, J. L. Banks, R. B. Murphy, T. A. Halgren, J. J. Klicic, D. T. Mainz, M. P. Repasky, E. H. Knoll, M. Shelley, J. K. Perry, D. E. Shaw, P. Francis and P. S. Shenkin: Glide: a new approach for rapid, accurate docking and scoring. 1. Method and assessment of docking accuracy. J Med Chem, 47(7), 1739-49 (2004)
doi:10.1021/jm0306430
PMid:15027865

56. L. C. Roisman, J. Piehler, J. Y. Trosset, H. A. Scheraga and G. Schreiber: Structure of the interferon-receptor complex determined by distance constraints from double-mutant cycles and flexible docking. Proc Natl Acad Sci U S A, 98(23), 13231-6 (2001)
doi:10.1073/pnas.221290398
PMid:11698684    PMCid:60853

57. M. Liu and S. Wang: MCDOCK: a Monte Carlo simulation approach to the molecular docking problem. J Comput Aided Mol Des, 13(5), 435-51 (1999)
doi:10.1023/A:1008005918983
PMid:10483527

58. A. Cavalli, G. Bottegoni, C. Raco, M. De Vivo and M. Recanatini: A computational study of the binding of propidium to the peripheral anionic site of human acetylcholinesterase. J Med Chem, 47(16), 3991-9 (2004)
doi:10.1021/jm040787u
PMid:15267237

59. A. Di Nola, D. Roccatano and H. J. Berendsen: Molecular dynamics simulation of the docking of substrates to proteins. Proteins, 19(3), 174-82 (1994)
doi:10.1002/prot.340190303
PMid:7937732

60. N. Nakajima, J. Higo, A. Kidera and H. Nakamura: Flexible docking of a ligand peptide to a receptor protein by multicanonical molecular dynamics simulation. Chemical Physics Letters, 278(4-6), 297-301 (1997)
doi:10.1016/S0009-2614(97)01074-9

61. F. L. Gervasio, A. Laio and M. Parrinello: Flexible docking in solution using metadynamics. J Am Chem Soc, 127(8), 2600-7 (2005)
doi:10.1021/ja0445950
PMid:15725015

62. M. Masetti, A. Cavalli, M. Recanatini and F. L. Gervasio: Exploring complex protein-ligand recognition mechanisms with coarse metadynamics. J Phys Chem B, 113(14), 4807-16 (2009)
doi:10.1021/jp803936q
PMid:19298042

63. D. E. Koshland: Application of a Theory of Enzyme Specificity to Protein Synthesis. Proc Natl Acad Sci U S A, 44(2), 98-104 (1958)
doi:10.1073/pnas.44.2.98

64. D. D. Boehr, R. Nussinov and P. E. Wright: The role of dynamic conformational ensembles in biomolecular recognition. Nat Chem Biol, 5(11), 789-96 (2009)
doi:10.1038/nchembio.232
PMid:19841628    PMCid:2916928

65. H. A. Carlson and J. A. McCammon: Accommodating protein flexibility in computational drug design. Mol Pharmacol, 57(2), 213-8 (2000)
PMid:10648630

66. F. Osterberg, G. M. Morris, M. F. Sanner, A. J. Olson and D. S. Goodsell: Automated docking to multiple target structures: incorporation of protein mobility and structural water heterogeneity in AutoDock. Proteins, 46(1), 34-40 (2002)
doi:10.1002/prot.10028
PMid:11746701

67. M. Totrov and R. Abagyan: Flexible ligand docking to multiple receptor conformations: a practical alternative. Curr Opin Struct Biol, 18(2), 178-84 (2008)
PMid:18302984    PMCid:2396190

68. F. Jiang and S. H. Kim: "Soft docking": matching of molecular surface cubes. J Mol Biol, 219(1), 79-102 (1991)
doi:10.1016/0022-2836(91)90859-5

69. S. J. Teague: Implications of protein flexibility for drug discovery. Nat Rev Drug Discov, 2(7), 527-41 (2003)
doi:10.1038/nrd1129
PMid:12838268

70. M. Totrov and R. Abagyan: Flexible protein-ligand docking by global energy optimization in internal coordinates. Proteins, Suppl 1, 215-20 (1997)

71. A. R. Leach: Ligand docking to proteins with discrete side-chain flexibility. J Mol Biol, 235(1), 345-56 (1994)
doi:10.1016/S0022-2836(05)80038-5

72. E. Althaus, O. Kohlbacher, H. P. Lenhof and P. Muller: A combinatorial approach to protein docking with flexible side chains. J Comput Biol, 9(4), 597-612 (2002)
doi:10.1089/106652702760277336
PMid:12323095

73. L. Schaffer and G. M. Verkhivker: Predicting structural effects in HIV-1 protease mutant complexes with flexible ligand docking and protein side-chain optimization. Proteins, 33(2), 295-310 (1998)
doi:10.1002/(SICI)1097-0134(19981101)33:2<295::AID-PROT12>3.0.CO;2-F

74. M. D. Daily, D. Masica, A. Sivasubramanian, S. Somarouthu and J. J. Gray: CAPRI rounds 3-5 reveal promising successes and future challenges for RosettaDock. Proteins, 60(2), 181-6 (2005)
doi:10.1002/prot.20555
PMid:15981262

75. J. Meiler and D. Baker: ROSETTALIGAND: protein-small molecule docking with full side-chain flexibility. Proteins, 65(3), 538-48 (2006)
doi:10.1002/prot.21086
PMid:16972285

76. X. Barril and S. D. Morley: Unveiling the Full Potential of Flexible Receptor Docking Using Multiple Crystallographic Structures. J Med Chem, 48(13), 4432-4443 (2005)
doi:10.1021/jm048972v
PMid:15974595

77. R. M. A. Knegtel, I. D. Kuntz and C. M. Oshiro: Molecular docking to ensembles of protein structures. Journal of Molecular Biology, 266(2), 424-440 (1997)
doi:10.1006/jmbi.1996.0776
PMid:9047373

78. B. K. Shoichet, D. L. Bodian and I. D. Kuntz: Molecular docking using shape descriptors. Journal of Computational Chemistry, 13(3), 380-397 (1992)
doi:10.1002/jcc.540130311

79. S. Y. Huang and X. Zou: Ensemble docking of multiple protein structures: considering protein structural variations in molecular docking. Proteins, 66(2), 399-421 (2007)
doi:10.1002/prot.21214
PMid:17096427

80. T. J. A. Ewing, S. Makino, A. G. Skillman and I. D. Kuntz: DOCK 4.0: Search strategies for automated molecular docking of flexible molecule databases. Journal of Computer-Aided Molecular Design, 15(5), 411-428 (2001)
doi:10.1023/A:1011115820450
PMid:11394736

81. G. M. Morris, D. S. Goodsell, R. S. Halliday, R. Huey, W. E. Hart, R. K. Belew and A. J. Olson: Automated docking using a Lamarckian genetic algorithm and an empirical binding free energy function. Journal of Computational Chemistry, 19(14), 1639-1662 (1998)
doi:10.1002/(SICI)1096-987X(19981115)19:14<1639::AID-JCC10>3.0.CO;2-B

82. T. Polgar and G. M. Keseru: Ensemble docking into flexible active sites. Critical evaluation of FlexE against JNK-3 and beta-secretase. J Chem Inf Model, 46(4), 1795-805 (2006)
doi:10.1021/ci050412x
PMid:16859311

83. H. Claussen, C. Buning, M. Rarey and T. Lengauer: FlexE: efficient molecular docking considering protein structure variations. J Mol Biol, 308(2), 377-95 (2001)
doi:10.1006/jmbi.2001.4551
PMid:11327774

84. S. Rao, P. C. Sanschagrin, J. R. Greenwood, M. P. Repasky, W. Sherman and R. Farid: Improving database enrichment through ensemble docking. J Comput Aided Mol Des, 22(9), 621-7 (2008)
doi:10.1007/s10822-008-9182-y
PMid:18253700

85. M. Rueda, G. Bottegoni and R. Abagyan: Consistent improvement of cross-docking results using binding site ensembles generated with elastic network normal modes. J Chem Inf Model, 49(3), 716-25 (2009)
doi:10.1021/ci8003732
PMid:19434904    PMCid:2891173

86. G. Bottegoni, I. Kufareva, M. Totrov and R. Abagyan: Four-dimensional docking: a fast and accurate account of discrete receptor flexibility in ligand docking. J Med Chem, 52(2), 397-406 (2009)
doi:10.1021/jm8009958
PMid:19090659    PMCid:2662720

87. C. R. Corbeil, P. Englebienne and N. Moitessier: Docking ligands into flexible and solvated macromolecules. 1. Development and validation of FITTED 1.0. J Chem Inf Model, 47(2), 435-49 (2007)
doi:10.1021/ci6002637
PMid:17305329

88. Y. Zhao and M. F. Sanner: FLIPDock: docking flexible ligands into flexible receptors. Proteins, 68(3), 726-37 (2007)
doi:10.1002/prot.21423
PMid:17523154

89. W. Sherman, T. Day, M. P. Jacobson, R. A. Friesner and R. Farid: Novel procedure for modeling ligand/receptor induced fit effects. J Med Chem, 49(2), 534-53 (2006)
doi:10.1021/jm050540c
PMid:16420040

90. J. H. Lin, A. L. Perryman, J. R. Schames and J. A. McCammon: Computational drug design accommodating receptor flexibility: the relaxed complex scheme. J Am Chem Soc, 124(20), 5632-3 (2002)
doi:10.1021/ja0260162
PMid:12010024

91. R. E. Amaro, R. Baron and J. A. McCammon: An improved relaxed complex scheme for receptor flexibility in computer-aided drug design. J Comput Aided Mol Des, 22(9), 693-705 (2008)
doi:10.1007/s10822-007-9159-2
PMid:18196463    PMCid:2516539

92. G. M. Morris, R. Huey, W. Lindstrom, M. F. Sanner, R. K. Belew, D. S. Goodsell and A. J. Olson: AutoDock4 and AutoDockTools4: Automated docking with selective receptor flexibility. J Comput Chem, 30(16), 2785-91 (2009)
doi:10.1002/jcc.21256
PMid:19399780

93. P. A. Kollman, I. Massova, C. Reyes, B. Kuhn, S. H. Huo, L. Chong, M. Lee, T. Lee, Y. Duan, W. Wang, O. Donini, P. Cieplak, J. Srinivasan, D. A. Case and T. E. Cheatham: Calculating structures and free energies of complex molecules: Combining molecular mechanics and continuum models. Accounts of Chemical Research, 33(12), 889-897 (2000)
doi:10.1021/ar000033j
PMid:11123888

94. D. C. Thompson, C. Humblet and D. Joseph-McCarthy: Investigation of MM-PBSA rescoring of docking poses. Journal of Chemical Information and Modeling, 48(5), 1081-1091 (2008)
doi:10.1021/ci700470c
PMid:18465849

95. M. S. Lee and M. A. Olson: Calculation of absolute protein-ligand binding affinity using path and endpoint approaches. Biophysical Journal, 90(3), 864-877 (2006)
doi:10.1529/biophysj.105.071589
PMid:16284269    PMCid:1367111

96. J. M. J. Swanson, R. Henchman and J. A. McCammon: Revisiting free energy calculations: One step closer to rigorous scoring functions and one step beyond MM/PBSA. Abstracts of Papers of the American Chemical Society, 227, U903-U904 (2004)

97. J. R. Schames, R. H. Henchman, J. S. Siegel, C. A. Sotriffer, H. Ni and J. A. McCammon: Discovery of a novel binding trench in HIV integrase. J Med Chem, 47(8), 1879-81 (2004)
doi:10.1021/jm0341913
PMid:15055986

98. A. Babakhani, T. T. Talley, P. Taylor and J. A. McCammon: A virtual screening study of the acetylcholine binding protein using a relaxed-complex approach. Comput Biol Chem, 33(2), 160-70 (2009)
doi:10.1016/j.compbiolchem.2008.12.002
PMid:19186108    PMCid:2684879

99. J. D. Durrant, C. A. F. de Oliveira and J. A. McCammon: Including receptor flexibility and induced fit effects into the design of MMP-2 inhibitors. Journal of Molecular Recognition, 23(2), 173-182 (2010)
PMid:19882751    PMCid:2950069

100. S. A. Hassan, L. Gracia, G. Vasudevan and P. J. Steinbach: Computer simulation of protein-ligand interactions: challenges and applications. Methods Mol Biol, 305, 451-92 (2005)
PMid:15940011

101. H. Gohlke and G. Klebe: Approaches to the description and prediction of the binding affinity of small-molecule ligands to macromolecular receptors. Angew Chem Int Ed Engl, 41(15), 2644-76 (2002)
doi:10.1002/1521-3773(20020802)41:15<2644::AID-ANIE2644>3.0.CO;2-O

102. M. K. Gilson and H. X. Zhou: Calculation of protein-ligand binding affinities. Annu Rev Biophys Biomol Struct, 36, 21-42 (2007)
doi:10.1146/annurev.biophys.36.040306.132550
PMid:17201676

103. P. Kollman: Free energy calculations: Applications to chemical and biochemical phenomena. Chemical Reviews, 93(7), 2395-2417 (1993)
doi:10.1021/cr00023a004

104. J. G. Kirkwood: Statistical Mechanics of Fluid Mixtures. The Journal of Chemical Physics, 3(5), 300-313 (1935)
doi:10.1063/1.1749657

105. W. L. Jorgensen: Efficient drug lead discovery and optimization. Acc Chem Res, 42(6), 724-33 (2009)
doi:10.1021/ar800236t
PMid:19317443    PMCid:2727934

106. A. Laio and M. Parrinello: Escaping free-energy minima. Proc Natl Acad Sci U S A, 99(20), 12562-6 (2002)
doi:10.1073/pnas.202427399
PMid:12271136    PMCid:130499

107. J. Aqvist, C. Medina and J. E. Samuelsson: New Method for Predicting Binding-Affinity in Computer-Aided Drug Design. Protein Engineering, 7(3), 385-391 (1994)
doi:10.1093/protein/7.3.385

108. T. Simonson, G. Archontis and M. Karplus: Free energy simulations come of age: Protein-ligand recognition. Accounts of Chemical Research, 35(6), 430-437 (2002)
doi:10.1021/ar010030m
PMid:12069628

109. P. Ferrara, H. Gohlke, D. J. Price, G. Klebe and C. L. Brooks, 3rd: Assessing scoring functions for protein-ligand interactions. J Med Chem, 47(12), 3032-47 (2004)
doi:10.1021/jm030489h
PMid:15163185

110. P. Cieplak, J. Caldwell and P. Kollman: Molecular mechanical models for organic and biological systems going beyond the atom centered two body additive approximation: Aqueous solution free energies of methanol and N-methyl acetamide, nucleic acid base, and amide hydrogen bonding and chloroform/water partition coefficients of the nucleic acid bases. Journal of Computational Chemistry, 22(10), 1048-1057 (2001)
doi:10.1002/jcc.1065

111. B. Kramer, M. Rarey and T. Lengauer: Evaluation of the FLEXX incremental construction algorithm for protein-ligand docking. Proteins, 37(2), 228-41 (1999)
doi:10.1002/(SICI)1097-0134(19991101)37:2<228::AID-PROT8>3.0.CO;2-8

112. N. Gresh: Development, validation, and applications of anisotropic polarizable molecular mechanics to study ligand and drug-receptor interactions. Curr Pharm Des, 12(17), 2121-58 (2006)
doi:10.2174/138161206777585256
PMid:16796560

113. M. Totrov and R. Abagyan: Derivation of sensitive discrimination potential for virtual screening. In: RECOMB '99. Proceedings of the Third Annual International Conference on Computational Molecular Biology. ACM Press - New York, Lyon (France) (1999)

114. M. D. Eldridge, C. W. Murray, T. R. Auton, G. V. Paolini and R. P. Mee: Empirical scoring functions: I. The development of a fast empirical scoring function to estimate the binding affinity of ligands in receptor complexes. J Comput Aided Mol Des, 11(5), 425-45 (1997)
doi:10.1023/A:1007996124545
PMid:9385547

115. P. Cozzini, M. Fornabaio, A. Marabotti, D. J. Abraham, G. E. Kellogg and A. Mozzarelli: Simple, intuitive calculations of free energy of binding for protein-ligand complexes. 1. Models without explicit constrained water. J Med Chem, 45(12), 2469-83 (2002)
doi:10.1021/jm0200299
PMid:12036355

116. R. D. Head, M. L. Smythe, T. I. Oprea, C. L. Waller, S. M. Green and G. R. Marshall: VALIDATE: A New Method for the Receptor-Based Prediction of Binding Affinities of Novel Ligands. Journal of the American Chemical Society, 118(16), 3959-3969 (1996)
doi:10.1021/ja9539002

117. H. J. Bohm: LUDI: rule-based automatic design of new substituents for enzyme inhibitor leads. J Comput Aided Mol Des, 6(6), 593-606 (1992)
doi:10.1007/BF00126217

118. G. Klebe and T. Mietzner: A fast and efficient method to generate biologically relevant conformations. J Comput Aided Mol Des, 8(5), 583-606 (1994)
doi:10.1007/BF00123667
PMid:7876902

119. I. Muegge and Y. C. Martin: A general and fast scoring function for protein-ligand interactions: a simplified potential approach. J Med Chem, 42(5), 791-804 (1999)
doi:10.1021/jm980536j
PMid:10072678

120. H. Gohlke and G. Klebe: Statistical potentials and scoring functions applied to protein-ligand binding. Curr Opin Struct Biol, 11(2), 231-5 (2001)
doi:10.1016/S0959-440X(00)00195-0

121. A. V. Ishchenko and E. I. Shakhnovich: SMall Molecule Growth 2001 (SMoG2001): an improved knowledge-based scoring function for protein-ligand interactions. J Med Chem, 45(13), 2770-80 (2002)
doi:10.1021/jm0105833
PMid:12061879

122. C. Y. Yang, R. Wang and S. Wang: M-score: a knowledge-based potential scoring function accounting for protein atom mobility. J Med Chem, 49(20), 5903-11 (2006)
doi:10.1021/jm050043w
PMid:17004706

123. P. S. Charifson, J. J. Corkery, M. A. Murcko and W. P. Walters: Consensus scoring: A method for obtaining improved hit rates from docking databases of three-dimensional structures into proteins. J Med Chem, 42(25), 5100-9 (1999)
doi:10.1021/jm990352k
PMid:10602695

124. N. Paul and D. Rognan: ConsDock: A new program for the consensus analysis of protein-ligand interactions. Proteins, 47(4), 521-33 (2002)
doi:10.1002/prot.10119
PMid:12001231

125. R. Wang and S. Wang: How does consensus scoring work for virtual library screening? An idealized computer experiment. J Chem Inf Comput Sci, 41(5), 1422-6 (2001)
doi:10.1021/ci010025x

126. G. Bottegoni, W. Rocchia, M. Recanatini and A. Cavalli: AClAP, Autonomous hierarchical agglomerative Cluster Analysis based protocol to partition conformational datasets. Bioinformatics, 22(14), e58-65 (2006)
doi:10.1093/bioinformatics/btl212
PMid:16873522

127. G. Bottegoni, A. Cavalli and M. Recanatini: A comparative study on the application of hierarchical-agglomerative clustering approaches to organize outputs of reiterated docking runs. J Chem Inf Model, 46(2), 852-62 (2006)
doi:10.1021/ci050141q
PMid:16563017

128. C. Bissantz, G. Folkers and D. Rognan: Protein-based virtual screening of chemical databases. 1. Evaluation of different docking/scoring combinations. J Med Chem, 43(25), 4759-67 (2000)
doi:10.1021/jm001044l
PMid:11123984

129. B. D. Bursulaya, M. Totrov, R. Abagyan and C. L. Brooks, 3rd: Comparative study of several algorithms for flexible ligand docking. J Comput Aided Mol Des, 17(11), 755-63 (2003)
doi:10.1023/B:JCAM.0000017496.76572.6f
PMid:15072435

130. T. Cheng, X. Li, Y. Li, Z. Liu and R. Wang: Comparative assessment of scoring functions on a diverse test set. J Chem Inf Model, 49(4), 1079-93 (2009)
doi:10.1021/ci9000053
PMid:19358517

131. J. B. Cross, D. C. Thompson, B. K. Rai, J. C. Baber, K. Y. Fan, Y. Hu and C. Humblet: Comparison of several molecular docking programs: pose prediction and virtual screening accuracy. J Chem Inf Model, 49(6), 1455-74 (2009)
doi:10.1021/ci900056c
PMid:19476350

132. M. Kontoyianni, L. M. McClellan and G. S. Sokol: Evaluation of docking performance: comparative data on docking algorithms. J Med Chem, 47(3), 558-65 (2004)
doi:10.1021/jm0302997
PMid:14736237

133. J. C. Cole, C. W. Murray, J. W. Nissink, R. D. Taylor and R. Taylor: Comparing protein-ligand docking programs is difficult. Proteins, 60(3), 325-32 (2005)
doi:10.1002/prot.20497
PMid:15937897

134. G. L. Warren, C. W. Andrews, A. M. Capelli, B. Clarke, J. LaLonde, M. H. Lambert, M. Lindvall, N. Nevins, S. F. Semus, S. Senger, G. Tedesco, I. D. Wall, J. M. Woolven, C. E. Peishoff and M. S. Head: A critical assessment of docking programs and scoring functions. J Med Chem, 49(20), 5912-31 (2006)
doi:10.1021/jm050362n
PMid:17004707

135. M. Congreve, G. Chessari, D. Tisi and A. J. Woodhead: Recent developments in fragment-based drug discovery. J Med Chem, 51(13), 3661-80 (2008)
doi:10.1021/jm8000373
PMid:18457385

136. R. Morphy and Z. Rankovic: Fragments, network biology and designing multiple ligands. Drug Discov Today, 12(3-4), 156-60 (2007)
doi:10.1016/j.drudis.2006.12.006
PMid:17275736

137. J. D. Durrant, R. E. Amaro, L. Xie, M. D. Urbaniak, M. A. Ferguson, A. Haapalainen, Z. Chen, A. M. Di Guilmi, F. Wunder, P. E. Bourne and J. A. McCammon: A multidimensional strategy to detect polypharmacological targets in the absence of structural and sequence homology. PLoS Comput Biol, 6(1), e1000648 (2010)
doi:10.1371/journal.pcbi.1000648
PMid:20098496    PMCid:2799658

138. Y. Chen and B. K. Shoichet: Molecular docking and ligand specificity in fragment-based inhibitor discovery. Nat Chem Biol, 5(5), 358-64 (2009)
doi:10.1038/nchembio.155
PMid:19305397

139. G. Marcou and D. Rognan: Optimizing fragment and scaffold docking by use of molecular interaction fingerprints. J Chem Inf Model, 47(1), 195-207 (2007)
doi:10.1021/ci600342e
PMid:17238265

140. R. E. Hubbard, I. Chen and B. Davis: Informatics and modeling challenges in fragment-based drug discovery. Curr Opin Drug Discov Devel, 10(3), 289-97 (2007)
PMid:17554855

141. R. Morphy, C. Kay and Z. Rankovic: From magic bullets to designed multiple ligands. Drug Discov Today, 9(15), 641-51 (2004)
doi:10.1016/S1359-6446(04)03163-0

142. A. V. Grigoryan, I. Kufareva, M. Totrov and R. A. Abagyan: Spatial chemical distance based on atomic property fields. J Comput Aided Mol Des (2010)

143. M. Totrov: Atomic property fields: generalized 3D pharmacophoric potential for automated ligand superposition, pharmacophore elucidation and 3D QSAR. Chem Biol Drug Des, 71(1), 15-27 (2008)
doi:10.1111/j.1747-0285.2007.00605.x

144. J. Gora-Tybor and T. Robak: Targeted drugs in chronic myeloid leukemia. Curr Med Chem, 15(29), 3036-51 (2008)
doi:10.2174/092986708786848578
PMid:19075651

145. F. J. Giles, J. E. Cortes and H. M. Kantarjian: Targeting the kinase activity of the BCR-ABL fusion protein in patients with chronic myeloid leukemia. Curr Mol Med, 5(7), 615-23 (2005)
doi:10.2174/156652405774641115
PMid:16305488

146. S. Schenone, F. Manetti and M. Botta: Last findings on dual inhibitors of abl and SRC tyrosine-kinases. Mini Rev Med Chem, 7(2), 191-201 (2007)
doi:10.2174/138955707779802598
PMid:17305593

147. F. Manetti, G. A. Locatelli, G. Maga, S. Schenone, M. Modugno, S. Forli, F. Corelli and M. Botta: A combination of docking/dynamics simulations and pharmacophoric modeling to discover new dual c-Src/Abl kinase inhibitors. J Med Chem, 49(11), 3278-86 (2006)
doi:10.1021/jm060236z
PMid:16722646

148. D. Wei, X. Jiang, L. Zhou, J. Chen, Z. Chen, C. He, K. Yang, Y. Liu, J. Pei and L. Lai: Discovery of Multitarget Inhibitors by Combining Molecular Docking with Common Pharmacophore Matching. Journal of Medicinal Chemistry, 51(24), 7882-7888 (2008)
doi:10.1021/jm8010096
PMid:19090779

149. E. Jenwitheesuk, J. A. Horst, K. L. Rivas, W. C. Van Voorhis and R. Samudrala: Novel paradigms for drug discovery: computational multitarget screening. Trends in Pharmacological Sciences, 29(2), 62-71 (2008)
doi:10.1016/j.tips.2007.11.007
PMid:18190973

150. G. M. Verkhivker, D. Bouzida, D. K. Gehlhaar, P. A. Rejto, S. Arthurs, A. B. Colson, S. T. Freer, V. Larson, B. A. Luty, T. Marrone and P. W. Rose: Deciphering common failures in molecular docking of ligand-protein complexes. J Comput Aided Mol Des, 14(8), 731-51 (2000)
doi:10.1023/A:1008158231558
PMid:11131967

151. B. K. Shoichet, S. L. McGovern, B. Wei and J. J. Irwin: Lead discovery using molecular docking. Curr Opin Chem Biol, 6(4), 439-46 (2002)
doi:10.1016/S1367-5931(02)00339-3

Abbreviations: CADD: Computer-Assisted Drug Design; LBDD: Ligand-Based Drug Design; SBDD: Structure-Based Drug Design; PDB: Protein Data Bank; GA: Genetic Algorithm; MC: Monte Carlo; MD: Molecular Dynamics; MRC: Multiple Receptor Conformations; RCM: Relaxed Complex Scheme; MM-PBSA: Molecular Mechanics Poisson-Boltzmann Surface Area; RMSD: Root Mean Square Deviation; PMF: Potential of Mean Force.

Key Words: Docking, structure-based drug design, receptor, ligand docking, protein docking, ensemble docking, receptor flexibility, cross docking, MRC, scoring function, consensus scoring, review.

Send correspondence to: Giovanni Bottegoni, Dept. of Drug Discovery and Development, Istituto Italiano di Tecnologia, via Morego n.30, 16163 Genova, Italy, Tel: 39 010 71781522, Fax: 39 010 7170187, E-mail: giovanni.bottegoni@iit.it