Prepared first by
Dr. Didier Rognan
Adapted by Dr. G. Schaftenaar
Aim : Virtual screening of a chemical database to identify potential hits
Target : Estrogen receptor (ERa)
A number of X-ray structures of the ligand binding domain of ER-a are available
(at least two containing antagonists: raloxifene, and 4-hydroxy-tamoxifen).
Database : 32 small molecular weight molecules
- 25 random molecules (from the National Cancer Institute database)
- 6 known ERa antagonists (raloxifene, 4-OH-tamoxifen, ICI-164384, nafoxidene, LY-326315, EM-343)
- dihydroxy-tamoxifen (a ligand designed in analogy to 4-OH-tamoxifen and estradiol, the natural ligand)
Method : Flexible docking of the 32 ligands using the SYBYL/FlexX interface.
For more details on FlexX, see :
Rarey, M.; Kramer, B.; Lengauer, T.; Klebe, G. J. Mol. Biol. 1996, 261, 470-89.
Rarey, M.; Kramer, B.; Lengauer, T. Proteins 1999, 34, 17-28.
Briefly, FlexX is a flexible docking tool that uses an incremental construction algorithm: it first
places a base fragment in the active site and then adds the peripheral fragments one by one, choosing the
most favorable torsion angles (intramolecular energy) and protein-ligand interactions.
- Docking accuracy (root-mean square deviations of the FlexX pose from the X-ray solution)
- Ranking accuracy, according to seven different scoring functions (FlexX, Gold, Pmf, Dock,
- Possible use of consensus lists
- Hit rate in the top 5 ligands according to single or consensus scoring
Fig.1 Known ERa antagonists in the database
If you are working from a Windows PC, first read how to set up an X Window session with our main Unix machine, cmbi6.cmbi.kun.nl.
From the UNIX shell, change directory to data/bioinf4/docking and start SYBYL by typing
> cd data/bioinf4/docking
> sybyl
The SYBYL menu appears.
1. Loading the X-ray structure of the ERa receptor
File >> Read >>
A Read File window appears (Fig.2). The default directory is the one from which SYBYL has been started.
Fig.2 : Read File window
Select 3ert.mol2 in the upper right menu and click ok.
The X-ray structure of the ligand binding domain of the ERa receptor in complex with
4-hydroxy-tamoxifen is loaded on the screen. Please note that no hydrogen atoms are present. Carbon atoms
of the ligand are colored in green. Red crosses indicate positions of crystallographic water molecules.
2. Extracting the coordinates of the active site.
Extracting the coordinates of the active site is a prerequisite to any virtual screening, in order to
reduce the number of docking solutions for each ligand.
Define the active site as the collection of amino acids lying within 6.5 Å of any ligand atom.
Build/Edit >> Merge >>
The Atom Expression window appears (Fig.3)
1. Switch from Atoms to Monomers in the upper left menu
2. In the Substructures menu, browse the proposed list and select OHT600
3. Click the Sets menu, activate the Sphere option, enter 6.5 as the radius, and confirm
Fig.3 Atom Expression window
Choose m2 as molecule area into which selected atoms are to be merged, and click ok.
The active site (in red) is superimposed on the whole protein. To display the active site alone, select
the Display Area option in the left panel menu (Fig.4)
Fig.4 Left panel menu
In the Display Area menu (left menu on the screen), undisplay the full protein by disabling the
On/Off option (D1 row) of the Mol Display submenu. Quit this window by clicking the Q button.
The active site (in red) with the bound ligand (in green) is now visible.
Give a name to the active site :
Build/Edit >> Modify >> Molecule >> Name >> m2 >> active_site. Click ok.
Remove the ligand and save the coordinates of the active site : Build/Edit >> Delete >> Atom
A new Atom Expression window is displayed (Fig.5)
Fig.5. Atom Expression window
1. Unselect m1 (by clicking the row) and select m2 as the molecule area to work with.
2. Browse the Substructures menu and select OHT600. Confirm the selection by clicking ok.
3. Click ok to close the window.
The ligand has been removed. Now, we save the coordinates of the active site.
File >> Save As
A Save Molecule window is displayed (Fig. 6)
Fig.6 Save Molecule window
1. Unselect m1 (by clicking the row) and select m2 as the molecule area to work with.
2. Choose the PDB format for
3. give a name to the file to save :
4. click on save
Display the full protein again (recall Fig.4) and delete all molecules from the screen
Build/Edit >> Zap(Delete) Molecule >> All >> ok
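The 6.5 Å selection made in step 2 can also be illustrated outside SYBYL. The following is only a minimal sketch in Python, not the procedure SYBYL uses internally: it assumes a standard fixed-column PDB file in which the ligand carries the residue name OHT (inferred from the OHT600 label above), and the function names parse_pdb_atoms and residues_near_ligand are made up for this illustration. A residue is kept as soon as one of its atoms lies within the cutoff of any ligand atom.

```python
import math


def parse_pdb_atoms(lines):
    """Read ATOM/HETATM records from fixed-column PDB lines.

    Returns a list of (res_name, chain, res_seq, (x, y, z)) tuples.
    """
    atoms = []
    for line in lines:
        if line.startswith(("ATOM", "HETATM")):
            atoms.append((
                line[17:20].strip(),           # residue name
                line[21],                      # chain identifier
                int(line[22:26]),              # residue number
                (float(line[30:38]),           # x, y, z in angstroms
                 float(line[38:46]),
                 float(line[46:54])),
            ))
    return atoms


def residues_near_ligand(atoms, ligand_name="OHT", cutoff=6.5):
    """Return the sorted residues having at least one atom within
    `cutoff` angstroms of any ligand atom (the ligand itself is excluded).

    ligand_name="OHT" is an assumption based on the OHT600 label.
    """
    ligand_xyz = [xyz for name, _, _, xyz in atoms if name == ligand_name]
    if not ligand_xyz:
        raise ValueError(f"no atoms found for ligand {ligand_name!r}")

    selected = set()
    for name, chain, res_seq, xyz in atoms:
        if name == ligand_name:
            continue
        if any(math.dist(xyz, lig) <= cutoff for lig in ligand_xyz):
            selected.add((chain, res_seq, name))
    return sorted(selected)
```

With the lines of 3ert.pdb read into a list, residues_near_ligand(parse_pdb_atoms(lines)) would return (chain, residue number, residue name) tuples. Crystallographic waters are not filtered out in this sketch.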
Using FlexX, we will dock a small database of 32 compounds (25 random molecules, 6 known antagonists
and dihydroxy-tamoxifen) into the ERa active site. For each ligand, a single low-energy conformation has been previously
defined by converting 2D into 3D coordinates using CORINA (Gasteiger et al. Tetrahedron Comp.
Method. 1990, 3, 537-547).
Tools >> Docking Suite>> Dock Ligands
The FlexX window is displayed after a few seconds (Fig.7)
Fig.7. FlexX window
Now create a receptor description file (RDF) :
Click on the Create... button, give a name for the file (e.g. er) and confirm by clicking ok. A new
« Create RDF File » window appears (Fig. 8)
1. PDB Filename: click the ... button, select 3ert.pdb in the « Files » window. Confirm the selection by clicking ok
2. Active-Site File: click the ... button, select active_site.pdb in the « Files » window. Confirm the selection by clicking ok
3. Click OK to exit the window
Fig.8 Create RDF File window
After a few seconds, we come back to the previous
FlexX window (Fig.7). Please note that the name given to the RDF file is now selected (Fig.9)
1. Ligands from: select the database file type (mol2).
2. Click the ... button and select database.mol2 in the « Files » window. Confirm the selection by clicking ok.
3. Assign formal charges.
4. Open the « FlexX Details » window and change the default Maximum Number of Poses per Ligand from 30 to 1. This means that only the top solution will be saved for each ligand. Confirm by clicking ok.
5. Activate the Netbatch mode.
6. Step 6 should submit the job. To save time (docking 32 ligands requires ca. 1 h of CPU time), we assume that the job has been submitted and will analyze the results directly.
Fig.9 FlexX window
Exit the FlexX window by selecting
Tools >> Docking Suite>> Analyze results
The FlexX answer browser window is displayed (Fig. 10)
a) FlexX score
click the ... button, select dbflexx as jobfile in the « Sub-Directories » window. Confirm the
selection by clicking ok. The window is updated (Fig. 11) and the list of docked ligands with their
binding energy score (FlexX score) is given
Fig.10 FlexX answer browser window
Fig.11 Updated FlexX answer browser window
- For all but one of the 32 ligands, a docking solution has been found. Clicking the
Show Failed Ligands ... button displays the one ligand that FlexX failed to dock in the ERa active
site. Close the window.
- Clicking the Scores icon ranks the 31 docked ligands according to the FlexX score (in kJ/mol).
Clicking it again toggles between ascending and descending order.
Table 1. FlexX Drugscore ranking of the 31 docked ligands
| Rank | Ligand              | FlexX score (kJ/mol) |
|    1 | NSC__147505         | -81.9 |
|    2 | RALOXIFENE          | -72.0 |
|    3 | 4_hydroxy_tamoxifen | -66.4 |
|    4 | EM_343              | -66.2 |
|    5 | ICI_164384          | -63.3 |
|    6 | dihydroxy_tamoxifen | -59.9 |
|    7 | LY_326315           | -57.1 |
|    8 | NSC__506431         | -49.9 |
|    9 | NAFOXIDENE          | -48.5 |
|   10 | NSC__152522         | -46.0 |
|   11 | NSC__131754         | -44.5 |
|   12 | NSC__88579          | -41.3 |
|   13 | NSC__74751          | -39.0 |
|   14 | NSC__46215          | -38.4 |
|   15 | NSC__2              | -38.3 |
|   16 | NSC__240424         | -38.1 |
|   17 | NSC__102240         | -36.9 |
|   18 | NSC__618129         | -36.9 |
|   19 | NSC__658337         | -34.9 |
|   20 | NSC__679529         | -34.5 |
|   21 | NSC__208922         | -34.3 |
|   22 | NSC__346517         | -34.1 |
|   23 | NSC__176927         | -32.6 |
|   24 | NSC__163127         | -28.7 |
|   25 | NSC__60047          | -28.5 |
|   26 | NSC__118161         | -28.1 |
|   27 | NSC__636713         | -25.5 |
|   28 | NSC__34379          | -23.9 |
|   29 | NSC__276435         | -23.5 |
|   30 | NSC__382147         | -18.4 |
|   31 | NSC__703010         | -16.3 |
Please note that all 6 known ERa antagonists are among the top 9 positions.
Dihydroxy_tamoxifen is ranked 6th.
Nafoxidene scores one place below NSC_506431. Nafoxidene has a protected
hydroxyl and needs to be metabolized before it becomes active.
In this pro-drug state, nafoxidene is not expected to rank very high.
We will see in a few minutes whether other scoring functions may improve on this result.
Close the FlexX answer browser window.
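The hit rate mentioned in the aims can be checked by hand against the ranking table. The short Python sketch below does the same count; it only re-enters the top nine rows of the table, and the helper name hit_rate is hypothetical. Dihydroxy-tamoxifen is deliberately not counted as a known antagonist, since it is the designed test ligand.

```python
# Top nine rows of the FlexX ranking table (ligand name, FlexX score in kJ/mol).
flexx_scores = {
    "NSC__147505": -81.9,
    "RALOXIFENE": -72.0,
    "4_hydroxy_tamoxifen": -66.4,
    "EM_343": -66.2,
    "ICI_164384": -63.3,
    "dihydroxy_tamoxifen": -59.9,
    "LY_326315": -57.1,
    "NSC__506431": -49.9,
    "NAFOXIDENE": -48.5,
}

# The six known ERa antagonists seeded in the database.
known_antagonists = {
    "RALOXIFENE", "4_hydroxy_tamoxifen", "ICI_164384",
    "NAFOXIDENE", "LY_326315", "EM_343",
}


def hit_rate(scores, actives, top_n):
    """Fraction of the top_n best-scored ligands that are known actives.

    The most negative FlexX score is treated as the best one.
    """
    ranked = sorted(scores, key=scores.get)
    top = ranked[:top_n]
    return sum(name in actives for name in top) / top_n


print(hit_rate(flexx_scores, known_antagonists, 5))
print(hit_rate(flexx_scores, known_antagonists, 9))
```

For the top 5 this counts 4 known antagonists out of 5 positions, and for the top 9 it counts 6 out of 9, in line with the remarks above.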
b) Docking accuracy
For two ligands (raloxifene, 4-OH-tamoxifen), a protein-ligand X-ray structure is available.
We can then compare the FlexX pose with the X-ray solution.
Load the protein active site:
File >> Read >> active_site.pdb >> No (Do not center the molecule at screen)
Load the ERa-bound X-ray structure of 4-OH-tamoxifen:
File >> Read >> 4_oh_tamoxifen_xray_pdb >> No
Load the FlexX solution for 4-OH-tamoxifen:
File >> Read >>
Select dbflexx, then 4_hydroxy_tamoxifen.mdb in the upper « Sub-directories » menu and then 4-hydroxy-tamoxifen_001.mol2 in the right « Files » menu.
You can color the ligands by:
View >> Color >> Atoms.. >> Select Molecular Area >> All >> OK >> Choose a Color
Look at the similarities and differences between the two poses.
Answer the following questions for the above-described compounds :
- Is the FlexX conformation similar to the protein-bound X-ray structure ?
- Is the FlexX orientation in agreement with the X-ray solution ?
- Could the FlexX poses be used for lead optimization ?
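Docking accuracy is usually quantified as the root-mean-square deviation (RMSD) between the docked pose and the X-ray pose. The sketch below is a minimal illustration in Python, under two assumptions that are not stated in this tutorial: both poses list the same atoms in the same order, and both are already expressed in the receptor coordinate frame, so no superposition is performed. The function name rmsd is chosen for this example only.

```python
import math


def rmsd(pose_a, pose_b):
    """Root-mean-square deviation between two poses, in the units of the
    input coordinates (angstroms for PDB or mol2 files).

    pose_a, pose_b: sequences of (x, y, z) tuples, one per atom, with
    atoms in the same order (assumption). No fitting is done.
    """
    if len(pose_a) != len(pose_b):
        raise ValueError("poses must contain the same number of atoms")
    if not pose_a:
        raise ValueError("poses must contain at least one atom")

    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(pose_a, pose_b)
    )
    return math.sqrt(squared / len(pose_a))
```

Because the FlexX pose and the X-ray ligand share the frame of the same receptor, skipping the superposition step means the value reflects differences in both conformation and orientation, which is what the questions above ask about.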
c) Re-scoring all hits
The 31 hits docked by FlexX will be rescored by 4 other scoring functions:
- Dock (Ewing et al., J. Comput. Chem. 1997, 18, 1175-1189)
- Gold (Jones et al., J. Mol. Biol. 1997, 267, 727-748)
- Pmf (Muegge et al., J. Med. Chem. 1999, 42, 791-804)
- Chemscore (Eldridge et al. J. Comput-Aided Mol. Des. 1997, 11,425-445)
Dock and Gold use a force-field energy decomposition for calculating interaction energies whereas
Chemscore and FlexX belong to the category of empirical free energy scoring
functions (energy decomposition into various scores to which a coefficient has been assigned).
Pmf uses a statistical potential of mean force.
- Rescoring with the CScore™ module of SYBYL
The CScore™ option computes the FlexX (F_score), Dock (D_score), Gold (G_score), PMF
(P_score) and Chemscore (C_score) scores from a table in which the FlexX docked conformations
have previously been saved.
Delete all molecules from the screen:
Build/Edit >> Zap (Delete) Molecule
Load the table:
File >> Molecular Spreadsheet >> New >> Database >>
Select hits.mdb in the right window where all available databases are listed.
Confirm by clicking the Open button.
A new spreadsheet entitled HITS is displayed on the screen (Fig. 11)
Fig.11 Sybyl Molecular Spreadsheet
The 31 molecules docked by FlexX are listed here in alphabetical order. To run CScore, type at the SYBYL shell
(!! Warning: commands to execute in the bottom shell):
Sybyl>> cscore [ENTER]
Associated receptor mol2 file: protein.mol2
Row expression : * [ENTER]
Five scores (F_score, D_score, G_score, P_score and C_score) have been computed for each row (Fig. 12).
Fig.12. 5 new scores for the 31 potential ligands
The CScore value is a consensus score (from 0 to 5) indicating whether each compound belongs to
the top scorers of the individual lists (5 : always, 4 : 4 out of 5 lists, etc.). Using this simple consensus scoring,
4 out of the 6 known ligands would have been selected with a CScore value of 4 or more.
Note that dihydroxytamoxifen, with a CScore of 4, would be classified as a new potential drug.
Note that the other 2 known active compounds and also some of the random
ligands have a CScore of 3 and are therefore promising drug candidates.
One of these random ligands is NSC_152522 (Fig. 13).
Fig. 13. NSC_152522
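The idea behind the consensus score can be sketched in a few lines of Python. This is only an illustration of vote counting, not the CScore implementation: the criterion SYBYL uses to decide who is a "top scorer" is not given in this tutorial, so the sketch assumes a simple "top N of each list" rule, assumes that a lower score is better for every function, and uses made-up ligand names and score values.

```python
def consensus_votes(score_table, top_n):
    """Count, for each ligand, in how many score lists it is a top scorer.

    score_table: {scoring_function: {ligand: score}}
    top_n: number of best-ranked ligands counted as top scorers in each
           list (assumed rule, not the SYBYL criterion).
    Lower scores are assumed to be better for every scoring function.
    """
    votes = {}
    for scores in score_table.values():
        for ligand in scores:
            votes.setdefault(ligand, 0)
        for ligand in sorted(scores, key=scores.get)[:top_n]:
            votes[ligand] += 1
    return votes


# Hypothetical scores for four ligands and three scoring functions.
toy_scores = {
    "score_1": {"lig_A": -10.0, "lig_B": -8.0, "lig_C": -5.0, "lig_D": -1.0},
    "score_2": {"lig_A": -3.0, "lig_B": -9.0, "lig_C": -7.0, "lig_D": -2.0},
    "score_3": {"lig_A": -6.0, "lig_B": -7.0, "lig_C": -5.0, "lig_D": -1.0},
}

print(consensus_votes(toy_scores, top_n=2))
```

In this toy data, lig_B is among the two best ligands in all three lists and gets the maximum vote, while lig_D never is and gets 0. With five score lists the vote ranges from 0 to 5, as in the CScore column.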
To look at the proposed docked conformation, select the compound in the spreadsheet (click the
corresponding row) and display the molecule at the screen:
File (Spreadsheet menu) >> Put Rows into Molecule Areas
The docked conformation of NSC_152522 is displayed. Load the protein and 4-oh-tamoxifen_xray
as references (as described earlier). The two phenolic groups are H-bonded to the Glu353 and
Thr347 side chains and superimpose very well on the same two moieties of 4-OH-tamoxifen. The 2 aromatic
rings are also well superimposed. Thus, this new compound might be a true hit. You can
label the protein residues by View >> Label >> Substructure >> All
Look at the order of each list :
View (Spreadsheet menu) >> Sort >>
Select a column primary input (e.g. PMF_SCORE)
Select the rank order : Ascending
The table is updated according to the PMF_score ranking (Fig. 15).
The PMF_score corresponds to (but is NOT equal to) the Drugscore we used to generate the docking poses.
Fig. 15 Table listed according to PMF_score (original FlexX Drugscore scores, recall Table 1)
Look at the performance of the five scoring functions (ranking of known ligands )
Are different scores correlated ?
To see the possible correlations between different scores, plot one score versus another one.
Graph (Spreadsheet menu) >> Scatter
X axis :any of the 7 scores
Y axis : any of the seven scores
Z axis : omit
Color : uniform
Press the Create button.
For most of the possible 2D plots, 4 out of the 5 known ligands are well separated from the random
pool. By selecting the Pick Points option in the Spreadsheet, you can click any point and identify
the corresponding hit as well as its structure (Fig.16)
Fig. 16 Scatter plot
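Besides inspecting the scatter plots by eye, the correlation between two score columns can be summarized with a Pearson correlation coefficient. The following Python sketch shows the calculation; the function name pearson is chosen for this example, and the two score lists would have to be exported from the spreadsheet by the reader, since no such values are listed in this tutorial.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long score lists.

    Returns a value between -1 (perfect anti-correlation) and
    +1 (perfect correlation).
    """
    n = len(xs)
    if n != len(ys):
        raise ValueError("both lists must have the same length")
    if n < 2:
        raise ValueError("at least two data points are required")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant list")
    return covariance / math.sqrt(var_x * var_y)
```

A coefficient close to +1 or -1 for two scoring functions suggests that they rank the ligands in nearly the same (or the opposite) order, so combining them in a consensus adds little independent information.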
Delete all molecules and backgrounds:
Build/Edit >> Zap (Delete) Molecule
View >> Delete All Backgrounds
File (Spreadsheet menu) >> Close >> No
Sybyl > exit [ENTER]
!! Warning: Command to execute in the bottom blue shell
FlexX can accurately dock ligands in the binding site of the ERa receptor and discriminate 6 out
of 6 known ligands from a random pool of small molecular-weight molecules.