ADME Tutorial: Lipinski's rule of five

By Scott Lusher, Lars Ridder and G. Schaftenaar

ADME stands for Absorption, Distribution, Metabolism and Excretion. All xenobiotics (molecules that are foreign to the body) are subject to these natural processes in the human body. Absorption determines whether the drug gets into the body (by crossing the gastro-intestinal tract), distribution determines whether the drug can reach its site of action, metabolism determines whether the drug is broken down before it reaches its site of action, and excretion determines whether the drug is eliminated before, or too soon after, it becomes active.
ADME plays an increasingly important role in the drug design process. Traditionally, ADME tools were applied at the end of the drug development pipeline; nowadays ADME is considered at an early stage of drug design, so that molecules with poor ADME properties are removed from the pipeline early.




The goal of this tutorial is to establish Lipinski's rule of five, which predicts that poor absorption or permeation of potential drug candidates is more likely if:

- number of H-bond donors > 5
- number of H-bond acceptors > 10
- molecular weight > ?
- ClogP > ?

Here ClogP is the calculated logarithm of the octanol/water partition coefficient, a measure of the lipophilicity of a compound. LogP is an important physicochemical parameter for oral absorption, since it relates to solubility and influences the ability of a compound to permeate cell membranes, including those of the intestinal epithelial cells. Compounds that are too hydrophilic (negative logP) cannot pass through membranes, as they hardly enter the hydrophobic interior of the lipid bilayer (mimicked by the octanol phase in the octanol/water system). Compounds that are too lipophilic (high logP) tend to be insoluble and also permeate membranes poorly, as they get stuck in the lipid bilayer. Molecular weight is an important parameter indicating the size of the molecule. Molecules that are too large have difficulty diffusing through membranes.

In this practical we will derive the favorable ranges for molecular weight and ClogP (the last two criteria of Lipinski's rule of five) ourselves.
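As a minimal sketch of how the four criteria combine, the Python function below lists the criteria that a single compound violates. The function name, the argument names and the `mw_max` and `clogp_max` parameters are illustrative assumptions and are not part of Sybyl; the two cutoffs are left as arguments because they are exactly what this practical asks you to derive.

```python
def rule_of_five_violations(h_donors, h_acceptors, mol_weight, clogp,
                            mw_max, clogp_max):
    """Return the names of the rule-of-five criteria that a compound violates.

    mw_max and clogp_max are the upper limits derived in this practical;
    they are deliberately not hard-coded here.
    """
    violations = []
    if h_donors > 5:
        violations.append("H-bond donors")
    if h_acceptors > 10:
        violations.append("H-bond acceptors")
    if mol_weight > mw_max:
        violations.append("molecular weight")
    if clogp > clogp_max:
        violations.append("ClogP")
    return violations
```

An empty list means that the compound passes all four criteria for the cutoffs supplied.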

First you will look at the distribution of ClogP in a database of drugs compiled from the WDI (World Drug Index).


Set up the working environment

If necessary, check the X-windows start-up page for detailed instructions on how to set up the X-windows environment and to access the CMBI's main Unix machine, cheminf.cmbi.kun.nl. Then, from the Unix shell (command prompt):




Read in the database of compounds used by Lipinski to derive the rule-of-5 (see the paper from 1997):

File >> Molecular Spreadsheet >> Open >> Lipinski.tbl

Next we have to add a column with calculated CLOGP values to the spreadsheet:

In the spreadsheet window: select a new column >> Autofill >> CLOGP >> OK

Two new columns will appear, named CLOGP and CLOGP_ERR. The CLOGP_ERR column contains an error code which indicates possible problems encountered during the calculation of the CLOGP value for the molecules.

Create a histogram of the CLOGP distribution of this database:

Select column CLOGP, Graph >> Histogram >> Bin Range -8 .. 10 NBins = 18 >> Bin Method: Fraction >> activate List Bin Values >> Create
Where: D1, Title: CLOGP
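To make the histogram settings concrete, here is a small Python sketch of what a fraction-binned histogram with range -8 .. 10 and 18 bins computes. It is not Sybyl's implementation: the function name is hypothetical, the example values are invented, and the handling of edge and out-of-range values (values at the upper bound go into the last bin, out-of-range values are counted in the total but not binned) is an assumption.

```python
def histogram_fractions(values, low, high, n_bins):
    """Return (bin_start, bin_end, fraction) for equal-width bins on [low, high]."""
    if not values:
        raise ValueError("values must not be empty")
    width = (high - low) / n_bins
    counts = [0] * n_bins
    for v in values:
        if low <= v <= high:
            # A value equal to `high` is placed in the last bin (assumption).
            index = min(int((v - low) / width), n_bins - 1)
            counts[index] += 1
    total = len(values)
    return [(low + i * width, low + (i + 1) * width, counts[i] / total)
            for i in range(n_bins)]


# Invented ClogP values, for illustration only (not taken from Lipinski.tbl).
example = [-1.5, 0.2, 0.7, 2.3, 2.9, 3.1, 4.8, 6.2]
for start, end, fraction in histogram_fractions(example, -8, 10, 18):
    if fraction > 0:
        print(f"{start:5.1f} .. {end:5.1f}: {fraction:.3f}")
```

With 18 bins over a range of 18 log units, each bin is exactly one log unit wide, which makes the listed fractions easy to read off.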


Look at the distribution and the fractions given for each bin (reported in the console window). Now choose an upper limit such that about 10% of the compounds fall above it. In Lipinski's study only an upper limit for LogP was considered. Would there be a good reason to look at a lower limit as well? What value would you choose for the lower limit, e.g. such that between 5% and 10% of the compounds fall below it?
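Choosing a limit from the listed fractions amounts to adding up the fractions in the tail of the distribution. The Python sketch below does the same on raw values, trying candidate limits in increasing order. The function names and the example values are hypothetical; the candidate limits are the bin edges of the histogram above.

```python
def fraction_above(values, limit):
    """Fraction of values strictly greater than `limit`."""
    return sum(1 for v in values if v > limit) / len(values)


def fraction_below(values, limit):
    """Fraction of values strictly smaller than `limit`."""
    return sum(1 for v in values if v < limit) / len(values)


def smallest_limit_within(values, candidates, max_fraction_above):
    """Return the smallest candidate limit that leaves at most
    `max_fraction_above` of the values above it, or None if there is none."""
    for limit in sorted(candidates):
        if fraction_above(values, limit) <= max_fraction_above:
            return limit
    return None


# Invented values, for illustration only.
values = [-2.0, -0.5, 0.3, 1.1, 1.8, 2.4, 2.9, 3.2, 3.6, 7.4]
bin_edges = range(-8, 11)  # edges of 18 unit-wide bins from -8 to 10
upper = smallest_limit_within(values, bin_edges, 0.10)
print("upper limit:", upper, "fraction above:", fraction_above(values, upper))
print("fraction below -1:", fraction_below(values, -1))
```

The same functions can be applied to a lower limit, or to the molecular weight column later in this practical.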




Now clear all histograms:

View >> Delete All Backgrounds

Secondly you will look at the distribution of the molecular weight in the database of known drugs.

First we have to add to the spreadsheet a column with calculated molecular weight values:

In the spreadsheet window: select a new column >> Autofill >> MOL_WEIGHT >> OK

Create a histogram of the molecular weight distribution of this database:

Select column MOL_WEIGHT, Graph >> Histogram >> Bin Range 0-1000 NBins = 10 >> Bin Method: Fraction >> activate List Bin Values >> Create
Where: D2, Title: MW


Look at the distribution and the fractions given for each bin (given in the console window). Now choose an upper limit, such that between 5% and 10% of the compounds fall above this limit.

To see the two histograms separately:

Display Options icon (left column main Sybyl window, third from the top) >> in the Display Options window, select Half >> Q

After you have looked at these histograms, restore the old situation:

Display Options icon >> in the Display Options window, select Full >> Q





Now let's use the rules just established to filter molecules with poor ADME properties out of a mixed database of non-drugs and a few known drugs.
Let's see how many of the known drugs for the estrogen receptor (also known as SERMs, Selective Estrogen Receptor Modulators) we can retrieve by applying different filters. First, start with a clean Sybyl session: close all windows and delete all molecules in memory:

Close the spreadsheet windows by File >> Close >> No.
Delete all molecules: Build/Edit >> Zap (Delete) Molecule >> All

Now let's read in the mixed database:

File >> Molecular Spreadsheet >> Open >> 500.tbl


Now apply the filter:

Tools (in the top Sybyl menu bar) >> Selector >> Compound Filtering ...
The Compound Filtering window opens.

Change Input source to Table, check that the database 500 is selected. Now specify names for the two result sets, the molecules that pass the filter and the molecules that are filtered out:

Put excluded compounds in: excluded
Put passing compounds in: passed

Change the corresponding file types to Hitlist.

Specify the following filter ranges:

mol. weight: the upper limit you have derived
ClogP:       the upper and lower limits you have derived
H donors:    0 - 5
H acceptors: 0 - 10

Click OK

The filtering is done when you see the text dbslnfilter complete in the Sybyl textport. Write down the number of compounds filtered out.
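The logic of such a range filter can be sketched in a few lines of Python. This is not how Sybyl's dbslnfilter is implemented: the function names, dictionary keys and example compounds are hypothetical, and treating the range bounds as inclusive is an assumption. Only the H donor and H acceptor ranges are filled in, since the other limits are the ones you derived yourself.

```python
def passes(compound, ranges):
    """True if every filtered property lies within its inclusive (low, high) range."""
    return all(low <= compound[name] <= high
               for name, (low, high) in ranges.items())


def partition(compounds, ranges):
    """Split compounds into (passed, excluded) lists, preserving their order."""
    passed, excluded = [], []
    for compound in compounds:
        (passed if passes(compound, ranges) else excluded).append(compound)
    return passed, excluded


# Ranges taken from the filter settings above. Add "mol_weight" and "clogp"
# entries with the limits you derived.
ranges = {"h_donors": (0, 5), "h_acceptors": (0, 10)}

# Invented compounds, for illustration only.
compounds = [
    {"name": "cmpd_1", "h_donors": 2, "h_acceptors": 4},
    {"name": "cmpd_2", "h_donors": 7, "h_acceptors": 4},
    {"name": "cmpd_3", "h_donors": 1, "h_acceptors": 12},
]
passed, excluded = partition(compounds, ranges)
print(len(passed), "passed;", len(excluded), "excluded")
```

A compound is excluded as soon as one property falls outside its range, so the number of excluded compounds grows with every range you add.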

Now let's have a look at the compounds that were filtered out:

File >> Molecular Spreadsheet >> Open >> select format UNITY Hitlist >> select excluded

Identify which of the SERMs were excluded.

Select row with SERM >> Show Row Selected.

Can you determine which of the filters we specified this SERM did not satisfy?

Read the answer and explanation here.

We shall now repeat the filtering with CLogD instead of CLogP.




ClogP is often used when screening large databases because it is very fast to calculate. However, the lipophilicity of ionizable molecules depends on their charge state at the relevant pH, which is accounted for in LogD but not in LogP.
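The difference between LogP and LogD can be illustrated with the standard textbook relation for a compound with a single ionizable group, under the assumption that only the neutral form partitions into octanol. The Python sketch below is not the method used to compute the CLogD values in this practical; the function name and the example compound (logP 4.0, pKa 8.4) are hypothetical.

```python
import math


def log_d(log_p, pka, ph, kind):
    """Estimate logD of a monoprotic acid or base at a given pH.

    Assumes that only the neutral species enters the octanol phase.
    """
    if kind == "acid":
        ionized_ratio = 10 ** (ph - pka)   # [A-] / [HA]
    elif kind == "base":
        ionized_ratio = 10 ** (pka - ph)   # [BH+] / [B]
    else:
        raise ValueError("kind must be 'acid' or 'base'")
    return log_p - math.log10(1 + ionized_ratio)


# Invented basic compound, for illustration only.
for ph in (2.0, 6.4, 7.4, 10.0):
    print(f"pH {ph:4.1f}: logD = {log_d(4.0, 8.4, ph, 'base'):.2f}")
```

For this invented base, the estimate at pH 6.4 lies about two log units below the logP, while at a pH well above the pKa the two values nearly coincide.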
Sybyl cannot calculate CLogD, and its compound filtering cannot filter on CLogD either, so we must filter on CLogD in another way:

Read in a new spreadsheet containing the CLogD data, calculated at pH 6.4:

File >> Molecular Spreadsheet >> Open >> 500_logD.tbl


In the spreadsheet:

Select Rows >> Select Rows by: Range >> Column: ClogD; From -1 .. 5 >> Add
Column: Molecular Weight; From 0 .. 500 >> Refine
Column: Acceptor Count; From 0 .. 10 >> Refine
Column: Donor Count; From 0 .. 5 >> Refine
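The four selection steps above can be read as set operations on row numbers. The Python sketch below assumes that Add forms the union with the current (initially empty) selection and that Refine keeps only the rows already selected; this reading of the Sybyl buttons is an assumption, and the table rows are invented. The ranges are the ones given in the steps above.

```python
def rows_in_range(table, column, low, high):
    """Return the set of row indices whose value in `column` lies in [low, high]."""
    return {i for i, row in enumerate(table) if low <= row[column] <= high}


# Invented rows, for illustration only; the keys mirror the spreadsheet columns.
table = [
    {"ClogD": 2.1, "Molecular Weight": 371.5, "Acceptor Count": 2, "Donor Count": 0},
    {"ClogD": 6.3, "Molecular Weight": 410.0, "Acceptor Count": 3, "Donor Count": 1},
    {"ClogD": 0.4, "Molecular Weight": 612.7, "Acceptor Count": 12, "Donor Count": 6},
]

selection = set()
selection |= rows_in_range(table, "ClogD", -1, 5)               # Add
selection &= rows_in_range(table, "Molecular Weight", 0, 500)   # Refine
selection &= rows_in_range(table, "Acceptor Count", 0, 10)      # Refine
selection &= rows_in_range(table, "Donor Count", 0, 5)          # Refine
print(sorted(selection))
```

Printing the selection size after each step shows which property removes the most rows.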

Check whether the SERMs are included in the selection.

What is the most "critical" property for oral absorption in this dataset?

What is the effect of ionization on the LogD?

Which pH would be most relevant for oral absorption?