Conserved Co-expression for Candidate Disease Gene Prioritization - Web Data
This website provides data sets for the paper:
Oti M., Van Reeuwijk J., Huynen M.A., Brunner H.G.: Conserved Co-expression for Candidate Disease Gene Prioritization
See the paper for further details.
Supplementary Data
These are the conserved co-expression data sets and the disease gene prioritization
(ranking) results for the analyses presented in the paper. The conserved co-expression data
sets are based on euKaryotic clusters of Orthologous Groups (KOGs) and contain, for each
KOG pair, co-expression scores averaged over the genes in that pair.
The expression data and calculation procedures used are described in the article.
Conserved co-expression data sets
These files contain Spearman correlation coefficients of co-expression between KOG pairs,
based on human-only and conserved co-expression. In addition to the coefficients derived
from the GEO and GNF expression data sets, those based on the Stuart et al. 2003
expression data (see article for further details) are also included.
The data sets for pairwise species comparisons are not included due to space limitations.
The files are gzipped tab-separated text files. The fields are:
- KOG 1
- KOG 2
- Mean Spearman correlation coefficient for KOG pair across all species in data set
- Data sets used, space-separated. Format: <species>_<platform>
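As an illustration of the field layout above, here is a minimal Python sketch for reading one of these files. It assumes that the four fields appear in the order listed, and it skips any line whose third field is not a number (for example a header line, if one exists). The names KogPair, parse_coexpression, read_coexpression and split_dataset are hypothetical helpers, not part of the published data, and the sample KOG IDs and data set tokens are invented for demonstration.

```python
import csv
import gzip
import io
from typing import Iterator, List, NamedTuple, TextIO, Tuple


class KogPair(NamedTuple):
    kog1: str
    kog2: str
    mean_rho: float      # mean Spearman correlation coefficient for the KOG pair
    datasets: List[str]  # tokens of the form <species>_<platform>


def parse_coexpression(handle: TextIO) -> Iterator[KogPair]:
    """Yield one KogPair per tab-separated line with at least four fields."""
    for row in csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE):
        if len(row) < 4:
            continue
        try:
            rho = float(row[2])
        except ValueError:
            continue  # assumption: non-numeric lines (e.g. a header) are skipped
        # The fourth field is space-separated, so split on whitespace.
        yield KogPair(row[0], row[1], rho, row[3].split())


def read_coexpression(path: str) -> Iterator[KogPair]:
    """Stream records from a gzipped file without unpacking it on disk."""
    with gzip.open(path, "rt", newline="") as handle:
        yield from parse_coexpression(handle)


def split_dataset(token: str) -> Tuple[str, str]:
    """Split a <species>_<platform> token at the first underscore."""
    species, _, platform = token.partition("_")
    return species, platform


# Invented sample lines, for demonstration only.
sample = "KOG0001\tKOG0002\t0.42\tHs_GNF Dm_GEO\nKOG0001\tKOG0003\t-0.10\tHs_GNF\n"
pairs = list(parse_coexpression(io.StringIO(sample)))
```

Because the files are tens of megabytes when compressed, iterating over read_coexpression(path) record by record keeps memory use low compared with loading a whole file at once.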
Data sets:
- Human only (38MB)
- Human, Fly, Worm, Yeast (one or more of four species) (77MB)
- Human, Fly, Worm, Yeast, Stuart et al. data (one or more of four species) (65MB)
- Human, Mouse, Fly, Worm, Yeast (one or more of five species) (80MB)
Candidate disease gene ranking results
- Disease gene rankings based on KOG-KOG co-expression (conserved co-expression): This is
a zipped file containing the disease gene rankings for the disease loci based on
co-expression between KOGs (i.e. conserved co-expression). The rankings for the pairwise
species comparisons are also included. Each filename indicates the species from which
the underlying conserved co-expression data set was derived (Hs, Dm, Ce and Sc for human,
fly, worm and yeast, respectively). Because these results are based on KOG-KOG
co-expression, each disease gene is represented by the ID of the KOG to which it belongs.
In each of the files the columns are:
- Disease name
- Known disease gene (KOG ID)
- Target candidate locus
- Correct disease gene in target candidate locus (KOG ID)
- Number of genes that could be mapped to KOGs in candidate locus
- Relative rank of correct disease gene in target candidate locus
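The six columns above can be read into typed records with a short Python sketch like the following. It assumes the ranking files are tab-separated, like the co-expression files (this page does not state the delimiter for the ranking files), that column 5 is an integer and column 6 a number, and that any line not matching this pattern can be skipped. RankingRow, parse_rankings and group_by_disease are hypothetical names, and the sample line is invented. How the relative rank is defined is described in the paper; the sketch stores it as given.

```python
import csv
import io
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, NamedTuple, TextIO


class RankingRow(NamedTuple):
    disease: str
    known_kog: str        # known disease gene (KOG ID)
    target_locus: str
    correct_kog: str      # correct disease gene in the target locus (KOG ID)
    n_mapped_genes: int   # genes in the locus that could be mapped to KOGs
    relative_rank: float  # stored as given; see the paper for its definition


def parse_rankings(handle: TextIO, delimiter: str = "\t") -> Iterator[RankingRow]:
    """Yield one RankingRow per line that has six well-formed columns."""
    for row in csv.reader(handle, delimiter=delimiter, quoting=csv.QUOTE_NONE):
        if len(row) < 6:
            continue
        try:
            n_genes = int(row[4])
            rank = float(row[5])
        except ValueError:
            continue  # assumption: malformed or header lines are skipped
        yield RankingRow(row[0], row[1], row[2], row[3], n_genes, rank)


def group_by_disease(rows: Iterable[RankingRow]) -> Dict[str, List[RankingRow]]:
    """Collect rows per disease name, preserving file order within each disease."""
    grouped: Dict[str, List[RankingRow]] = defaultdict(list)
    for row in rows:
        grouped[row.disease].append(row)
    return dict(grouped)


# Invented sample lines, for demonstration only.
sample = (
    "Example disease\tKOG0001\tlocusA\tKOG0002\t25\t0.08\n"
    "Example disease\tKOG0002\tlocusB\tKOG0001\t40\t0.5\n"
)
rankings = group_by_disease(parse_rankings(io.StringIO(sample)))
```

If the files turn out to use a different separator, pass it through the delimiter argument rather than editing the parser.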
Supporting data files
- KOG-to-gene mappings: This is a zipped file containing the KOG-to-gene mappings, both
for the expression sets used in this study and for those from the Stuart et al. 2003
study. The gene IDs used are those from the original data sets (thus human genes are
represented by HGNC gene symbols for the GNF human expression data set, and by RefSeq
IDs for the Stuart et al. data set).
- Disease loci: This is a zipped file containing the disease loci used in this study.
Genes and their corresponding KOGs are provided in separate files, as a few analyses
used gene co-expression while most used KOG co-expression. In each of the files the
columns are:
- Disease name
- Known disease gene (or equivalent KOG ID)
- Target candidate locus
- Genes within target candidate locus (or equivalent KOG IDs)
Last updated: 3 April 2008
by Martin Oti. E-mail address: M.Oti at cmbi.ru.nl