An improved distance measure between the expression profiles linking co-expression and co-regulation in mouse

Ryung S Kim; Hongkai Ji; Wing H Wong

doi:10.1186/1471-2105-7-44

BMC Bioinformatics. 2006 Jan 26:7:44. doi: 10.1186/1471-2105-7-44.

Authors

Ryung S Kim¹, Hongkai Ji, Wing H Wong

Affiliation

¹ Department of Neurology, Harvard Medical School, Boston, MA 02115, USA. ryung_kim@dfci.harvard.edu

Abstract

Background: Many statistical algorithms combine microarray expression data and genome sequence data to identify transcription factor binding motifs in the genomes of lower eukaryotes. Finding cis-regulatory elements in higher eukaryote genomes, however, remains a challenge, as searching the promoter regions of genes with similar expression patterns often fails. The difficulty is partly attributable to the poor performance of the similarity measures used to compare expression profiles. The widely accepted measures are inadequate for distinguishing genes transcribed under distinct regulatory mechanisms in the complex genomes of higher eukaryotes.

Results: By defining the regulatory similarity between a gene pair as the number of common known transcription factor binding motifs in the promoter regions, we compared the performance of several expression distance measures on seven mouse expression data sets. We propose a new distance measure that accounts for both the linear trends and fold-changes of expression across the samples.
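The definition of regulatory similarity used here, a count of shared known binding motifs between two promoters, can be illustrated with a minimal sketch. The gene names, motif identifiers, and the function name `regulatory_similarity` are hypothetical placeholders, not taken from the paper, and the sketch assumes motif occurrences have already been mapped to each promoter.

```python
from itertools import combinations


def regulatory_similarity(motifs_a, motifs_b):
    """Count the binding motifs that occur in both promoter regions."""
    return len(set(motifs_a) & set(motifs_b))


# Hypothetical gene -> motif annotations, for illustration only.
promoter_motifs = {
    "geneA": {"M1", "M2", "M3"},
    "geneB": {"M2", "M3", "M4"},
    "geneC": {"M5"},
}

# Regulatory similarity for every unordered gene pair.
pair_similarity = {
    (g1, g2): regulatory_similarity(promoter_motifs[g1], promoter_motifs[g2])
    for g1, g2 in combinations(sorted(promoter_motifs), 2)
}
```

Under this definition, gene pairs can be ranked by an expression distance and then checked for whether the closest pairs also share many motifs.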

Conclusion: The study shows that the proposed distance measure for comparing expression profiles enables us to identify genes with a large number of common regulatory elements, because it reflects the inherent regulatory information better than widely accepted distance measures such as Pearson's correlation or cosine correlation, with or without log transformation.
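For orientation, the baseline measures named above can be sketched as follows. This shows only the conventional measures, not the proposed one, whose formula is not given in this abstract. The function names, the choice of a base-2 logarithm, and the conversion of a correlation to a distance as one minus the correlation are assumptions made for illustration.

```python
import math


def cosine_correlation(x, y):
    """Uncentered correlation: cosine of the angle between two profiles."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


def pearson_correlation(x, y):
    """Pearson's correlation: cosine correlation of mean-centered profiles."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return cosine_correlation([a - mean_x for a in x], [b - mean_y for b in y])


def distance(x, y, measure, log_transform=False):
    """Assumed convention: distance = 1 - correlation, optionally on log2 values."""
    if log_transform:
        x = [math.log2(a) for a in x]
        y = [math.log2(b) for b in y]
    return 1.0 - measure(x, y)


# Hypothetical profiles: the second gene is expressed at twice the level of the first.
profile_1 = [1.0, 2.0, 4.0, 8.0]
profile_2 = [2.0, 4.0, 8.0, 16.0]

d_pearson = distance(profile_1, profile_2, pearson_correlation)
d_cosine_log = distance(profile_1, profile_2, cosine_correlation, log_transform=True)
```

In this toy case the Pearson distance is essentially zero even though the two genes differ two-fold at every sample, while the cosine distance on log values is small but nonzero. The sketch does not guard against constant or all-zero profiles, which would cause a division by zero.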

Publication types

Research Support, N.I.H., Extramural

MeSH terms

Algorithms*
Animals
Chromosome Mapping / methods*
Gene Expression Regulation / physiology*
Linkage Disequilibrium
Mice
Multigene Family / genetics*
Oligonucleotide Array Sequence Analysis / methods*
Promoter Regions, Genetic / genetics
Transcription Factors / chemistry*
Transcription Factors / metabolism*

Substances

Transcription Factors
