Motivation: In cancer, chromosomal imbalances such as amplifications and deletions, or changes in epigenetic mechanisms such as DNA methylation, influence transcriptional activity. These alterations are often not limited to a single gene but affect several genes within a genomic region and may be relevant for disease status. For example, the ERBB2 amplicon (17q21) in breast cancer is associated with poor patient prognosis. We present a general, unsupervised method that uses genome-wide gene expression data to systematically detect tumor patients with chromosomal regions of distinct transcriptional activity. The method searches for expression patterns of adjacent genes with a consistently decreased or increased level of gene expression in tumor samples. Such patterns have been found to be associated with chromosomal aberrations and clinical parameters such as tumor grade, and thus can be useful for risk stratification or therapy.
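To make the idea of "adjacent genes with consistently decreased or increased expression" concrete, the following is a minimal illustrative sketch in Python. It is not the authors' R implementation and makes several assumptions that do not come from the abstract: genes are rows ordered by chromosomal position, each gene is centred on its median across samples, and a fixed cut-off (`threshold`) and minimum run length (`min_len`) define a pattern. All function and parameter names are hypothetical.

```python
from statistics import median


def centre_by_gene(matrix):
    """Subtract each gene's median across samples (assumed normalisation).

    matrix[g][s] is the expression of gene g in sample s; genes are
    assumed to be ordered by chromosomal position.
    """
    centred = []
    for row in matrix:
        m = median(row)
        centred.append([x - m for x in row])
    return centred


def find_runs(values, threshold=1.0, min_len=3):
    """Return (start, end, direction) for runs of adjacent genes.

    A run is a stretch of at least min_len consecutive genes whose centred
    expression is above +threshold (direction 1) or below -threshold
    (direction -1). Indices are inclusive. threshold is assumed >= 0.
    """
    runs = []
    start = None
    direction = 0
    # A trailing 0.0 sentinel closes a run that reaches the last gene.
    for i, v in enumerate(list(values) + [0.0]):
        d = 1 if v > threshold else -1 if v < -threshold else 0
        if d != direction:
            if direction != 0 and i - start >= min_len:
                runs.append((start, i - 1, direction))
            start = i
            direction = d
    return runs


def runs_per_sample(matrix, threshold=1.0, min_len=3):
    """Apply find_runs to every sample (column) of a genes x samples matrix."""
    centred = centre_by_gene(matrix)
    n_samples = len(matrix[0])
    return {
        s: find_runs([row[s] for row in centred], threshold, min_len)
        for s in range(n_samples)
    }
```

In this sketch, a sample whose profile along one chromosome arm contains a run with direction 1 would be a candidate for a regional gain, and direction -1 a candidate for a regional loss. The actual method may use a different statistic and a different definition of consistency.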
Results: Our approach was applied to 12 independent human breast cancer microarray studies comprising 1422 tumor samples. We prioritized chromosomal regions and genes that were found predominantly across all studies. The results highlighted not only regions that are well known to be amplified, such as 17q21 and 11q13, but also others, such as 8q24 (distal to MYC) and 17q24-q25, which may harbor novel putative oncogenes. Since our approach can be applied to any microarray study, it may become a valuable tool for exploring transcriptional changes in diverse disease types.
Availability: The R source code implementing the method and an example analysis are available at http://www.dkfz.de/mga2/people/buness/CTP/.
Supplementary information: Supplementary data are available at Bioinformatics online.