High-throughput transcriptome sequencing allows identification of cancer-related changes that occur at the stages of transcription, pre-messenger RNA (mRNA), and splicing. In the current study, we devised a pipeline to predict novel alternative splicing (AS) variants from high-throughput transcriptome sequencing data and applied it to large sets of tumor transcriptomes from The Cancer Genome Atlas (TCGA). We identified two novel tumor-associated splice variants of matriptase, a known cancer-associated gene, in the transcriptome data from epithelial-derived tumors but not normal tissue. Most notably, these variants were found in 69% of lung squamous cell carcinoma (LUSC) samples studied. We confirmed the expression of matriptase AS transcripts using quantitative reverse transcription PCR (qRT-PCR) in an orthogonal panel of tumor tissues and cell lines. Furthermore, flow cytometric analysis confirmed surface expression of matriptase splice variants in Chinese hamster ovary (CHO) cells transiently transfected with cDNA encoding the novel transcripts. Our findings further implicate matriptase in contributing to oncogenic processes and suggest potential novel therapeutic uses for matriptase splice variants.
Keywords: RNA sequencing; alternative splicing; de novo assembly; epithelial tumors; matriptase.