Background: The advent of next-generation sequencing has allowed transcriptomes to be profiled with unprecedented accuracy, but the high cost of full-length mRNA sequencing has limited the accessibility and scalability of the technology. To address this, we developed 3'Pool-seq: a simple, cost-effective, and scalable RNA-seq method that focuses sequencing on the 3' end of mRNA. We drew on aspects of SMART-seq, Drop-seq, and TruSeq to implement an easy workflow, and optimized parameters such as input RNA concentration, tagmentation conditions, and read depth specifically for bulk RNA.
Results: Thorough optimization resulted in a protocol that takes less than 12 h to perform, does not require custom sequencing primers or instrumentation, and cuts over 90% of the costs associated with TruSeq, while still achieving accurate gene expression quantification (Pearson's correlation coefficient with ERCC theoretical concentration, r = 0.96) and detection of differentially expressed genes (ROC analysis of 3'Pool-seq against TruSeq, AUC = 0.921). The 3'Pool-seq dual indexing scheme was further adapted for a 96-well plate format, and ERCC spike-ins were used to correct for potential row or column pooling effects. Transcriptional profiling of troglitazone and pioglitazone treatments at multiple doses and time points in HepG2 cells was then used to show that 3'Pool-seq can distinguish the two molecules based on their molecular signatures.
Conclusions: 3'Pool-seq can accurately detect gene expression at a level on par with TruSeq, at one tenth of the total cost. Furthermore, its unprecedented TruSeq/Nextera hybrid indexing scheme and streamlined workflow can be applied in several different formats, including 96-well plates, which allows users to thoroughly evaluate biological systems under several conditions and time points. Care must be taken with experimental design and plate layout so that potential pooling effects can be accounted for and corrected. Lastly, further studies could use multiple sets of ERCC spike-ins to simulate differential gene expression in a system with known ground-state values.
Keywords: 3'Pool-seq; 3'-RNA sequencing; Differential gene expression; Next-generation sequencing; RNA-seq; Transcriptomics.