Several tools use prior biological knowledge to interpret gene expression data. However, existing enrichment tools assume that variables are monotonic and incorrectly measure the distance between periodic phases. As a result, these tools are poorly suited for the analysis of the cell cycle, circadian clock, or other periodic systems. Here, we develop Phase Set Enrichment Analysis (PSEA) to incorporate prior knowledge into the analysis of periodic data. PSEA identifies biologically related gene sets showing temporally coordinated expression. Using synthetic gene sets of various sizes generated from von Mises (circular normal) distributions, we benchmarked PSEA alongside existing methods. PSEA offered enhanced sensitivity over a broad range of von Mises distributions and gene set sizes. Importantly, and unlike existing tools, the sensitivity of PSEA is independent of the mean expression phase of the set. We applied PSEA to 4 published datasets. Application of PSEA to the mouse circadian atlas revealed that several pathways, including those regulating immune and cell-cycle function, demonstrate temporal orchestration across multiple tissues. We then applied PSEA to the phase shifts following a restricted feeding paradigm. We found that this perturbation disrupts intraorgan metabolic synchrony in the liver, altering the timing between anabolic and catabolic pathways. Reanalysis of expression data using custom gene sets derived from recent ChIP-seq results revealed circadian transcriptional targets bound exclusively by CLOCK, independently of BMAL1, differ from other exclusive circadian output genes and have well-synchronized phases. Finally, we used PSEA to compare 2 cell-cycle datasets. PSEA increased the apparent biological overlap while also revealing evidence of cell-cycle dysregulation in these cancer cells. To encourage its use by the community, we have implemented PSEA as a Java application. In sum, PSEA offers a powerful new tool to investigate large-scale, periodic data for biological insight.
Keywords: Kuiper; biological rhythms; cell cycle; circadian; gene expression; oscillatory; overrepresentation; periodic; phase set enrichment analysis.
© 2016 The Author(s).