Using BioPAX-Parser (BiP) to enrich lists of genes or proteins with pathway data

BMC Bioinformatics. 2021 Sep 30;22(Suppl 13):376. doi: 10.1186/s12859-021-04297-z.

Abstract

Background: Pathway enrichment analysis (PEA) is a well-established methodology for interpreting a list of genes and proteins of interest related to a condition under investigation. This paper extends our previous work, which introduced a preliminary comparative analysis of pathway enrichment analysis tools, by providing more case studies and by comparing the enrichment performance of BiP with that of other well-known PEA software tools.

Methods: PEA uses pathway information to connect a list of genes and proteins to biological mechanisms, helping researchers explain lists of biological entities of interest that would otherwise remain disconnected from their biological context.

Results: We compared the results of BiP with those of existing pathway enrichment analysis tools, namely Centrality-based Pathway Enrichment, pathDIP, and Signaling Pathway Impact Analysis, on three cancer types (colorectal, endometrial, and thyroid), using six datasets in total (two per cancer type) obtained from The Cancer Genome Atlas and Gene Expression Omnibus databases. For each cancer type, we measured the similarity, in terms of overlap, between the enrichment results obtained from its two datasets.

Conclusion: BiP identified some well-known pathways related to each investigated cancer type, as validated by the available literature. We also used the Jaccard and meet-min indices to evaluate the stability and similarity of the enrichment results obtained from each pair of datasets related to the same cancer. The results show that BiP provides more stable enrichment results than the other tools compared.
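
The two overlap indices named above can be illustrated with a minimal Python sketch. It uses the standard set-based definitions (Jaccard: intersection size over union size; meet-min: intersection size over the size of the smaller set), which may differ in detail from how the paper computes them. The function names and the example pathway lists are hypothetical and are not taken from the paper or from BiP.

```python
def jaccard_index(a, b):
    """Jaccard index: |A intersect B| / |A union B|; 0.0 if both sets are empty."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def meet_min_index(a, b):
    """Meet-min index: |A intersect B| / min(|A|, |B|); 0.0 if either set is empty."""
    a, b = set(a), set(b)
    smaller = min(len(a), len(b))
    if smaller == 0:
        return 0.0
    return len(a & b) / smaller


# Hypothetical enriched-pathway lists for two datasets of the same cancer type
# (illustrative names only, not results from the paper).
dataset_1_pathways = {"Wnt signaling", "p53 signaling", "Cell cycle", "MAPK signaling"}
dataset_2_pathways = {"Wnt signaling", "Cell cycle", "Apoptosis"}

jaccard = jaccard_index(dataset_1_pathways, dataset_2_pathways)
meet_min = meet_min_index(dataset_1_pathways, dataset_2_pathways)

print(f"Jaccard:  {jaccard:.3f}")   # 2 shared / 5 in the union
print(f"Meet-min: {meet_min:.3f}")  # 2 shared / 3 in the smaller list
```

Both indices range from 0 (no shared pathways) to 1. The meet-min index reaches 1 whenever the smaller list is fully contained in the larger one, so it is less sensitive than the Jaccard index to differences in list length.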

Keywords: Biological pathway; Pathway databases; Pathway enrichment analysis; Statistical analysis.

MeSH terms

  • Computational Biology
  • Databases, Factual
  • Gene Expression Profiling
  • Humans
  • Neoplasms* / genetics
  • Proteins / genetics
  • Signal Transduction
  • Software*

Substances

  • Proteins