CliqueMS: a computational tool for annotating in-source metabolite ions from LC-MS untargeted metabolomics data based on a coelution similarity network

Oriol Senan; Antoni Aguilar-Mogas; Miriam Navarro; Jordi Capellades; Luke Noon; Deborah Burks; Oscar Yanes; Roger Guimerà; Marta Sales-Pardo

doi:10.1093/bioinformatics/btz207

Bioinformatics. 2019 Oct 15;35(20):4089-4097. doi: 10.1093/bioinformatics/btz207.

Authors

Oriol Senan¹, Antoni Aguilar-Mogas¹, Miriam Navarro²,³, Jordi Capellades²,³, Luke Noon³,⁴, Deborah Burks³,⁴, Oscar Yanes²,³, Roger Guimerà¹,⁵, Marta Sales-Pardo¹

Affiliations

¹ Department of Chemical Engineering, Universitat Rovira i Virgili, Tarragona, Spain.
² Department of Electronic Engineering, Metabolomics Platform, IISPV, Universitat Rovira i Virgili, Tarragona, Spain.
³ CIBER of Diabetes and Associated Metabolic Diseases (CIBERDEM), Madrid, Spain.
⁴ Centro de Investigación Príncipe Felipe, Valencia, Spain.
⁵ ICREA, Barcelona, Spain.

Abstract

Motivation: The analysis of biological samples in untargeted metabolomic studies using LC-MS yields tens of thousands of ion signals. Annotating these features is of the utmost importance for answering questions as fundamental as how many metabolites there are in a given sample.

Results: Here, we introduce CliqueMS, a new algorithm for annotating in-source LC-MS1 data. CliqueMS is based on the similarity between coelution profiles and therefore, as opposed to most methods, allows for the annotation of a single spectrum. Furthermore, CliqueMS improves upon the state of the art in several dimensions: (i) it uses a more discriminatory feature similarity metric; (ii) it treats the similarities between features in a transparent way by means of a simple generative model; (iii) it uses a well-grounded maximum likelihood inference approach to group features; (iv) it uses empirical adduct frequencies to identify the parental mass and (v) it deals more flexibly with the identification of the parental mass by proposing and ranking alternative annotations. We validate our approach with simple mixtures of standards and with real complex biological samples. CliqueMS reduces the thousands of features typically obtained in complex samples to hundreds of metabolites, and it is able to correctly annotate more metabolites and adducts from a single spectrum than available tools.
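The idea of grouping features by coelution similarity can be illustrated with a minimal Python sketch. It is not the CliqueMS implementation: the cosine similarity, the 0.9 threshold, the toy intensity profiles and the function names are all assumptions made for illustration, and the maximum likelihood clique inference described above is replaced here by a much simpler connected-components grouping.

```python
import math
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine similarity between two intensity profiles sampled at the same scans."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty profile is treated as dissimilar to everything
    return dot / (norm_a * norm_b)


def group_features(profiles, threshold=0.9):
    """Group features linked by similarity >= threshold (hypothetical cutoff).

    Simplification: connected components via union-find stand in for the
    maximum likelihood clique inference mentioned in the abstract.
    """
    names = list(profiles)
    parent = {name: name for name in names}

    def find(name):
        # Follow parent links to the root, compressing the path on the way.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    # Link every pair of features whose coelution profiles are similar enough.
    for u, v in combinations(names, 2):
        if cosine_similarity(profiles[u], profiles[v]) >= threshold:
            parent[find(u)] = find(v)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(group) for group in groups.values())


# Toy data (invented): f1 and f2 coelute, f3 elutes later.
profiles = {
    "f1": [0, 10, 50, 100, 50, 10, 0, 0, 0],
    "f2": [0, 5, 26, 49, 24, 6, 0, 0, 0],
    "f3": [0, 0, 0, 0, 10, 50, 100, 50, 10],
}
groups = group_features(profiles)
print(groups)  # [['f1', 'f2'], ['f3']]
```

In this toy case the two coeluting features end up in one group, which would then be the unit within which adducts and a parental mass are sought. A real pipeline would work on peak-picked LC-MS data rather than hand-written lists.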

Availability and implementation: https://CRAN.R-project.org/package=cliqueMS and https://github.com/osenan/cliqueMS.

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Chromatography, Liquid
Ions
Metabolomics
Neural Networks, Computer
Software*
Tandem Mass Spectrometry*

Substances

Ions