The epidemiological surveillance of SARS-CoV-2 by means of whole-genome sequencing has revealed the emergence and co-existence of multiple viral lineages or subtypes throughout the world. Moreover, it has been shown that several subtypes of this virus display particular phenotypes, such as increased transmissibility or reduced susceptibility to neutralizing antibodies, leading to the denomination of Variants of Interest (VOI) or Variants of Concern (VOC). Thus, subtyping of SARS-CoV-2 is a crucial step for the surveillance of this pathogen. Here, we present Covidex, an open-source, alignment-free machine learning subtyping tool. It is a shiny web app that allows an ultra-fast and accurate classification of SARS-CoV-2 genome sequences into the three most used nomenclature systems (GISAID, Nextstrain, Pango lineages). It also categorizes input sequences as VOI or VOC, according to current definitions. The program is cross-platform compatible and it is available via Source-Forge https://sourceforge.net/projects/covidex or via the web application http://covidex.unlu.edu.ar.
Keywords: Machine learning; SARS-CoV-2; Subtyping; VOC; VOI; Web-application.
Copyright © 2022 The Authors. Published by Elsevier B.V. All rights reserved.