Establishing the timeframe in which a particular virus was circulating in a population could be useful in several areas of biomedical research, including microbiology and legal medicine. Using simulations, we demonstrate that the circulation timeframe of an unknown SARS-CoV-2 genome in a population (hereafter, estimated time of a queried genome [QG]; tE-QG) can be readily predicted using a phylogenetic model based on a robust reference genome database of the virus and information on the sampling dates of its genomes. We evaluate several phylogeny-based approaches, including modeling the evolutionary (substitution) rate of the SARS-CoV-2 genome (~10⁻³ substitutions/nucleotide/year) and the mutational (substitution) differences separating the QGs from the reference genomes (RGs) in the database. Owing to the mutational characteristics of the virus, the present Viral Molecular Clock Dating (VMCD) method covers timeframes extending backwards from about one month in the past. The method yields very low errors for the tE-QG estimates and narrow tE-QG intervals, both ranging from a few days to a few weeks regardless of the mathematical model used. The SARS-CoV-2 model represents a proof of concept that can be extrapolated to any other microorganism, provided that a robust genome sequence database is available. Besides obvious applications in epidemiological and microbiological investigations, there are several contexts in forensic casework where estimating tE-QG could be useful, including estimation of the postmortem interval (PMI) and the dating of samples stored in hospital settings.
Keywords: SARS-CoV-2; forensic genetics; legal medicine; molecular clock; phylogeny; postmortem interval.
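
To make the molecular clock reasoning in the abstract concrete, the following minimal sketch converts a substitution rate into an expected waiting time per substitution and uses it to shift a reference sampling date. It is an illustration under a simple strict-clock assumption, not the VMCD method itself. The genome length (29,903 nt, the Wuhan-Hu-1 reference) is an assumption not stated in the abstract, and the function names `expected_days_per_substitution` and `estimate_date` are hypothetical.

```python
from datetime import date, timedelta

# Substitution rate quoted in the abstract (substitutions/nucleotide/year).
RATE_PER_SITE_PER_YEAR = 1e-3
# Assumption: length of the Wuhan-Hu-1 reference genome, in nucleotides.
GENOME_LENGTH = 29_903


def expected_days_per_substitution(rate=RATE_PER_SITE_PER_YEAR,
                                   genome_length=GENOME_LENGTH):
    """Average number of days between substitutions, genome-wide (strict clock)."""
    substitutions_per_year = rate * genome_length
    return 365.25 / substitutions_per_year


def estimate_date(reference_date, extra_substitutions,
                  rate=RATE_PER_SITE_PER_YEAR, genome_length=GENOME_LENGTH):
    """Shift a reference genome's sampling date by the time expected to
    accumulate `extra_substitutions` (negative values shift backwards)."""
    days = extra_substitutions * expected_days_per_substitution(rate, genome_length)
    return reference_date + timedelta(days=round(days))


if __name__ == "__main__":
    print(f"{expected_days_per_substitution():.1f} days per substitution")
    # Hypothetical QG carrying 3 more substitutions than an RG sampled on 2021-01-01.
    print(estimate_date(date(2021, 1, 1), 3))
```

Under these assumptions the genome accumulates roughly one substitution every 12 days, which is consistent with the abstract's statement that errors and intervals range from a few days to a few weeks.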