Selection of a phylogenetically informative region of the norovirus genome for outbreak linkage

Virus Genes. 2012 Feb;44(1):8-18. doi: 10.1007/s11262-011-0673-x. Epub 2011 Sep 30.

Abstract

The recognition of a common-source norovirus outbreak is supported by finding identical norovirus sequences in patients. Norovirus sequencing has been established in many (national) public health laboratories and academic centers, but these often sequence partial and differing regions of the genome. Agreement on a target sequence of sufficient diversity to resolve links between outbreaks is therefore crucial. Although harmonization of laboratory methods is one of the keystone activities of networks that aim to identify common-source norovirus outbreaks, this has proven difficult to accomplish, particularly in the international context. Here, we aimed to provide a bioinformatic method for identifying the genomic region that is informative for common-source norovirus outbreaks. A data set of 502 unique full-length capsid gene sequences available from the public domain, combined with epidemiological data including linkage information, was used to build over 3,000 maximum likelihood (ML) trees for different sequence lengths and regions. All ML trees were evaluated for robustness and specificity of clustering of known linked norovirus outbreaks against the background diversity of strains. Large differences were seen in the robustness of commonly used PCR targets for cluster detection. The capsid gene region spanning nucleotides 900-1,400 was identified as the region that best substitutes for the full-length capsid region. The reliability of this approach depends on the quality of the background data set, and we recommend periodic reassessment of this growing data set. The approach may be applicable to multiple sequence-based data sets of other pathogens.
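The window-scanning idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract reports ML trees evaluated for robustness and specificity, whereas this sketch substitutes a much cruder proxy (linked sequences identical within the window, and no background sequence sharing that window sequence). The function names, the toy alignment, the window size, and the step are all hypothetical and chosen only for illustration.

```python
from typing import Dict, Iterator, List, Tuple


def windows(length: int, sizes: List[int], step: int) -> Iterator[Tuple[int, int]]:
    """Yield 0-based, half-open (start, end) windows of each size along an alignment."""
    for size in sizes:
        for start in range(0, length - size + 1, step):
            yield start, start + size


def slice_alignment(aln: Dict[str, str], start: int, end: int) -> Dict[str, str]:
    """Cut the same column range out of every aligned sequence."""
    return {name: seq[start:end] for name, seq in aln.items()}


def linked_cluster_is_specific(sub: Dict[str, str], linked: List[str]) -> bool:
    """Crude stand-in for tree-based evaluation (an assumption, not the paper's method):
    the linked sequences must be identical within the window, and no background
    sequence may carry that same window sequence."""
    linked_seqs = {sub[name] for name in linked}
    if len(linked_seqs) != 1:
        return False
    shared = next(iter(linked_seqs))
    return all(seq != shared for name, seq in sub.items() if name not in linked)


# Toy aligned sequences (hypothetical data, far shorter than a capsid gene).
alignment = {
    "outbreak_A1": "ACGTACGTAC",
    "outbreak_A2": "ACGTACGTAC",
    "background1": "ACGTTCGTAC",
    "background2": "ACGAACGTAA",
}
linked_names = ["outbreak_A1", "outbreak_A2"]

# Scan windows of 4 columns in steps of 2 and keep those that separate
# the linked outbreak from the background strains.
informative = [
    (start, end)
    for start, end in windows(len(alignment["outbreak_A1"]), sizes=[4], step=2)
    if linked_cluster_is_specific(slice_alignment(alignment, start, end), linked_names)
]
print(informative)
```

In a real analysis, each sub-alignment would instead be passed to a maximum likelihood tree builder, and the resulting tree would be scored on how well known linked outbreaks cluster against the background strains.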

Publication types

  • Evaluation Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Caliciviridae Infections / epidemiology
  • Caliciviridae Infections / virology*
  • Capsid Proteins / genetics
  • Computational Biology / methods*
  • Disease Outbreaks
  • Genetic Linkage*
  • Genome, Viral*
  • Genotype
  • Humans
  • Molecular Sequence Data
  • Netherlands / epidemiology
  • Norovirus / classification*
  • Norovirus / genetics*
  • Norovirus / isolation & purification
  • Phylogeny*
  • United States / epidemiology

Substances

  • Capsid Proteins