Ancestral genetic exchange between members of many important bacterial pathogen groups has resulted in phylogenetic relationships better described as networks than as bifurcating trees. In certain cases, these reticulated phylogenies have resulted in phenotypic and molecular overlap that challenges the construction of practical approaches for species identification in the clinical microbiology laboratory. The Burkholderia cepacia complex (Bcc), a group of betaproteobacterial species responsible for significant morbidity in persons with cystic fibrosis and chronic granulomatous disease, is one such group in which network-structured phylogeny has hampered the development of diagnostic methods for species-level discrimination. Here, we present a phylogeny-informed proteomics approach to facilitate diagnostic classification of pathogen groups with reticulated phylogenies, using Bcc as an example. Starting with a set of more than 800 Bcc and Burkholderia gladioli whole-genome assemblies, we constructed phylogenies with explicit representation of inferred interspecies recombination. Sixteen highly discriminatory peptides were chosen to distinguish B. cepacia, Burkholderia cenocepacia, Burkholderia multivorans, and B. gladioli and were multiplexed into a single, rapid liquid chromatography-tandem mass spectrometry multiple reaction monitoring (LC-MS/MS MRM) assay. Testing of a blinded set of isolates containing these four Burkholderia species demonstrated 50/50 correct automatic negative calls (100% accuracy, with a 95% confidence interval [CI] of 92.9 to 100%) and 70/70 correct automatic species-level positive identifications (100% accuracy, with a 95% CI of 94.9 to 100%) after accounting for a single initial incorrect identification that was due to a preanalytic error and was correctly identified on retesting. The analytic approach described here is applicable to other pathogen groups for which development of diagnostic classification methods is complicated by interspecies recombination.
Keywords: clinical microbiology; computational biology; genomics; mass spectrometry; network phylogeny; proteomics.
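The confidence intervals quoted in the abstract can be checked with a short calculation. The sketch below assumes an exact (Clopper-Pearson) two-sided binomial interval; the abstract does not name the method used, but the reported bounds are consistent with it. When all n of n calls are correct, the upper bound is 1 and the lower bound reduces to (alpha/2)^(1/n). The function name is hypothetical and chosen for illustration only.

```python
def exact_ci_all_correct(n: int, confidence: float = 0.95) -> tuple[float, float]:
    """Two-sided Clopper-Pearson interval for the special case of n/n successes.

    Assumption: the exact binomial method is used. With zero failures the
    lower bound has the closed form (alpha / 2) ** (1 / n) and the upper
    bound is 1.
    """
    if n <= 0:
        raise ValueError("n must be a positive integer")
    alpha = 1.0 - confidence
    lower = (alpha / 2.0) ** (1.0 / n)
    return lower, 1.0


# Counts taken from the abstract: 50/50 negative calls, 70/70 positive calls.
for n in (50, 70):
    lower, upper = exact_ci_all_correct(n)
    print(f"{n}/{n}: {lower:.1%} to {upper:.0%}")
```

Under this assumption the lower bounds round to 92.9% for 50/50 and 94.9% for 70/70, matching the reported values.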