Errors in the interpretation of copy number variations due to the use of public databases as a reference

Cancer Genet. 2014 Apr;207(4):164-7. doi: 10.1016/j.cancergen.2014.03.001. Epub 2014 Mar 14.

Abstract

The identification of new cryptic deletions and duplications can be used to improve prognostic classification in cancer. To obtain accurate results, it is necessary to discriminate between somatic alterations in the tumor cell and germline polymorphisms. For this purpose, copy number variation (CNV) public databases have been used as a reference. Nevertheless, the use of these databases may lead to erroneous results. Our main goal was to explore the limitations of the use of CNV databases, such as the Database of Genomic Variants (DGV), as the reference. To that end, we used pediatric acute lymphoblastic leukemia (ALL) as a model. We analyzed the genome-wide copy number profile of 23 ALL patients and conducted a comparison of the results obtained using the DGV with those obtained using the normal sample from the patient as the reference. Using only the DGV, 19% of alterations and 41% of polymorphisms were erroneously catalogued. Our results support the hypothesis that with the use of databases such as the DGV as the reference, a high percentage of the variations can be erroneously classified.
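The comparison described in the abstract can be illustrated with a minimal sketch. The abstract does not give the calling platform, the overlap criterion, or any coordinates, so everything below is assumed for illustration: the `Cnv` record, the 50% overlap threshold, the function names, and all example intervals are hypothetical. Each tumor call is labelled germline if it overlaps an entry in the reference set, otherwise somatic. The labels from a public database are then scored against the labels from the patient's matched normal sample.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cnv:
    chrom: str
    start: int
    end: int
    kind: str  # "loss" or "gain"


def overlaps(call, ref, min_fraction=0.5):
    """True if ref has the same chromosome and type as call and covers
    at least min_fraction of it (the threshold is an assumption)."""
    if call.chrom != ref.chrom or call.kind != ref.kind:
        return False
    shared = min(call.end, ref.end) - max(call.start, ref.start)
    return shared > 0 and shared / (call.end - call.start) >= min_fraction


def classify(tumor_calls, reference_calls):
    """Label each tumor call germline if it matches the reference, else somatic."""
    return {
        c: "germline" if any(overlaps(c, r) for r in reference_calls) else "somatic"
        for c in tumor_calls
    }


def discordance(tumor_calls, database, matched_normal):
    """Fractions of somatic and of germline calls (as defined by the matched
    normal) that receive the wrong label when only the database is used."""
    by_db = classify(tumor_calls, database)
    truth = classify(tumor_calls, matched_normal)
    somatic = [c for c in tumor_calls if truth[c] == "somatic"]
    germline = [c for c in tumor_calls if truth[c] == "germline"]
    somatic_wrong = sum(by_db[c] != "somatic" for c in somatic)
    germline_wrong = sum(by_db[c] != "germline" for c in germline)
    return (
        somatic_wrong / len(somatic) if somatic else 0.0,
        germline_wrong / len(germline) if germline else 0.0,
    )


# Hypothetical calls, invented for this sketch (not data from the study).
tumor = [
    Cnv("chr9", 21_000_000, 22_000_000, "loss"),   # somatic, absent from database
    Cnv("chr12", 11_000_000, 11_500_000, "loss"),  # somatic, but overlaps a database entry
    Cnv("chr1", 1_000_000, 1_200_000, "gain"),     # germline, present in database
    Cnv("chr7", 5_000_000, 5_100_000, "gain"),     # germline, absent from database
]
database = [
    Cnv("chr12", 11_000_000, 11_600_000, "loss"),
    Cnv("chr1", 900_000, 1_300_000, "gain"),
]
matched_normal = [
    Cnv("chr1", 1_000_000, 1_200_000, "gain"),
    Cnv("chr7", 5_000_000, 5_100_000, "gain"),
]

somatic_error, germline_error = discordance(tumor, database, matched_normal)
print(somatic_error, germline_error)
```

In this toy input, both kinds of error appear: a somatic loss that coincides with a database entry is labelled a polymorphism, and a private germline gain missing from the database is labelled a somatic alteration. The matched normal sample resolves both cases.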

Keywords: Acute lymphoblastic leukemia; DGV; deletions; duplications; germline sample.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Child
  • DNA Copy Number Variations*
  • Databases, Genetic / standards*
  • Databases, Genetic / statistics & numerical data*
  • Gene Dosage
  • Humans
  • Mutation
  • Polymorphism, Genetic
  • Precursor Cell Lymphoblastic Leukemia-Lymphoma / diagnosis
  • Precursor Cell Lymphoblastic Leukemia-Lymphoma / genetics*
  • Prognosis
  • Reference Standards
  • Reproducibility of Results