We observed that current high-throughput sequencing approaches detected only a fraction of the full size spectrum of insertions, deletions, and copy number variants found in a previously published, Sanger-sequenced human genome. Detection sensitivity was lowest in the 100- to 10,000-bp size range and at DNA repeats, and copy number gains were harder to delineate than losses. We discuss strategies for discovering the full spectrum of genetic variation, which is necessary for disease association studies.
Keywords: copy number variation; genome variation annotation; high-throughput sequencing; insertion/deletion.