Index hopping on the Illumina HiseqX platform and its consequences for ancient DNA studies

Tom van der Valk; Francesco Vezzi; Mattias Ormestad; Love Dalén; Katerina Guschanski

doi:10.1111/1755-0998.13009

Mol Ecol Resour. 2020 Sep;20(5):1171-1181. doi: 10.1111/1755-0998.13009. Epub 2019 May 5.

Authors

Tom van der Valk¹, Francesco Vezzi², Mattias Ormestad², Love Dalén³, Katerina Guschanski¹

Affiliations

¹ Animal Ecology, Department of Ecology and Genetics, Evolutionary Biology Centre, Uppsala University, Uppsala, Sweden.
² Science for Life Laboratory, Solna, Sweden.
³ Department of Bioinformatics and Genetics, Swedish Museum of Natural History, Stockholm, Sweden.

PMID: 30848092
DOI: 10.1111/1755-0998.13009

Abstract

The high-throughput capacities of the Illumina sequencing platforms and the possibility to label samples individually have encouraged wide use of sample multiplexing. However, this practice results in read misassignment (usually <1%) across samples sequenced on the same lane. Alarmingly high rates of read misassignment of up to 10% were reported for Illumina sequencing machines with exclusion amplification chemistry. This may make use of these platforms prohibitive, particularly in studies that rely on low-quantity and low-quality samples, such as historical and archaeological specimens. Here, we use barcodes, short sequences that are ligated to both ends of the DNA insert, to directly quantify the rate of index hopping in 100-year-old museum-preserved gorilla (Gorilla beringei) samples. Correcting for multiple sources of noise, we identify on average 0.470% of reads containing a hopped index. We show that sample-specific quantity of misassigned reads depends on the number of reads that any given sample contributes to the total sequencing pool, so that samples with few sequenced reads receive the greatest proportion of misassigned reads. This particularly affects ancient DNA samples, as these frequently differ in their DNA quantity and endogenous content. Through simulations we show that even low rates of index hopping, as reported here, can lead to biases in ancient DNA studies when multiplexing samples with vastly different quantities of endogenous material.
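
The dependence of the misassigned proportion on a sample's share of the pool can be illustrated with a small calculation. The sketch below is not the authors' simulation; it is a minimal toy model that assumes every read hops with the same probability (0.47%, the average rate reported above) and that a hopped read is equally likely to land in any of the other samples on the lane. The function name `misassigned_fraction` and the example read counts are hypothetical.

```python
def misassigned_fraction(read_counts, hop_rate=0.0047):
    """Expected fraction of each sample's observed reads that came from other samples.

    Toy model (an assumption, not the published method): every read hops with
    probability `hop_rate`, and a hopped read is reassigned uniformly at random
    to one of the other samples in the pool.
    """
    k = len(read_counts)
    if k < 2:
        raise ValueError("need at least two multiplexed samples")
    total = sum(read_counts)
    fractions = []
    for n in read_counts:
        kept = n * (1.0 - hop_rate)                  # own reads that keep their index
        received = hop_rate * (total - n) / (k - 1)  # reads hopping in from other samples
        observed = kept + received
        fractions.append(received / observed if observed > 0 else 0.0)
    return fractions


# Hypothetical lane with three libraries of very different sizes.
pool = [50_000_000, 5_000_000, 50_000]
for n, f in zip(pool, misassigned_fraction(pool)):
    print(f"{n:>12,d} reads contributed: {f:.2%} of observed reads misassigned")
```

Under these assumptions, libraries of equal size each carry a misassigned fraction equal to the hop rate, whereas a library contributing far fewer reads than its lane-mates ends up with a much larger share of foreign reads, in line with the pattern described in the abstract.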

Keywords: ancient DNA; index switching; multiplexing; museum specimens; next-generation sequencing; read misassignment.

MeSH terms

Animals
DNA
DNA Barcoding, Taxonomic
DNA, Ancient*
Gorilla gorilla / genetics
High-Throughput Nucleotide Sequencing* / methods
Sequence Analysis, DNA* / methods

Substances

DNA, Ancient
DNA
