Fast and accurate read alignment for resequencing

John C Mu; Hui Jiang; Amirhossein Kiani; Marghoob Mohiyuddin; Narges Bani Asadi; Wing H Wong

doi:10.1093/bioinformatics/bts450

Bioinformatics. 2012 Sep 15;28(18):2366-73. doi: 10.1093/bioinformatics/bts450. Epub 2012 Jul 18.

Authors

John C Mu¹, Hui Jiang, Amirhossein Kiani, Marghoob Mohiyuddin, Narges Bani Asadi, Wing H Wong

Affiliation

¹ Department of Electrical Engineering, Stanford University, Stanford, CA 94305, USA.

Abstract

Motivation: Next-generation sequence analysis has become an important task in both laboratory and clinical settings. A key stage in the majority of sequence analysis workflows, such as resequencing, is the alignment of genomic reads to a reference genome. The accurate alignment of reads with large indels is a computationally challenging task for researchers.

Results: We introduce SeqAlto, a new algorithm for read alignment. For reads of 100 bp or longer, SeqAlto is up to 10× faster than existing algorithms, while retaining high accuracy and the ability to align reads with large (up to 50 bp) indels. This improvement in efficiency is particularly important for the analysis of future sequencing data, where the number of reads approaches many billions. Furthermore, SeqAlto uses less than 8 GB of memory to align against the human genome. SeqAlto is benchmarked against several existing tools with both real and simulated data.

Availability: Linux and Mac OS X binaries free for academic use are available at http://www.stanford.edu/group/wonglab/seqalto

Contact: whwong@stanford.edu.

Publication types

Research Support, N.I.H., Extramural

MeSH terms

Algorithms
Genome, Human
Genomics
High-Throughput Nucleotide Sequencing*
Humans
INDEL Mutation
Sequence Alignment / methods*
Sequence Analysis, DNA*
Software*
