Space-conserving optimal DNA-protein alignment

Pang Ko; Mahesh Narayanan; Anantharaman Kalyanaraman; Srinivas Aluru

doi:10.1109/csb.2004.1332420

Space-conserving optimal DNA-protein alignment

Proc IEEE Comput Syst Bioinform Conf. 2004:80-8. doi: 10.1109/csb.2004.1332420.

Authors

Pang Ko¹, Mahesh Narayanan, Anantharaman Kalyanaraman, Srinivas Aluru

Affiliation

¹ Department of Electrical and Computer Engineering, Iowa State University, USA. kopang@iastate.edu.

PMID: 16448002
DOI: 10.1109/csb.2004.1332420

Abstract

DNA-protein alignment algorithms can be used to discover coding sequences in a genomic sequence, if the corresponding protein derivatives are known. They can also be used to identify potential coding sequences of a newly sequenced genome, by using proteins from related species. Previously known algorithms either solve a simplified formulation, or sacrifice optimality to achieve practical implementation. In this paper, we present a comprehensive formulation of the DNA-protein alignment problem, and an algorithm to compute the optimal alignment in O(mn) time using only four tables of size (m + 1) x (n + 1), where m and n are the lengths of the DNA and protein sequences, respectively. We also developed a Protein and DNA Alignment program PanDA that implements the proposed solution. Experimental results indicate that our algorithm produces high quality alignments.

Publication types

Evaluation Study
Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

Algorithms*
Base Sequence
DNA / genetics*
Data Compression / methods
Molecular Sequence Data
Proteins / genetics*
Sequence Alignment / methods*
Sequence Analysis, DNA / methods*
Sequence Analysis, Protein / methods*

Substances

Proteins
DNA