Improved algorithms for parsing ESLTAGs: a grammatical model suitable for RNA pseudoknots

IEEE/ACM Trans Comput Biol Bioinform. 2010 Oct-Dec;7(4):619-27. doi: 10.1109/TCBB.2010.54.

Abstract

Formal grammars have been employed in biology to solve various important problems. In particular, grammars have been used to model and predict RNA structures. Two such grammars are Simple Linear Tree Adjoining Grammars (SLTAGs) and Extended SLTAGs (ESLTAGs). The performance of techniques that employ grammatical formalisms depends critically on the efficiency of the underlying parsing algorithms. In this paper, we present efficient algorithms for parsing SLTAGs and ESLTAGs. Our algorithm for SLTAG parsing takes O(min{m,n⁴}) time and O(min{m,n⁴}) space, where m is the number of entries that will ever be made in the matrix M (the matrix normally used by TAG parsing algorithms). Our algorithm for ESLTAG parsing takes O(min{m,n⁴}) time and O(min{m,n⁴}) space. We show that these algorithms perform better, in practice, than the algorithms of Uemura et al.
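
The O(min{m,n⁴}) space bound suggests storing only the matrix entries that are actually made, rather than allocating every cell of a four-index matrix up front. The following Python sketch illustrates that general idea only; it is not the algorithm of the paper. The class name SparseTable, the helper dense_cells, the 4-tuple key (i, j, k, l), and the use of string labels as cell contents are all assumptions made for illustration.

```python
class SparseTable:
    """Hypothetical sparse stand-in for a four-index parsing matrix.

    Only cells that receive at least one label are stored, so memory
    grows with the number of entries made, not with n**4.
    """

    def __init__(self, n):
        self.n = n          # assumed: size parameter of the index range
        self.cells = {}     # (i, j, k, l) -> set of labels

    def add(self, i, j, k, l, label):
        # Create the cell on first use, then record the label in it.
        self.cells.setdefault((i, j, k, l), set()).add(label)

    def get(self, i, j, k, l):
        # Cells that were never written read back as empty.
        return self.cells.get((i, j, k, l), frozenset())

    def occupied(self):
        # Number of distinct cells that hold at least one label.
        return len(self.cells)


def dense_cells(n):
    # Cell count of a dense matrix indexed by four values in 0..n.
    return (n + 1) ** 4


table = SparseTable(10)
table.add(0, 2, 5, 7, "A")
table.add(0, 2, 5, 7, "B")   # same cell, second label
table.add(1, 3, 4, 6, "A")

print(table.occupied(), "occupied cells versus", dense_cells(10), "dense cells")
```

When few cells are ever filled, the dictionary holds far fewer items than the dense layout would allocate, and in the worst case it holds no more cells than the dense matrix has.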

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms*
  • Base Sequence
  • Models, Molecular
  • Molecular Sequence Data
  • Nucleic Acid Conformation
  • RNA / chemistry*
  • Software*

Substances

  • RNA