Transcriptome analysis based on next-generation sequencing of non-model plants producing specialized metabolites of biotechnological interest

Mei Xiao; Ye Zhang; Xue Chen; Eun-Jeong Lee; Carla J S Barber; Romit Chakrabarty; Isabel Desgagné-Penix; Tegan M Haslam; Yeon-Bok Kim; Enwu Liu; Gillian MacNevin; Sayaka Masada-Atsumi; Darwin W Reed; Jake M Stout; Philipp Zerbe; Yansheng Zhang; Joerg Bohlmann; Patrick S Covello; Vincenzo De Luca; Jonathan E Page; Dae-Kyun Ro; Vincent J J Martin; Peter J Facchini; Christoph W Sensen

doi:10.1016/j.jbiotec.2013.04.004

J Biotechnol. 2013 Jul 10;166(3):122-34. doi: 10.1016/j.jbiotec.2013.04.004. Epub 2013 Apr 16.

Affiliation

Department of Biochemistry and Molecular Biology, University of Calgary, 3330 Hospital Drive NW, Calgary, Alberta T2N 4N1, Canada.

PMID: 23602801
DOI: 10.1016/j.jbiotec.2013.04.004

Abstract

Plants produce a vast array of specialized metabolites, many of which are used as pharmaceuticals, flavors, fragrances, and other high-value fine chemicals. However, most of these compounds occur in non-model plants for which genomic sequence information is not yet available. Producing large amounts of nucleotide sequence data with next-generation technologies is now relatively fast and cost-effective, especially with the latest Roche-454 and Illumina sequencers, which offer enhanced base-calling accuracy. To investigate specialized metabolite biosynthesis in non-model plants, we have established a data-mining framework, employing next-generation sequencing and computational algorithms, to construct and analyze the transcriptomes of 75 non-model plants that produce compounds of interest for biotechnological applications. After sequence assembly, an extensive annotation approach was applied to assign functional information to over 800,000 putative transcripts. The annotation is based on direct searches against public databases, including RefSeq and InterPro. Gene Ontology (GO) and Enzyme Commission (EC) annotations and the associated Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway maps are also collected. As a proof of concept, the selection of biosynthetic gene candidates associated with six specialized metabolic pathways is described. A web-based BLAST server has been established to allow public access to the assembled transcriptome databases for all 75 plant species of the PhytoMetaSyn Project (www.phytometasyn.ca).
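
The annotation step described in the abstract, in which assembled transcripts are searched against public databases, can be illustrated with a minimal sketch. The abstract does not specify the tools, thresholds, or file formats used, so everything here is an assumption: the input is taken to be the standard 12-column BLAST tabular format, the function name `best_hits` and the cutoff `max_evalue` are hypothetical, and the contig and reference identifiers are invented example data.

```python
import csv
import io

# Standard 12-column BLAST tabular layout (assumed input format).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def best_hits(handle, max_evalue=1e-10):
    """Keep, for each query transcript, the highest-bitscore hit that
    passes the e-value cutoff. The cutoff is illustrative only."""
    best = {}
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines and comment lines.
        if not row or row[0].startswith("#"):
            continue
        rec = dict(zip(COLUMNS, row))
        evalue = float(rec["evalue"])
        bitscore = float(rec["bitscore"])
        if evalue > max_evalue:
            continue
        current = best.get(rec["qseqid"])
        if current is None or bitscore > current[2]:
            best[rec["qseqid"]] = (rec["sseqid"], evalue, bitscore)
    return best


# Invented example rows: two hits for contig_1, one weak hit for contig_2.
example = (
    "contig_1\tref_A\t92.5\t200\t15\t0\t1\t200\t5\t204\t1e-80\t290.0\n"
    "contig_1\tref_B\t75.0\t180\t45\t0\t1\t180\t10\t189\t1e-30\t120.0\n"
    "contig_2\tref_C\t40.0\t60\t36\t0\t1\t60\t1\t60\t0.5\t25.0\n"
)
hits = best_hits(io.StringIO(example))
```

In this sketch, the top-scoring reference per transcript would then serve as the source of functional labels; a transcript whose only hits fail the cutoff simply remains unannotated.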

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Algorithms
Biotechnology / methods
Computational Biology*
Data Mining / methods
Databases, Genetic*
Gene Expression Profiling*
High-Throughput Nucleotide Sequencing
Metabolic Networks and Pathways / genetics*
Molecular Sequence Annotation
Phylogeny
Plants / genetics*
Plants / metabolism*
Sequence Alignment
Sequence Analysis
Transcriptome*