Optimization of Assembly Pipeline may Improve the Sequence of the Chloroplast Genome in Quercus spinosa

Sci Rep. 2018 Jun 11;8(1):8906. doi: 10.1038/s41598-018-27298-0.

Abstract

Obtaining the chloroplast (cp) genome sequence is necessary for studying the physiological roles of chloroplasts in plants. However, cp genome sequences are difficult to obtain with traditional sequencing methods because template preparation is complex. Next-generation sequencing technology can now produce large volumes of genome sequence data, so a good pipeline for assembling next-generation sequence reads with an optimized k-mer length is essential for obtaining whole cp genome sequences. Adjusting other assembly parameters is also very important, especially for cp genome assembly. In this study, we developed a pipeline to generate the cp genome of Quercus spinosa. With Quercus rubra as the reference, we achieved a coverage of 97.75% after optimizing the k-mer length and other parameters. The efficiency of the pipeline makes it a useful method for cp genome construction in plants. It also provides a useful basis for analyzing cp genome characteristics and evolution.
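As a rough illustration of how reference coverage can guide the choice of k-mer length, the sketch below computes the percentage of a reference genome covered by aligned contigs and picks the k-mer length with the highest coverage. It is not the pipeline from the study: the function names, the half-open interval representation, and the example numbers are all assumptions made for illustration, and the intervals are assumed to come from aligning each assembly to the reference with whatever aligner is in use.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: half-open [start, end) spans on the reference
# where contigs from one assembly aligned. Coordinates are assumed to lie
# within the reference.
Interval = Tuple[int, int]


def covered_bases(intervals: List[Interval]) -> int:
    """Count reference positions covered by at least one interval."""
    total = 0
    cur_start, cur_end = None, None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            # Gap before this interval: close the previous merged span.
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlapping or adjacent: extend the current merged span.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start
    return total


def coverage_percent(ref_length: int, intervals: List[Interval]) -> float:
    """Percentage of the reference covered by the aligned intervals."""
    if ref_length <= 0:
        raise ValueError("ref_length must be positive")
    return 100.0 * covered_bases(intervals) / ref_length


def best_k(ref_length: int, alignments_by_k: Dict[int, List[Interval]]) -> int:
    """Return the k-mer length whose assembly covers the most reference."""
    return max(
        alignments_by_k,
        key=lambda k: coverage_percent(ref_length, alignments_by_k[k]),
    )


if __name__ == "__main__":
    # Toy data on a 100 bp reference; not results from the study.
    toy = {
        21: [(0, 50)],
        31: [(0, 50), (40, 90)],
        41: [(10, 30)],
    }
    for k, spans in sorted(toy.items()):
        print(k, coverage_percent(100, spans))
    print("best k:", best_k(100, toy))
```

Overlapping alignments are merged before counting so that no reference position is counted twice. In practice, coverage alone may not be the only criterion; contig count or contiguity could also be weighed when comparing k-mer lengths.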

MeSH terms

  • Computational Biology / methods*
  • DNA, Chloroplast / chemistry
  • DNA, Chloroplast / genetics
  • Genome, Chloroplast*
  • High-Throughput Nucleotide Sequencing / methods
  • Quercus / genetics*
  • Sequence Analysis, DNA / methods*

Substances

  • DNA, Chloroplast