Characterizing the walnut genome through analyses of BAC end sequences

Plant Mol Biol. 2012 Jan;78(1-2):95-107. doi: 10.1007/s11103-011-9849-y. Epub 2011 Nov 19.

Abstract

Persian walnut (Juglans regia L.) is an economically important tree for its nut crop and timber. To gain insight into the structure and evolution of the walnut genome, we constructed two bacterial artificial chromosome (BAC) libraries, containing a total of 129,024 clones, from in vitro-grown shoots of J. regia cv. Chandler using the HindIII and MboI cloning sites. A total of 48,218 high-quality BAC end sequences (BESs) were generated, with an accumulated sequence length of 31.2 Mb, representing approximately 5.1% of the walnut genome. Analysis of repeat DNA content in BESs revealed that approximately 15.42% of the genome consists of known repetitive DNA, while walnut-unique repetitive DNA identified in this study constitutes 13.5% of the genome. Among the walnut-unique repetitive DNA, Julia SINE and JrTRIM elements represent the first identified walnut short interspersed element (SINE) and terminal-repeat retrotransposon in miniature (TRIM) element, respectively; both types of elements are abundant in the genome. As in other species, these SINEs and TRIM elements could be exploited for developing repeat DNA-based molecular markers in walnut. Simple sequence repeats (SSRs) in BESs were analyzed and found to be more abundant than in expressed sequence tags. The SSR density in the walnut genomic sequence analyzed was also slightly higher than that in poplar and papaya. Sequence analysis of BESs indicated that approximately 11.5% of the walnut genome consists of coding sequence. This study is an initial characterization of the walnut genome and provides the largest genomic resource currently available; as such, it will be a valuable tool in studies aimed at genetically improving walnut.
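
The sequencing figures in the abstract imply a mean BES length and a total genome size that are not stated there. The sketch below derives both as back-of-the-envelope estimates, using only the numbers quoted in the abstract. The function and variable names are illustrative and do not come from the paper. The combined repeat fraction assumes that the known and walnut-unique repeat classes do not overlap, which the abstract does not state explicitly.

```python
# Figures quoted in the abstract.
BES_COUNT = 48_218                # high-quality BAC end sequences
BES_TOTAL_BP = 31.2e6             # accumulated BES length (31.2 Mb)
GENOME_FRACTION = 0.051           # BESs cover ~5.1% of the genome
KNOWN_REPEAT_PCT = 15.42          # known repetitive DNA
WALNUT_UNIQUE_REPEAT_PCT = 13.5   # walnut-unique repetitive DNA


def mean_bes_length(total_bp: float, count: int) -> float:
    """Average length of one BAC end sequence, in base pairs."""
    return total_bp / count


def implied_genome_size_mb(total_bp: float, fraction: float) -> float:
    """Genome size (Mb) implied by the sampled length and its genome share."""
    return total_bp / fraction / 1e6


mean_len = mean_bes_length(BES_TOTAL_BP, BES_COUNT)
genome_mb = implied_genome_size_mb(BES_TOTAL_BP, GENOME_FRACTION)

# Assumption: the two repeat classes are disjoint, so their shares add.
total_repeat_pct = KNOWN_REPEAT_PCT + WALNUT_UNIQUE_REPEAT_PCT

print(f"Mean BES length: {mean_len:.0f} bp")
print(f"Implied genome size: {genome_mb:.0f} Mb")
print(f"Combined repeat share (if disjoint): {total_repeat_pct:.2f}%")
```

These are rough consistency checks on rounded figures, not values reported by the authors.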

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Base Sequence
  • Chromosomes, Artificial, Bacterial / genetics*
  • DNA, Plant / chemistry
  • DNA, Plant / genetics*
  • Expressed Sequence Tags
  • Genetic Markers / genetics
  • Genome, Plant / genetics*
  • Genomic Library
  • Juglans / genetics*
  • Microsatellite Repeats / genetics
  • Molecular Sequence Data
  • Open Reading Frames / genetics
  • Plant Proteins / genetics
  • Retroelements / genetics
  • Sequence Analysis, DNA / methods*
  • Sequence Homology, Nucleic Acid
  • Short Interspersed Nucleotide Elements / genetics

Substances

  • DNA, Plant
  • Genetic Markers
  • Plant Proteins
  • Retroelements