Genomic prediction in Brassica napus: evaluating the benefit of imputed whole-genome sequencing data

Genome. 2024 Jul 1;67(7):210-222. doi: 10.1139/gen-2023-0126. Epub 2024 May 6.

Abstract

Advances in sequencing technology allow whole plant genomes to be sequenced with high quality. Combining genotypic and phenotypic data in genomic prediction helps breeders to select crossing partners in partially phenotyped populations. In plant breeding programs, the cost of sequencing entire breeding populations still exceeds available genotyping budgets. Hence, genotyping still relies mainly on single nucleotide polymorphism (SNP) arrays; however, arrays cannot capture the full genome-wide and population-wide diversity. A compromise is to genotype the entire population with an SNP array and a subset of the population with whole-genome sequencing. Both datasets can then be used to impute markers from whole-genome sequencing onto the entire population. Here, we evaluate whether imputation of whole-genome sequencing data enhances genomic predictions, using data from a nested association mapping population of rapeseed (Brassica napus). Employing two cross-validation schemes that mimic the prediction of close and distant relatives, we show that imputed marker data do not significantly improve prediction accuracy, likely due to redundancy in relationship estimates and imputation errors. In simulation studies, only small improvements were observed, further corroborating these findings. We conclude that SNP arrays already capture, through relationship and linkage disequilibrium, the information that imputation would add.
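
The comparison described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not name the prediction model, so GBLUP with a VanRaden-style genomic relationship matrix is assumed here, and all function names, parameter values, and the toy genotype simulator are hypothetical. The "dense" marker set stands in for whole-genome sequencing markers imputed without error, and the "array" set is every tenth marker. Cross-validation uses random folds only, so it does not reproduce the close-relative and distant-relative schemes of the study.

```python
import numpy as np

def simulate_genotypes(n, m, r, rng):
    """Toy 0/1/2 genotypes with LD: along each haplotype, an allele is copied
    from the previous marker unless a switch (probability r) resamples it."""
    hap = np.empty((2 * n, m), dtype=float)
    hap[:, 0] = rng.random(2 * n) < 0.5
    for j in range(1, m):
        switch = rng.random(2 * n) < r
        fresh = rng.random(2 * n) < 0.5
        hap[:, j] = np.where(switch, fresh, hap[:, j - 1])
    return hap[:n] + hap[n:]  # pair haplotypes into diploid individuals

def vanraden_grm(M):
    """Genomic relationship matrix from an (individuals x markers) 0/1/2 matrix."""
    p = M.mean(axis=0) / 2.0
    keep = (p > 0) & (p < 1)  # drop monomorphic markers
    Z = M[:, keep] - 2.0 * p[keep]
    return Z @ Z.T / (2.0 * np.sum(p[keep] * (1.0 - p[keep])))

def gblup_predict(G, y, train, test, h2):
    """Predict test phenotypes from G, with heritability h2 assumed known."""
    lam = (1.0 - h2) / h2
    mu = y[train].mean()
    G_tt = G[np.ix_(train, train)]
    alpha = np.linalg.solve(G_tt + lam * np.eye(len(train)), y[train] - mu)
    return mu + G[np.ix_(test, train)] @ alpha

def cv_accuracy(M, y, k=5, h2=0.5, seed=1):
    """Correlation between observed and k-fold cross-validated predictions."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    G = vanraden_grm(M)
    pred = np.empty(len(y))
    for test in folds:
        train = np.setdiff1d(np.arange(len(y)), test)
        pred[test] = gblup_predict(G, y, train, test, h2)
    return np.corrcoef(pred, y)[0, 1]

def compare(n=200, m=1000, step=10, n_qtl=40, h2=0.5, seed=0):
    """Prediction accuracy with array-like markers versus all (dense) markers."""
    rng = np.random.default_rng(seed)
    dense = simulate_genotypes(n, m, r=0.05, rng=rng)
    qtl = rng.choice(m, size=n_qtl, replace=False)  # hypothetical causal loci
    g = dense[:, qtl] @ rng.normal(size=n_qtl)      # true genetic values
    e = rng.normal(scale=np.sqrt(g.var() * (1 - h2) / h2), size=n)
    y = g + e
    return {
        "array": cv_accuracy(dense[:, ::step], y, h2=h2),
        "dense": cv_accuracy(dense, y, h2=h2),
    }

if __name__ == "__main__":
    print(compare())
```

Because both marker sets enter the model only through the relationship matrix, the two accuracies differ only as far as the two matrices differ. The outcome of this toy run depends on the simulation settings and should not be read as a reproduction of the reported results.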

Keywords: SNP markers; genomic prediction; imputation; whole-genome sequencing.

MeSH terms

  • Brassica napus* / genetics
  • Genome, Plant*
  • Genomics / methods
  • Genotype
  • Linkage Disequilibrium
  • Plant Breeding / methods
  • Polymorphism, Single Nucleotide*
  • Whole Genome Sequencing* / methods