Unlocking HDR-mediated nucleotide editing by identifying high-efficiency target sites using machine learning

Sci Rep. 2019 Feb 26;9(1):2788. doi: 10.1038/s41598-019-39142-0.

Abstract

Editing individual nucleotides is crucial for validating genomic disease associations. It is currently hampered because CRISPR-Cas-mediated "base editing" is limited to certain nucleotide changes and is only achievable within a small window around CRISPR-Cas target sites. The more versatile alternative, homology-directed repair (HDR), has a 3-fold lower efficiency, and its known optimization factors are largely immutable in experiments. Here, we investigated the variable efficiency-governing factors on a novel mouse dataset using machine learning. We found the sequence composition of the single-stranded oligodeoxynucleotide (ssODN), i.e. the repair template, to be a governing factor. Furthermore, different regions of the ssODN have variable influence, which reflects the underlying mechanism of the repair process. Our model improves HDR efficiency by 83% compared to traditionally chosen targets. Using our findings, we developed CUNE (Computational Universal Nucleotide Editor), which enables users to identify and design the optimal targeting strategy using traditional base editing or, for the first time, HDR-mediated nucleotide changes.
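
As an illustration of what "sequence composition of different ssODN regions" could look like as machine-learning input, the following is a minimal sketch only. It is not the authors' pipeline or the CUNE implementation: the function names, the equal-length split into three regions, and the choice of single-nucleotide fractions as features are all assumptions made for this example.

```python
from collections import Counter

NUCLEOTIDES = "ACGT"


def split_regions(seq, n_regions):
    """Split seq into n_regions contiguous chunks of near-equal length.

    Hypothetical helper: the region boundaries here are arbitrary and are
    not the regions analysed in the paper.
    """
    if n_regions < 1 or n_regions > len(seq):
        raise ValueError("n_regions must be between 1 and len(seq)")
    base, extra = divmod(len(seq), n_regions)
    regions, start = [], 0
    for i in range(n_regions):
        # The first `extra` regions receive one additional nucleotide.
        end = start + base + (1 if i < extra else 0)
        regions.append(seq[start:end])
        start = end
    return regions


def composition_features(ssodn, n_regions=3):
    """Return per-region nucleotide fractions as a flat feature dict."""
    seq = ssodn.upper()
    features = {}
    for idx, region in enumerate(split_regions(seq, n_regions)):
        counts = Counter(region)
        for nt in NUCLEOTIDES:
            features[f"region{idx}_{nt}"] = counts[nt] / len(region)
    return features


if __name__ == "__main__":
    # Toy sequence, not a real repair template.
    for name, value in composition_features("AAAACCCCGGGG").items():
        print(name, value)
```

A feature dictionary of this kind could then be passed, one row per candidate target, to any standard classifier or regressor trained against measured HDR efficiency; which learner and which features actually perform well is an empirical question that this sketch does not address.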

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • CRISPR-Cas Systems / genetics
  • DNA Breaks, Double-Stranded
  • DNA Repair*
  • Gene Editing*
  • Machine Learning*
  • Mice
  • Mice, Inbred C57BL
  • Mutation
  • Oligodeoxyribonucleotides / metabolism
  • RNA, Guide, CRISPR-Cas Systems / metabolism

Substances

  • Oligodeoxyribonucleotides
  • RNA, Guide, CRISPR-Cas Systems