CRISPRlnc: a machine learning method for lncRNA-specific single-guide RNA design of CRISPR/Cas9 system

Brief Bioinform. 2024 Jan 22;25(2):bbae066. doi: 10.1093/bib/bbae066.

Abstract

CRISPR/Cas9 is a promising RNA-guided genome editing technology that consists of a Cas9 nuclease and a single-guide RNA (sgRNA). A number of sgRNA prediction tools have been developed so far. However, they were usually designed for protein-coding genes, without considering that long non-coding RNA (lncRNA) genes may have different characteristics. In this study, we first evaluated the performance of a series of known sgRNA design tools on both coding and non-coding datasets. We then analyzed the factors underlying their varied performance on lncRNA-specific sgRNAs, including nucleic acid sequence, genomic location and editing mechanism preference. Furthermore, we introduce a support vector machine-based machine learning algorithm named CRISPRlnc, which models both the CRISPR knock-out (CRISPRko) and CRISPR inhibition (CRISPRi) mechanisms to predict the on-target activity of sgRNAs. CRISPRlnc combines paired-sgRNA design and off-target analysis to provide one-stop design of CRISPR/Cas9 sgRNAs for non-coding genes. Performance comparison on multiple datasets showed that CRISPRlnc was far superior to existing methods for both the CRISPRko and CRISPRi mechanisms in lncRNA-specific sgRNA design. To make CRISPRlnc widely accessible, we developed a web server (http://predict.crisprlnc.cc) and made the tool available for download on GitHub.
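As a rough illustration of the kind of input an SVM-based sgRNA predictor works with, the sketch below enumerates candidate target sites in a sequence and turns each into a numeric feature vector. It is not the CRISPRlnc implementation: the abstract does not specify the feature set, so the one-hot encoding plus GC content used here, the 20-nt guide length with an NGG PAM (the usual convention for SpCas9), and all function names are assumptions chosen for illustration.

```python
import re

# Translation table for complementing DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

# One-hot codes per base (illustrative encoding, not taken from the paper).
_ONE_HOT = {
    "A": [1, 0, 0, 0],
    "C": [0, 1, 0, 0],
    "G": [0, 0, 1, 0],
    "T": [0, 0, 0, 1],
}


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_candidates(seq, guide_len=20):
    """List (offset, strand, protospacer) for sites followed by an NGG PAM.

    Offsets on the '-' strand are relative to the reverse-complemented
    sequence, not to the original coordinates.
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})[ACGT]GG)" % guide_len)
    hits = []
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for match in pattern.finditer(strand_seq):
            hits.append((match.start(), strand, match.group(1)))
    return hits


def gc_content(guide):
    """Fraction of G and C bases in the guide."""
    return (guide.count("G") + guide.count("C")) / len(guide)


def featurize(guide):
    """Concatenate per-position one-hot codes and append GC content."""
    vector = []
    for base in guide:
        vector.extend(_ONE_HOT[base])
    vector.append(gc_content(guide))
    return vector


if __name__ == "__main__":
    # Toy sequence: one 20-nt protospacer followed by the PAM "AGG".
    demo = "ACGTACGTACGTACGTACGT" + "AGG"
    for offset, strand, guide in find_candidates(demo):
        print(offset, strand, guide, len(featurize(guide)))
```

Vectors like these, paired with measured activity labels, could then be passed to an SVM classifier (for example, `sklearn.svm.SVC`), with separate models trained for CRISPRko and CRISPRi data. That split is one plausible reading of the abstract, not a confirmed design detail.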

Keywords: CRISPR/Cas9; lncRNA; machine learning; sgRNA.

MeSH terms

  • CRISPR-Cas Systems
  • Gene Editing
  • Machine Learning
  • RNA, Guide, CRISPR-Cas Systems*
  • RNA, Long Noncoding* / genetics

Substances

  • RNA, Guide, CRISPR-Cas Systems
  • RNA, Long Noncoding