In prokaryotes, Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) and CRISPR-associated protein (Cas) systems constitute adaptive immune systems against mobile genetic elements (MGEs). Here, we incorporate the Markov cluster algorithm (MCL) into the method of Makarova et al. in order to select a more reasonable profile. Additionally, our new Maximum Continuous Cas Subcluster (MCCS) method helps identify tightly clustered loci. Comparison with two other commonly used programs shows that our method identifies Cas proteins with higher accuracy and a lower Additional Prediction Rate (APR). Moreover, we developed a web-based server, CasLocusAnno (http://cefg.uestc.cn/CasLocusAnno), capable of annotating Cas proteins, cas loci and their (sub)types in less than ~28 s after submission of the whole proteome sequence. Its standalone version can be downloaded at https://github.com/RiversDong/CasLocusAnno.
Keywords: (sub)type annotation; Cas protein annotation; cas locus annotation.
© 2019 Federation of European Biochemical Societies.