Background/objective: Acute myeloid leukemia (AML) is a progressive malignancy of myelogenous blood cells that disrupts the production of normal blood cells. Although several risk factors and genetic factors (AML-related genes) have been investigated, the precise mechanism underlying the development of AML remains unclear. It is therefore crucial to develop an effective computational method that meaningfully characterizes AML genes and accurately predicts novel ones.
Methods: In this study, we integrated gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway annotations as features to characterize AML genes. We then identified an optimal set of features for predicting AML-related genes by using the minimum redundancy maximum relevance (mRMR) algorithm and the dagging metaclassifier.
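For illustration only, the mRMR ranking step can be sketched as follows. This is a minimal sketch, assuming binary annotation features and the difference form of the mRMR criterion (relevance minus mean redundancy). The feature names and toy data are hypothetical, and neither the dagging metaclassifier nor the study's actual implementation is reproduced here.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    px, py = Counter(xs), Counter(ys)
    pxy = Counter(zip(xs, ys))
    # Sum of p(x, y) * log2(p(x, y) / (p(x) * p(y))) over observed pairs.
    return sum(
        (c / n) * math.log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def mrmr_rank(features, labels):
    """Greedy mRMR ranking: relevance to labels minus mean redundancy.

    features: dict mapping a feature name to a list of discrete values,
              one per gene (assumed layout, not taken from the study).
    labels:   list of class labels, one per gene.
    """
    relevance = {f: mutual_information(v, labels) for f, v in features.items()}
    remaining = sorted(features)  # deterministic order for tie-breaking
    selected = []
    while remaining:
        def score(f):
            if not selected:
                return relevance[f]
            redundancy = sum(
                mutual_information(features[f], features[s]) for s in selected
            ) / len(selected)
            return relevance[f] - redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy data: 1 = gene carries the annotation, 0 = it does not.
labels = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = known AML gene (toy labels)
features = {
    "GO_term_A": [1, 1, 1, 0, 0, 0, 0, 0],
    "GO_term_B_duplicate": [1, 1, 1, 0, 0, 0, 0, 0],  # copy of GO_term_A
    "KEGG_pathway_C": [1, 0, 0, 1, 0, 0, 0, 0],
}
ranking = mrmr_rank(features, labels)
```

In this toy case the duplicated feature is ranked last even though it is as relevant as the first one, because it adds no information beyond what is already selected. That is the behavior the redundancy term is meant to produce.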
Results: We obtained 26 optimal GO terms that characterized AML genes well. We then predicted 464 novel candidate genes, providing clinical researchers with additional candidates and useful insights for further analysis of AML.
Discussion: An in-depth feature analysis indicated that the results are consistent with previous knowledge. We developed a systematic method for investigating the possible underlying mechanism of AML by analyzing AML-related genes. The method can identify the types of features that best support a meaningful interpretation of AML, and it can accurately predict additional AML genes for further clinical research.
Keywords: AML-related genes; Gene ontology; KEGG pathway; Prediction of AML genes.