Computing similarity between structural environments of mutagenicity alerts

Suman K Chakravarti; Roustem D Saiakhov

doi:10.1093/mutage/gey032

Mutagenesis. 2019 Mar 6;34(1):55-65. doi: 10.1093/mutage/gey032.

Authors

Suman K Chakravarti¹, Roustem D Saiakhov¹

Affiliation

¹ MultiCASE Inc., Chagrin Blvd, Suite, Beachwood, OH, USA.

PMID: 30346583
DOI: 10.1093/mutage/gey032

Abstract

This article describes a method for generating molecular fingerprints from the structural environments of mutagenicity alerts and for calculating the similarity between them. The approach was used to improve the classification accuracy of alerts and to search for structurally similar analogues of an alerting chemical. It builds fingerprints from molecular fragments in the vicinity of the alerts and automatically accounts for the activating and deactivating/mitigating features of alerts that are needed for accurate predictions. The study also demonstrates the usefulness of transfer learning: a distributed representation of chemical fragments was first trained on millions of unlabelled chemicals and then used to generate fingerprints and run similarity searches on smaller data sets labelled with Ames test outcomes. The distributed fingerprints gave better prediction performance and increased coverage compared with traditional binary fingerprints. The methodology was applied to four common mutagenic functionalities: primary aromatic amine, aromatic nitro, epoxide and alkyl chloride. The effects of several hyperparameters on the prediction accuracy and test coverage of the k-nearest neighbours prediction method are also described, including similarity thresholds, number of neighbours and size of the alert environment.
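The workflow summarised in the abstract (fragment-based fingerprints, a similarity measure, and a thresholded k-nearest neighbours vote) can be illustrated with a minimal Python sketch. The abstract does not state which similarity metrics, pooling scheme or voting rule the authors used, so everything here is an assumption made for illustration: Tanimoto similarity for binary fingerprints, summed fragment embeddings with cosine similarity for distributed fingerprints, similarity-weighted voting, and all function names, fragment keys and default parameter values.

```python
from math import sqrt


def tanimoto(a, b):
    """Tanimoto similarity of two binary fingerprints given as sets of fragment keys."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def distributed_fingerprint(fragments, embeddings):
    """Sum the embedding vectors of the fragments found around an alert.

    Summation is an assumed pooling choice; fragments without an
    embedding are skipped. Returns None if no fragment is known.
    """
    vectors = [embeddings[f] for f in fragments if f in embeddings]
    if not vectors:
        return None
    return [sum(column) for column in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity of two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = sqrt(sum(x * x for x in u)) * sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def knn_predict(query, training, similarity, k=3, threshold=0.5):
    """Similarity-weighted k-nearest neighbours vote.

    training: list of (fingerprint, label), label 1 = Ames positive, 0 = negative.
    Returns (prediction, neighbours); prediction is None when no training
    example reaches the similarity threshold (query is out of coverage).
    """
    scored = [(similarity(query, fp), label) for fp, label in training]
    eligible = [pair for pair in scored if pair[0] >= threshold]
    neighbours = sorted(eligible, key=lambda pair: pair[0], reverse=True)[:k]
    total = sum(score for score, _ in neighbours)
    if total == 0:
        return None, []
    positive = sum(score for score, label in neighbours if label == 1)
    return (1 if positive / total >= 0.5 else 0), neighbours


# Hypothetical fragment keys standing in for alert-environment fragments.
training_set = [
    ({"a", "b", "c"}, 1),
    ({"a", "b", "d"}, 1),
    ({"x", "y"}, 0),
]
prediction, used = knn_predict({"a", "b", "c", "d"}, training_set, tanimoto)
```

In this sketch, raising `threshold` trades coverage for reliability: fewer queries find eligible neighbours, but those that do are compared only with close analogues. That is one way to read the accuracy and coverage trade-off the abstract attributes to the similarity threshold and the number of neighbours.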

MeSH terms

Amines / chemistry*
Amines / toxicity
Epoxy Compounds / chemistry*
Epoxy Compounds / toxicity
Mutagenesis / drug effects
Mutagenicity Tests / methods
Mutagens / chemistry*
Mutagens / toxicity
Nitro Compounds / chemistry*
Nitro Compounds / toxicity

Substances

Amines
Epoxy Compounds
Mutagens
Nitro Compounds