Several 3D-QSAR models were built from 196 inhibitors of the hepatitis C virus (HCV) NS5A protein. EC90 bioactivity values against three NS5A variants, the wild type (GT1a) and two mutants (GT1a Y93H and GT1a L31V), were collected to form three datasets. The programs OMEGA and ROCS were used to generate conformations and to align the molecules of each dataset, respectively. Each dataset was randomly divided into a training set and a test set three times to reduce the chance effects of a single random split. QSAR models were computed by comparative molecular field analysis (CoMFA) and comparative molecular similarity indices analysis (CoMSIA). For the datasets GT1a, GT1a Y93H, and GT1a L31V, the best models (CoMFA-INDX, CoMSIA-SEHA, and CoMSIA-SEHA) gave r² values of 0.682 ± 0.033, 0.779 ± 0.036, and 0.782 ± 0.022 on the test sets, respectively. From the contour maps of the three best models, we summarized the favourable and unfavourable substituents on the tetracyclic core, the Z group, the proline group, and the valine group of the inhibitors. We hypothesize that the mutations alter the electrostatic surface of the wild-type active pocket. In addition, extended-connectivity fingerprint (ECFP) analysis was used to identify important substructures, which gave an intuitive interpretation of the results from the QSAR models.
Keywords: CoMFA; CoMSIA; NS5A protein inhibitor; QSAR; extended-connectivity fingerprint (ECFP) analysis; hepatitis C virus.
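The repeated random splitting and the "mean ± standard deviation" reporting of test-set r² described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names, the 25% test fraction, the seed, and the use of the coefficient of determination (1 - SS_res/SS_tot) as the r² definition are all assumptions made for illustration.

```python
import random
import statistics


def random_splits(n_items, test_fraction=0.25, n_repeats=3, seed=0):
    """Return n_repeats independent (train, test) index partitions.

    test_fraction and seed are hypothetical defaults; the abstract only
    states that each dataset was split randomly three times.
    """
    rng = random.Random(seed)
    n_test = max(1, round(n_items * test_fraction))
    splits = []
    for _ in range(n_repeats):
        indices = list(range(n_items))
        rng.shuffle(indices)
        # First n_test shuffled indices form the test set, the rest train.
        splits.append((sorted(indices[n_test:]), sorted(indices[:n_test])))
    return splits


def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = statistics.fmean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def summarize(scores):
    """Mean and sample standard deviation of per-split scores."""
    return statistics.fmean(scores), statistics.stdev(scores)
```

In use, a model would be fitted on each training partition, `r_squared` evaluated on the matching test partition, and the three scores passed to `summarize` to obtain a value of the form "mean ± SD".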