Integrating molecular descriptors for enhanced prediction: Shedding light on the potential of pH to model hydrated electron reaction rates for organic compounds

Chemosphere. 2024 Feb:349:140984. doi: 10.1016/j.chemosphere.2023.140984. Epub 2023 Dec 18.

Abstract

The hydrated electron reaction rate constant (ke-aq) is an important parameter for determining reductive degradation efficiency and for mitigating the ecological risk of organic compounds (OCs). However, both the speciation of OCs and the concentration of hydrated electrons (e-aq) in water vary with pH, which complicates OC fate assessment. This study introduced pH as an environmental variable and developed models for ke-aq from 701 data points using three descriptor types: (i) molecular descriptors (MD), (ii) quantum chemical descriptors (QCD), and (iii) the combination of both (MD + QCD). Models were screened using two descriptor screening methods (MLR and RF) and 14 machine learning (ML) algorithms. Introducing QCDs, which characterize the electronic structure of OCs, greatly improved model performance while requiring fewer descriptors. The optimal model, MLR-XGBoost(MD + QCD), which included pH, achieved the most satisfactory prediction: R²tra = 0.988, Q²boot = 0.861, R²test = 0.875, and Q²test = 0.873. Mechanistic interpretation using the SHAP method further revealed that QCDs, polarizability, volume, and pH strongly influenced the reductive degradation of OCs by e-aq. Overall, the electrochemical parameters (QCDs, pH) related to the solvent and solute are significant and should be considered in future ML modeling that assesses the fate of OCs in aquatic environments.
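The abstract reports R² and Q² statistics without defining them. As a minimal sketch, assuming a common QSAR convention (R² referenced to the mean of the evaluated set, external Q² referenced to the training-set mean), the two test-set statistics could be computed as follows. The function names and the sample values are hypothetical and are not taken from the study.

```python
from statistics import fmean


def r_squared(y_obs, y_pred):
    """Coefficient of determination, referenced to the mean of y_obs."""
    mean_obs = fmean(y_obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(y_obs, y_pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in y_obs)
    return 1.0 - ss_res / ss_tot


def q_squared_external(y_test, y_pred, y_train_mean):
    """External Q2 (assumed F1-type form): the residual sum of squares on
    the test set is compared with deviations from the TRAINING-set mean."""
    ss_res = sum((o - p) ** 2 for o, p in zip(y_test, y_pred))
    ss_tot = sum((o - y_train_mean) ** 2 for o in y_test)
    return 1.0 - ss_res / ss_tot


# Hypothetical observed and predicted values, for illustration only.
y_train = [7.8, 9.2, 10.1, 8.5]
y_test = [8.0, 9.0, 10.0]
y_test_pred = [8.0, 9.0, 11.0]

r2_test = r_squared(y_test, y_test_pred)
q2_test = q_squared_external(y_test, y_test_pred, fmean(y_train))
```

Under this assumed convention the two statistics coincide only when the training and test means are equal, which may be why both are reported separately.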

Keywords: Hydrated electron; Machine learning; Model interpretation; Reaction rate constants; pH.

MeSH terms

  • Electrons*
  • Hydrogen-Ion Concentration
  • Organic Chemicals / chemistry
  • Quantitative Structure-Activity Relationship*
  • Solutions

Substances

  • Organic Chemicals
  • Solutions