Geospatial prediction of total soil carbon in European agricultural land based on deep learning

Sci Total Environ. 2024 Feb 20:912:169647. doi: 10.1016/j.scitotenv.2023.169647. Epub 2023 Dec 26.

Abstract

Accurate geospatial prediction of soil parameters provides a basis for large-scale digital soil mapping and makes efficient use of costly, time-consuming field soil sampling. To date, few studies have used deep learning for geospatial prediction of soil parameters, but there is evidence that it may provide higher accuracy than conventional machine learning methods. To address this research gap, this study proposed a deep neural network (DNN) for geospatial prediction of total soil carbon (TC) in European agricultural land and compared it with the eight machine learning methods most commonly used in studies indexed in the Web of Science Core Collection. A total of 6209 preprocessed soil samples from the Geochemical mapping of agricultural and grazing land soil (GEMAS) dataset, collected in heterogeneous agricultural areas covering 4,899,602 km² in Europe, were used. Prediction was based on 96 environmental covariates from climate and remote sensing sources, with extensive hyperparameter tuning for all evaluated methods. DNN outperformed all evaluated machine learning methods (R² = 0.663, RMSE = 9.595, MAE = 5.565), followed by Quantile Random Forest (QRF) (R² = 0.635, RMSE = 25.993, MAE = 22.081). The ability of DNN to accurately predict small TC values, and thus to produce relatively low absolute residuals, was a major reason for its higher prediction accuracy compared with the machine learning methods. Climate parameters were the main drivers of prediction accuracy: 23 of the 25 environmental covariates with the highest variable importance were climate or land surface temperature parameters. These results demonstrate the superiority of DNN over the evaluated machine learning methods for TC prediction, while highlighting the need for more recent soil sampling to assess the impact of climate change on TC content in European agricultural land.
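
The three accuracy metrics reported above can be illustrated with a minimal sketch. It assumes the common definitions: R² as 1 − SS_res/SS_tot, RMSE as the root of the mean squared residual, and MAE as the mean absolute residual. The abstract does not state which R² definition the study used, and the function name `regression_metrics` and the toy values are hypothetical, not taken from the paper.

```python
import math


def regression_metrics(observed, predicted):
    """Return R2, RMSE and MAE for paired observed/predicted values.

    Assumes R2 = 1 - SS_res / SS_tot; other definitions (for example the
    squared Pearson correlation) can give different values.
    """
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("observed and predicted must be non-empty and equal in length")

    n = len(observed)
    residuals = [o - p for o, p in zip(observed, predicted)]
    mean_observed = sum(observed) / n

    ss_res = sum(r * r for r in residuals)                   # residual sum of squares
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)  # total sum of squares
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all observed values are identical")

    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(r) for r in residuals) / n,
    }


# Toy values invented for illustration only (not GEMAS data).
toy_observed = [1.0, 2.0, 3.0, 4.0]
toy_predicted = [1.5, 2.0, 2.5, 4.0]
toy_metrics = regression_metrics(toy_observed, toy_predicted)
```

Because RMSE squares each residual before averaging, a few large errors raise it more than they raise MAE, whereas MAE weights all residuals equally. This is why consistently small absolute residuals lower both error metrics.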

Keywords: Climate; Deep neural network; Environmental covariates; GEMAS dataset; Remote sensing.