Gaussian processes retrieval of crop traits in Google Earth Engine based on Sentinel-2 top-of-atmosphere data

Remote Sens Environ. 2022 Mar 4:273:112958. doi: 10.1016/j.rse.2022.112958. eCollection 2022 May.

Abstract

The unprecedented availability of optical satellite data in cloud-based computing platforms, such as Google Earth Engine (GEE), opens new possibilities to develop crop trait retrieval models from the local to the planetary scale. Hybrid retrieval models are of interest to run in these platforms as they combine the advantages of physically-based radiative transfer models (RTM) with the flexibility of machine learning regression algorithms. Previous research with GEE primarily relied on processing bottom-of-atmosphere (BOA) reflectance data, which requires atmospheric correction. In the present study, we implemented hybrid models directly into GEE for processing Sentinel-2 (S2) Level-1C (L1C) top-of-atmosphere (TOA) reflectance data into crop traits. To achieve this, a training dataset was generated using the leaf-canopy RTM PROSAIL in combination with the atmospheric model 6SV. Gaussian process regression (GPR) retrieval models were then established for eight essential crop traits, namely leaf chlorophyll content, leaf water content, leaf dry matter content, fractional vegetation cover, leaf area index (LAI), and upscaled leaf variables (i.e., canopy chlorophyll content, canopy water content and canopy dry matter content). An important prerequisite for implementation into GEE is that the models are sufficiently light to facilitate efficient and fast processing. Successful reduction of the training dataset by 78% was achieved using the active learning technique Euclidean distance-based diversity (EBD). With the EBD-GPR models, highly accurate validation results of LAI and upscaled leaf variables were obtained against in situ field data from the validation study site Munich-North-Isar (MNI), with normalized root mean square errors (NRMSE) from 6% to 13%.
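The diversity criterion behind EBD can be illustrated with a minimal sketch. It assumes a plain greedy rule: starting from a few seed samples, repeatedly add the pool sample whose Euclidean distance to its nearest already-selected sample is largest. The function name `ebd_select`, its parameters, and the toy spectra are hypothetical and chosen for illustration only; the abstract does not describe the selection procedure in detail, so this is not the authors' implementation.

```python
import math


def ebd_select(pool, seed_indices, n_add):
    """Greedy distance-based diversity selection (illustrative assumption).

    pool         -- list of equal-length numeric tuples (e.g. simulated spectra)
    seed_indices -- indices of the samples already in the training set
    n_add        -- number of additional samples to select
    Returns the seed indices followed by the newly selected indices.
    """
    selected = list(seed_indices)
    if not selected:
        raise ValueError("at least one seed sample is required")
    if n_add > len(pool) - len(set(selected)):
        raise ValueError("n_add exceeds the number of unselected samples")

    # Distance from every pool sample to its nearest selected sample.
    min_dist = [min(math.dist(p, pool[s]) for s in selected) for p in pool]

    for _ in range(n_add):
        # The most "diverse" candidate is the one farthest from the current set.
        best = max(range(len(pool)), key=lambda i: min_dist[i])
        selected.append(best)
        # Update nearest-selected distances with the newly added sample.
        for i, p in enumerate(pool):
            min_dist[i] = min(min_dist[i], math.dist(p, pool[best]))
    return selected


# Toy two-band "spectra" (made-up values, not from the study).
toy_pool = [(0.0, 0.0), (0.1, 0.0), (1.0, 0.0), (0.5, 0.0), (1.0, 1.0)]
chosen = ebd_select(toy_pool, seed_indices=[0], n_add=2)
```

Near-duplicates of samples already in the set, such as `(0.1, 0.0)` here, are picked last, which is why a selection of this kind can shrink a simulated training set while keeping its spread in spectral space.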
Using an independent validation dataset of similar crop types (Italian Grosseto test site), the retrieval models showed moderate to good performances for canopy-level variables, with NRMSE ranging from 14% to 50%, but failed for the leaf-level estimates. Obtained maps over the MNI site were further compared against Sentinel-2 Level 2 Prototype Processor (SL2P) vegetation estimates generated from the ESA Sentinels' Application Platform (SNAP) Biophysical Processor, showing high consistency between the two retrievals (R² from 0.80 to 0.94). Finally, thanks to the seamless GEE processing capability, the TOA-based mapping was applied over the entirety of Germany at 20 m spatial resolution, including information about prediction uncertainty. The obtained maps provided confidence in the developed EBD-GPR retrieval models for integration in the GEE framework and national-scale mapping from S2-L1C imagery. In summary, the proposed retrieval workflow demonstrates the possibility of routine processing of S2 TOA data into crop trait maps at any place on Earth, as required for operational agricultural applications.
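For readers unfamiliar with the NRMSE figures quoted above, the following sketch shows one common way to compute it. It assumes normalization by the range of the measured values, expressed in percent; the abstract does not state which normalization was used, so that choice, the function name `nrmse_percent`, and the sample numbers are illustrative assumptions.

```python
import math


def nrmse_percent(measured, estimated):
    """RMSE divided by the range of the measured values, in percent.

    Range normalization is an assumption of this sketch; other conventions
    (e.g. dividing by the mean) give different numbers.
    """
    if len(measured) != len(estimated) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(e - m) ** 2 for m, e in zip(measured, estimated)]
    rmse = math.sqrt(sum(squared_errors) / len(squared_errors))
    value_range = max(measured) - min(measured)
    if value_range == 0:
        raise ValueError("measured values have zero range")
    return 100.0 * rmse / value_range


# Made-up LAI-like values for illustration only (not data from the study).
field_values = [1.0, 2.0, 3.0, 4.0, 5.0]
model_values = [1.5, 2.5, 3.5, 4.5, 5.5]
score = nrmse_percent(field_values, model_values)
```

With a constant offset of 0.5 and a measured range of 4, this toy case gives an RMSE of 0.5 and therefore an NRMSE of 12.5%.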

Keywords: Active learning (AL); Atmosphere radiative transfer model; Biophysical and biochemical crop traits; Euclidean distance-based diversity (EBD); Gaussian processes (GP); Google Earth Engine; Hybrid retrieval methods; Sentinel-2; Top-of-atmosphere reflectance; Uncertainty estimates.