CloudSEN12+: The largest dataset of expert-labeled pixels for cloud and cloud shadow detection in Sentinel-2

Data Brief. 2024 Aug 19:56:110852. doi: 10.1016/j.dib.2024.110852. eCollection 2024 Oct.

Abstract

Detecting and screening clouds is the first step in most optical remote sensing analyses. Cloud formation is diverse, presenting many shapes, thicknesses, and altitudes. This variety poses a significant challenge to the development of effective cloud detection algorithms, as most datasets lack an unbiased representation. To address this issue, we have built CloudSEN12+, a significant expansion of the CloudSEN12 dataset. This new dataset doubles the expert-labeled annotations, making it the largest cloud and cloud shadow detection dataset for Sentinel-2 imagery up to date. We have carefully reviewed and refined our previous annotations to ensure maximum trustworthiness. We expect CloudSEN12+ will be a valuable resource for the cloud detection research community.

Keywords: Cloud shadow; Global dataset; IRIS; Sentinel-2; Thin cloud; U-net.