Retrieving biodiversity data from multiple sources: making secondary data standardised and accessible

Biodivers Data J. 2024 Sep 20:12:e133775. doi: 10.3897/BDJ.12.e133775. eCollection 2024.

Abstract

Biodiversity data, particularly species occurrence and abundance, are indispensable for testing empirical hypotheses in the natural sciences. However, datasets built for research programmes often do not meet FAIR (findable, accessible, interoperable and reusable) principles, which raises questions about data quality, accuracy and availability. The 21st century has marked a new era for data science and analytics, and efforts to aggregate, standardise, filter and share biodiversity data from multiple sources have become increasingly necessary. In this study, we propose a framework for refining secondary biodiversity data and conforming them to FAIR standards, making them available for uses such as macroecological modelling and other studies. We relied on a Darwin Core base model to standardise the data and to facilitate their curation and validation, including the occurrence and abundance of multiple taxa of a region that encompasses estuarine ecosystems in an ecotonal area bordering easternmost Amazonia. We further discuss the significance of feeding standardised public data repositories to advance scientific progress and highlight their role in contributing to biodiversity management and conservation.

Keywords: Darwin Core standard; FAIR data; Golfão Maranhense; secondary data.
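
The abstract does not detail how the standardisation is carried out, so the following is only a minimal sketch of what mapping heterogeneous secondary records onto Darwin Core terms and filtering them might look like. The Darwin Core term names (scientificName, decimalLatitude, decimalLongitude, eventDate, individualCount) are real; the source column names in FIELD_MAP, the function to_darwin_core, the validation rules and the sample records are all hypothetical and are not taken from the study.

```python
# Hypothetical mapping from source column names to Darwin Core terms.
# The source names on the left are assumptions made for illustration only.
FIELD_MAP = {
    "species": "scientificName",
    "taxon": "scientificName",
    "lat": "decimalLatitude",
    "latitude": "decimalLatitude",
    "lon": "decimalLongitude",
    "longitude": "decimalLongitude",
    "date": "eventDate",
    "abundance": "individualCount",
}


def to_darwin_core(record):
    """Rename known fields to Darwin Core terms.

    Returns None when the record has no taxon name or no valid coordinates
    (an assumed filtering rule, not the authors' stated criteria).
    """
    dwc = {}
    for key, value in record.items():
        term = FIELD_MAP.get(key.strip().lower())
        if term is not None and value not in (None, ""):
            dwc[term] = value

    name = str(dwc.get("scientificName", "")).strip()
    if not name:
        return None
    dwc["scientificName"] = name

    try:
        lat = float(dwc["decimalLatitude"])
        lon = float(dwc["decimalLongitude"])
    except (KeyError, TypeError, ValueError):
        return None
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        return None
    dwc["decimalLatitude"], dwc["decimalLongitude"] = lat, lon

    # Keep abundance only when it parses as a whole number.
    if "individualCount" in dwc:
        try:
            dwc["individualCount"] = int(dwc["individualCount"])
        except (TypeError, ValueError):
            del dwc["individualCount"]
    return dwc


# Invented sample records with inconsistent headers; values are illustrative.
records = [
    {"Species": "Rhizophora mangle", "Lat": "-2.50", "Lon": "-44.30", "Abundance": "12"},
    {"taxon": "Ucides cordatus", "latitude": -2.6, "longitude": -44.2, "date": "2019-05-04"},
    {"species": "", "lat": "-2.5", "lon": "-44.3"},                     # no taxon name
    {"species": "Avicennia germinans", "lat": "95", "lon": "-44.3"},   # latitude out of range
]

standardised = [r for r in map(to_darwin_core, records) if r is not None]
```

In this sketch, records from different sources end up sharing one vocabulary, and records that fail the basic checks are dropped rather than silently repaired. A real pipeline would likely log the rejected records for manual curation instead of discarding them.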