Fit for purpose? Identifying and resolving quality issues with marine biodiversity datasets in R
Bosch, S.; Provoost, P.; Appeltans, W. (2018). Fit for purpose? Identifying and resolving quality issues with marine biodiversity datasets in R. PeerJ Preprints 6: e26776v1. https://doi.org/10.7287/peerj.preprints.26776v1
Millions of marine species occurrence and abundance records can be accessed through the Ocean Biogeographic Information System (OBIS). These records are often combined with data from additional sources such as GBIF, citizen science projects, scientific literature and personal communications. However, the quality of the available data is variable, so it needs to be scrutinized to obtain a dataset that is fit for purpose. To help with this process, and to increase the quality of the data before they are published in OBIS, we developed the obistools R package. It allows users to identify and resolve common data errors, including taxonomic, spatial, temporal and measurement issues. The package combines and builds on existing services made available by the World Register of Marine Species as well as some new services developed in-house at OBIS. The interactive interface provides a series of strict and fuzzy quality checks, ranging from longitude/latitude checks to environmental outlier detection. These checks, combined with pre-defined constraints based on, for instance, physiological knowledge of the species and the expected spatial extent, can then be used to decide whether specific records belong in the final dataset for analysis.
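The distinction between strict and fuzzy checks can be illustrated with a minimal sketch. It is written in Python purely for illustration and is not the obistools API: the function names `coordinate_flags` and `mad_outliers`, the threshold of 3.5 and the sample records are all assumptions, and the outlier rule (a modified z-score based on the median absolute deviation) stands in for whatever method the package actually uses. The field names `decimalLongitude` and `decimalLatitude` follow Darwin Core.

```python
from statistics import median


def coordinate_flags(records):
    """Strict check: flag records with missing or out-of-range coordinates.

    Returns a list of (record index, reason) tuples.
    """
    flagged = []
    for i, rec in enumerate(records):
        lon = rec.get("decimalLongitude")
        lat = rec.get("decimalLatitude")
        if lon is None or lat is None:
            flagged.append((i, "missing coordinate"))
        elif not (-180 <= lon <= 180 and -90 <= lat <= 90):
            flagged.append((i, "coordinate out of range"))
    return flagged


def mad_outliers(values, threshold=3.5):
    """Fuzzy check: indices of values whose modified z-score exceeds threshold.

    The modified z-score is 0.6745 * |x - median| / MAD, where MAD is the
    median absolute deviation. The threshold is an illustrative assumption.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # No spread around the median: this rule cannot rank any value.
        return []
    return [
        i for i, v in enumerate(values)
        if 0.6745 * abs(v - med) / mad > threshold
    ]


# Hypothetical records: one valid, one out of range, one missing a longitude.
records = [
    {"decimalLongitude": 2.9, "decimalLatitude": 51.2},
    {"decimalLongitude": 190.0, "decimalLatitude": 10.0},
    {"decimalLongitude": None, "decimalLatitude": 5.0},
]
# Hypothetical sea surface temperatures (degrees Celsius) at occurrence sites.
temperatures = [12.1, 11.8, 12.4, 12.0, 11.9, 29.5]

print(coordinate_flags(records))
print(mad_outliers(temperatures))
```

A strict check gives a yes/no answer that does not depend on the rest of the dataset, whereas a fuzzy check only marks a record as suspicious relative to the other records, so its flags are candidates for review rather than automatic removals.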