Completeness of Digital Accessible Knowledge (DAK) about terrestrial mammals in the Iberian Peninsula.
Nora Escribano, David Galicia, Arturo Hugo Ariño. Published in: PLoS ONE (2019)
The advent of online data aggregator infrastructures has facilitated the accumulation of Digital Accessible Knowledge (DAK) about biodiversity. Despite the vast number of freely available data records, their usefulness for research depends on the completeness of each body of data in terms of spatial, temporal and taxonomic coverage. In this paper, we assess the completeness of DAK about terrestrial mammals distributed across the Iberian Peninsula. We compiled a dataset of all records of mammals occurring in the Iberian Peninsula that are available in the Global Biodiversity Information Facility and in the national atlases of Portugal and Spain. After cleaning the dataset of errors and of records that lacked collection dates or were not determined to species level, we assigned all occurrences to a 10-km grid. We assessed inventory completeness by calculating the ratio between observed and expected richness (based on the Chao2 richness index) in each grid cell, and classified cells as well-sampled or under-sampled. We then evaluated the survey coverage of well-sampled cells along four environmental gradients, as well as their temporal coverage. Out of 796,283 retrieved records, quality issues led us to remove 616,141 records as unfit for this use. The main reason for discarding records was missing collection dates. Only 25.95% of cells contained enough records to robustly estimate completeness. Completeness of the DAK about terrestrial mammals from the Iberian Peninsula was low, and the data were spatially and temporally biased. Out of 5,874 cells holding data, only 620 (9.95%) were classified as well-sampled. Moreover, well-sampled cells were geographically aggregated and reached inventory completeness over the same temporal range. Despite the increasing availability of DAK, its usefulness is still compromised by quality issues and gaps in data. Future work should therefore focus on increasing data quality, in addition to mobilizing unpublished data.
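The completeness measure described above (observed richness divided by Chao2-expected richness, per grid cell) can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes the bias-corrected form of the Chao2 incidence estimator, and it assumes that a "sampling unit" within a cell is something like a distinct collection date; the abstract specifies neither. The function name `chao2_completeness` and the input layout are hypothetical.

```python
from collections import Counter


def chao2_completeness(records):
    """Estimate inventory completeness for ONE grid cell.

    records: iterable of (sampling_unit, species) pairs. What counts as a
    sampling unit (here, e.g., a collection date) is an assumption.
    Returns (observed_richness, chao2_expected_richness, completeness).
    """
    units = set()
    units_per_species = {}
    for unit, species in records:
        units.add(unit)
        units_per_species.setdefault(species, set()).add(unit)

    m = len(units)                  # number of sampling units
    s_obs = len(units_per_species)  # observed species richness
    if m == 0:
        return 0, 0.0, float("nan")

    # Q1 = species found in exactly one unit, Q2 = in exactly two units.
    freq = Counter(len(u) for u in units_per_species.values())
    q1, q2 = freq[1], freq[2]

    # Bias-corrected Chao2; stays defined when Q2 == 0.
    expected = s_obs + ((m - 1) / m) * q1 * (q1 - 1) / (2 * (q2 + 1))
    return s_obs, expected, s_obs / expected


# Toy cell with three survey dates and five species (made-up data).
cell = [
    ("d1", "A"), ("d2", "A"), ("d3", "A"),
    ("d1", "B"), ("d2", "B"),
    ("d1", "C"), ("d2", "D"), ("d3", "E"),
]
observed, expected, completeness = chao2_completeness(cell)
print(observed, expected, round(completeness, 3))
```

A cell would then be labelled well-sampled or under-sampled by comparing `completeness` with a cut-off. The abstract does not state the threshold used, so any value chosen here would be an assumption.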