Best practice for biodiversity data management and publication
biodiversity informatics, conservation, data standards, Global Biodiversity Information Facility
There is increasing pressure from the scientific community, including funding agencies, journals and peers, for authors to publish the biodiversity data underlying their articles and other scientific literature. Publishing these data makes research reproducible and creates new opportunities to integrate data across research projects and to analyse them in additional ways. The long-term availability of data is especially important in conservation science because field data can be costly to collect. In addition, historic data, especially on threatened species and their associated biota, become more valuable over time. This paper summarises current standards and best practices for the management and publication of biodiversity data. It includes recommendations for citing sources of species determination and standards for formatting species distribution data. Whenever possible, data should be published for inclusion in data access platforms that integrate datasets (e.g. GBIF, GenBank) and so enable new analyses and broader impact. Data centres (e.g. PANGAEA) add value by performing quality checks on data. At a minimum, data should be permanently archived in an online, open-access repository with sufficient metadata for potential users to understand how and why they were collected.