Errors make half of public datasets unusable
A recent study of ecology publications showed that the accompanying datasets are often incomplete, or of such low quality that the data are difficult or impossible to reuse.
The study surveyed the datasets accompanying 100 publications and found that 56% were incomplete, and 64% were archived in a way that partially or entirely prevented reuse. The authors listed a number of key recommendations that could help make the data more useful. These include:
- Plan for the management of data at the start of the project
- Use established repositories
- Provide information about the data (metadata)
- Use descriptive file names
- Share raw data if possible
- Use standard formats
- Facilitate data aggregation – through standards and appropriate public databases
- Perform quality control
- Choose a publishing license
- Decide on an embargo
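As an illustration of the "perform quality control" recommendation, here is a minimal Python sketch of the kind of check that could be run on a tabular file before it is archived. It is not taken from the study: the function name `check_csv`, the specific checks and the sample data are all assumptions made for this example, and a real project would add checks suited to its own data.

```python
import csv
import io


def check_csv(text):
    """Return a list of human-readable problems found in CSV text.

    Hypothetical helper for illustration only; the checks are examples,
    not a complete quality-control procedure.
    """
    problems = []
    reader = csv.reader(io.StringIO(text))
    header = next(reader, None)
    if header is None:
        return ["file is empty"]
    # Blank or duplicated column names make a dataset hard to interpret.
    if any(not name.strip() for name in header):
        problems.append("blank column name in header")
    if len(set(header)) != len(header):
        problems.append("duplicate column names in header")
    # The header is row 1, so data rows are numbered from 2.
    for row_number, row in enumerate(reader, start=2):
        if len(row) != len(header):
            problems.append(
                f"row {row_number}: expected {len(header)} fields, found {len(row)}"
            )
            continue
        for name, value in zip(header, row):
            if not value.strip():
                problems.append(f"row {row_number}: missing value for '{name}'")
    return problems


# Made-up sample data with one missing value and one short row.
sample = "site,species,count\nA,Litoria ewingii,12\nB,,7\nC,Crinia signifera\n"
for problem in check_csv(sample):
    print(problem)
```

A check like this only catches structural problems such as missing values and ragged rows. It says nothing about whether the values themselves are correct, which is why the other recommendations (metadata, raw data, standard formats) still matter.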
And in a nice touch, the authors of the publication have released the data on the repository figshare: http://dx.doi.org/10.6084/m9.figshare.1393269
If you require assistance with any aspect of data management, please contact the library digital scholarship team at digischol-library@unimelb.edu.au.