Errors make half of public datasets unusable

How complete and reusable are publicly archived data in ecology and evolution? Illustration by Ainsley Seago (CC-BY 4.0).

A recent study of ecology publications showed that accompanying datasets are often incomplete or of such low quality that the data can be difficult or impossible to re-use.

The study surveyed the datasets accompanying 100 publications and found that 56% were incomplete, and 64% were archived in a way that partially or entirely prevented reuse. The authors listed a number of key recommendations that could help make the data more useful. These include:

  • Plan for the management of data at the start of the project
  • Use established repositories
  • Provide information about the data (metadata)
  • Use descriptive file names
  • Share raw data if possible
  • Use standard formats
  • Facilitate data aggregation – through standards and appropriate public databases
  • Perform quality control
  • Choose a publishing license
  • Decide on an embargo
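
A few of these recommendations (metadata, standard formats, quality control) can be illustrated with a short script. The following is only a minimal sketch in Python and is not taken from the study: the sample data, the column names, and the helpers `find_missing_values` and `build_metadata` are all hypothetical, chosen for illustration.

```python
import csv
import io
import json

# Hypothetical example data: column names and values are illustrative only.
RAW_CSV = """site_id,species,count
A1,Litoria ewingii,12
A2,,7
A3,Crinia signifera,
"""


def find_missing_values(csv_text, required_columns):
    """Return (row_number, column) pairs where a required value is empty."""
    problems = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row_number, row in enumerate(reader, start=1):
        for column in required_columns:
            value = row.get(column)
            if value is None or value.strip() == "":
                problems.append((row_number, column))
    return problems


def build_metadata(title, columns, license_name):
    """Build a simple JSON metadata record to store alongside the data file."""
    return json.dumps(
        {"title": title, "columns": columns, "license": license_name},
        indent=2,
        sort_keys=True,
    )


# Quality control: list every empty cell in the required columns.
problems = find_missing_values(RAW_CSV, ["site_id", "species", "count"])

# Metadata: describe each column and state the licence (assumed values).
metadata = build_metadata(
    "Frog call survey counts (hypothetical)",
    {
        "site_id": "survey site code",
        "species": "scientific name",
        "count": "individuals heard",
    },
    "CC-BY 4.0",
)

print(problems)
print(metadata)
```

Plain CSV for the data and JSON for the description are used here because both are open, widely readable formats; a real project would follow whatever metadata standard its chosen repository expects.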

And in a nice touch, the authors of the publication have released the data on the repository figshare: http://dx.doi.org/10.6084/m9.figshare.1393269

If you require assistance with any aspect of data management, please contact the library digital scholarship team at digischol-library@unimelb.edu.au.

