Thing 22: File Management 101

Managing your files effectively is important for lots of reasons. It can help you meet the expectations of journals and funders, enable reproducibility, defend against accusations of fraud or breaches of research integrity, and ensure your files can be preserved and reused in future. Not only that, it can save you time and tears! In this post Gene Melzack shares a case study to illustrate why it’s important to get your file management right.

File management case study: The History of Blood Donation

Korina was working on a research project looking at the history of blood donation. This involved gathering digitised archival sources from around the world for analysis. For an earlier project, Korina had organised her files in folders according to subject. She was wondering whether to do the same for her new project, but was concerned about whether this was the right approach. In the previous project, her folder structure kept changing as new themes emerged from her primary sources during her research. Her files were moved around whenever the folder structure and her thinking changed, and she found it hard to remember where she had put things when she looked for them again later.

She also found that some of her sources spoke to several themes at once, so she couldn’t decide which folder to place them in. She was wondering whether she should make multiple copies of them and put them in two or more different folders.

The recommendation for Korina was, for this new research project, to adopt a folder structure that would be stable for the duration of the project, and that would provide just one unambiguous folder for each file to go in. The way to achieve this was to design a folder structure based on the origins of the source files. Korina was collecting electronic copies of the source files during visits to archives around the world. Each file had an unambiguous origin, which would remain stable for the duration of the research project.

It was also recommended that Korina develop a file naming convention to assist with sorting files and tracking them if they got emailed or moved. This file naming convention was based on the archive the file originated from and the unique identifiers given to the document by that archive, such as a collection or article number. A recommendation was also made to maintain a simple index of documents in each folder. The index was used as a simple codebook for the documents, recording basic information such as what acronyms stand for and which collections the collection codes refer to.
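As an illustration only, a convention like this could be sketched in a few lines of Python. The archive code, identifier and index entry below are invented for the example and are not taken from Korina's project; the function names are hypothetical too.

```python
import csv
import io
import re


def clean(text):
    """Lower-case the text and turn any run of other characters into one hyphen."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def make_file_name(archive_code, identifier, extension):
    """Join the archive code and the archive's own identifier with an underscore."""
    return f"{clean(archive_code)}_{clean(identifier)}.{clean(extension)}"


def index_as_csv(entries):
    """Return (code, meaning) pairs as CSV text for a simple folder index."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["code", "meaning"])
    writer.writerows(entries)
    return buffer.getvalue()


# Hypothetical archive code and identifier, used purely for illustration.
print(make_file_name("ARCH1", "Coll 12/Item 7", "PDF"))  # arch1_coll-12-item-7.pdf
print(index_as_csv([("arch1", "Example Archive One")]))
```

Here the underscore separates the archive from the identifier and hyphens separate the words within each part, so the two pieces can still be told apart if a file is emailed or moved out of its folder.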

Considerations

At the start of your project, it’s a good idea to think ahead and decide on some principles for organising and naming your files that will make sense for the duration of the project – and then make sure you actually stick to those conventions.

Organising principles

Choose a folder structure:

  • that will remain stable as your research progresses
  • with one unambiguous place to put each data file
  • that fits your research questions
  • that doesn’t require a lot of laborious reorganisation
  • that is as simple as you can make it – don’t overcomplicate things
  • with as few nested levels as possible

File naming principles

Machine-readable:

  • be consistent
  • avoid spaces, accented characters, and names that rely on upper versus lower case to tell files apart
  • avoid special characters such as: \ / < > | " ? [ ] ; = + & $
  • make deliberate use of delimiters such as - and _

Human-readable:

  • keep file names short but meaningful
  • use keywords that will help you quickly and easily identify the file
  • include any unique identifiers, e.g. collection number, article number

Plays well with default ordering:

  • put something numeric first
  • left-pad numbers with zeros, e.g. 01-10, 001-100
  • use the date format YYYY-MM-DD
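To make these principles concrete, here is a minimal Python sketch. It assumes one particular reading of "machine-readable" (lower-case letters, digits, hyphens and underscores, followed by a single extension), and the function names and example file names are invented for the illustration.

```python
import re
from datetime import date

# Assumed rule: lower-case letters, digits, hyphens and underscores,
# then one dot and an extension. Names with two dots (e.g. .tar.gz) are rejected.
SAFE_NAME = re.compile(r"[a-z0-9_-]+\.[a-z0-9]+")


def is_machine_readable(file_name):
    """Check a file name against the assumed rule above."""
    return SAFE_NAME.fullmatch(file_name) is not None


def padded_name(number, day, keywords, extension, width=3):
    """Put a zero-padded number first, then a YYYY-MM-DD date, then keywords."""
    return f"{number:0{width}d}_{day.isoformat()}_{keywords}.{extension}"


name = padded_name(7, date(2021, 3, 5), "donor-leaflet", "pdf")
print(name)                                            # 007_2021-03-05_donor-leaflet.pdf
print(is_machine_readable(name))                       # True
print(is_machine_readable("Donor Leaflet (final).pdf"))  # False

# Unpadded numbers sort as text, so 10 comes before 2.
print(sorted(["10_a.txt", "2_a.txt", "1_a.txt"]))   # ['10_a.txt', '1_a.txt', '2_a.txt']
print(sorted(["10_a.txt", "02_a.txt", "01_a.txt"]))  # ['01_a.txt', '02_a.txt', '10_a.txt']
```

The last two lines show why left-padding matters: file browsers usually sort names as text, so the padded versions are the ones that appear in numerical order.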

Learn More

Keep an eye out for the University of Melbourne’s File Management 101: Taming the digital chaos training sessions, which are currently being offered as webinars. These are typically run once every two months.

The Managing Data @Melbourne online training includes more information about file management. Current graduate researchers are automatically enrolled and need to accept the invitation through their dashboard. The program can also be found through the course catalogue.

About the Author

Gene Melzack is a Data Steward with the Digital Stewardship (Research) team.

 

Images in this post by Gene Melzack, published under a CC BY 4.0 license.

Header image by Ag Ku from Pixabay.

