
Research Data Management: Working With Data

Visualisations of Big Data in the ICE room at Merchiston. Photographed for the 'Case For Support' fundraising brochure.

A Data Management Plan (DMP) is a way for you to outline your responsibilities as a researcher towards your data. At a glance, your Data Management Plan will typically state what data will be created and how, what your plans are for sharing and preserving the data, and any restrictions that may apply. A good Data Management Plan will:

  • Help you make informed decisions and anticipate problems, 
  • Allow you to develop procedures before the fact,
  • Ensure the data are accurate, complete, reliable, and secure,
  • Avoid duplication, loss, and breaches of security, and importantly,
  • Save you valuable time and effort.

Your research project is unique, so it follows that your Data Management Plan will reflect this. However, on this page you will find plenty of resources and templates to help you along the way.

 

 

Anatomy of a Research Data Management Plan

This is where you include the essential information about your data and the project or projects it is associated with. Administrative information might include the date the DMP was drafted, the date it was last updated, the official name and ID of the project, and a short description of the main objectives. If you have received any funding, or are working in partnership with an organisation, it is a good idea to include this information as well.

The names of anyone involved in processing the data should also be listed. This might include, but is not limited to, the principal investigator, the person responsible for drafting the DMP, and the people responsible for collecting, analysing, describing, and storing the data. If these people have ORCID iDs or ResearcherIDs, you can include them here. Does using the dataset require specialist hardware or software? Or specialist skills and training? If so, list them here.

It is also a good idea to list any institutional policies, legal requirements, or funder guidelines that apply to the data.

Here you can include a description of the methods that will be used to collect or create the data, as well as the type, format, and volume that you anticipate the resulting data will take. If you are reusing an existing dataset, include information about that under this heading. Also include any information about the structure, versioning, and naming schema for any folders or files. Lastly, if your data is subject to quality assurance, this is a good place to include that information.
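As an illustration of what a naming schema might look like in practice, here is a minimal Python sketch. The pattern it uses (project ID, short description, ISO date, two-digit version) is an assumption made for this example, not a required convention, and the function name `build_filename` is hypothetical.

```python
import re
from datetime import date


def build_filename(project_id: str, description: str, created: date,
                   version: int, extension: str) -> str:
    """Build a file name following one possible (hypothetical) naming schema."""
    # Lower-case the description and turn every run of characters that is
    # not a letter or digit into a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", description.lower()).strip("-")
    # ISO dates (YYYY-MM-DD) sort chronologically when sorted as text.
    return f"{project_id}_{slug}_{created.isoformat()}_v{version:02d}.{extension}"


print(build_filename("PRJ01", "Interview transcripts", date(2024, 3, 5), 2, "csv"))
# prints: PRJ01_interview-transcripts_2024-03-05_v02.csv
```

Whatever pattern you settle on, recording it in the DMP lets everyone on the project name files the same way.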

Metadata is literally data about other data. Information included here ensures that the data can be discovered, accessed, read, and interpreted in the future. This includes detailed descriptions for collections or files, for example: where did the data come from? How can it be retrieved?

Are you going to use an existing metadata standard to describe the data, or will you create something new? If you are creating something new, describe it here. List any plans you have for collecting or creating this documentation and metadata. Documentation might include:

  • Laboratory notebooks and experimental protocols
  • Questionnaires, codebooks, data dictionaries
  • Software syntax and output files
  • Information about equipment settings and instrument calibration
  • Database schema
  • Methodology reports
  • Provenance information about sources of secondary data
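To make the idea of descriptive metadata concrete, the sketch below writes a small machine-readable record and checks that nothing essential is missing. The field names are loosely modelled on Dublin Core element names, and both the list of required fields and the placeholder values are assumptions for illustration only. Your discipline or repository may mandate a different standard.

```python
import json

# Hypothetical choice of required fields, borrowed from Dublin Core element names.
REQUIRED_FIELDS = ("title", "creator", "date", "description", "format", "rights")

# Placeholder values only; replace them with details of your own dataset.
record = {
    "title": "Example dataset title",
    "creator": "Surname, Forename",
    "date": "2024-03-05",
    "description": "Where the data came from and how it was collected.",
    "format": "text/csv",
    "rights": "Licence chosen for reuse",
}


def missing_fields(metadata: dict, required=REQUIRED_FIELDS) -> list:
    """Return the required fields that are absent or empty in the record."""
    return [field for field in required if not metadata.get(field)]


print(missing_fields(record))        # an empty list means the record is complete
print(json.dumps(record, indent=2))  # JSON text that can be stored beside the data
```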

Compliance covers two specific areas: the intellectual property rights surrounding the data, and the ethical implications associated with it.

Information relating to intellectual property rights might include the names of the people who own the data, any copyright information associated with the data, the appropriate licence for reuse that you have chosen, any restrictions on reuse by third parties, and any embargoes in place on the data (a delay before publication of a journal article, or a pending patent for example).

Ethical considerations might include the details of consent required for collection, processing, preservation, and sharing of data. If the data needs to be anonymised, list the steps you will take here. If you are storing or transferring sensitive personal data, list what precautions you will take to ensure its security.
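One step that is sometimes listed here is replacing direct identifiers, such as participant IDs, with pseudonyms. The sketch below shows one possible way to do that with a keyed hash from the Python standard library. The function name `pseudonymise` and the 16-character code length are assumptions for illustration. Note that this is pseudonymisation rather than full anonymisation: anyone holding the key can regenerate the codes, so the key must be stored separately from the data, and other fields may still identify a person.

```python
import hashlib
import hmac


def pseudonymise(identifier: str, secret_key: bytes) -> str:
    """Map an identifier to a stable code using HMAC-SHA256 (illustrative only)."""
    digest = hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256)
    # Shorten the hex digest to make the code easier to read; this length
    # is an arbitrary choice made for the example.
    return digest.hexdigest()[:16]


# Hypothetical key; in practice, generate a random key and keep it apart from the data.
key = b"replace-with-a-randomly-generated-secret"
code = pseudonymise("participant-001", key)
print(len(code))  # prints: 16
```

The same identifier always maps to the same code under a given key, so records belonging to one participant can still be linked after the original IDs are removed.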

Here you should include some information about where the data will be stored (including any physical storage), whether the data will be regularly backed up, what the backup procedure is, and how the data will be recovered if needed. Also give an overview of the risks associated with the storage of the data and how these will be managed. This might include things like encryption and access arrangements.

Information about how to store data securely can be found here.
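If your backup procedure includes checking that copies are intact, one common approach is to compare checksums of the original and the backup. The following is a minimal sketch using Python's standard library. The function names are hypothetical, and it assumes both files can be read from the machine running the check.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read one chunk at a time so large data files do not fill memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_matches(original: Path, backup: Path) -> bool:
    """Report whether a backup copy has the same contents as the original."""
    return sha256_of(original) == sha256_of(backup)
```

Storing the checksums alongside the data also lets you detect later corruption, even when the original file is no longer available for comparison.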

Not all your data will need to be retained, shared, or preserved. With particular reference to contractual, legal, or regulatory requirements, list what data will be included in the final dataset. You can also include information about the repository where the data will be preserved, the length of time the data will (or should) be kept beyond the life of the project, and any additional work that might be required to prepare the data for preservation. If you anticipate any future uses for the data or research opportunities, this is a good place to list them.

If you plan on assigning an individual Digital Object Identifier to the data, mention it here. If you are planning to apply any conditions to the sharing and reuse of the data, list them here as well. It is also useful to include information about how the data will be shared and accessed, if at all (via a digital download from a repository? By email from the principal investigator? Via physical access?), and how it will be made discoverable.


Why you should have a Data Management Plan

References

With thanks to:

The University of Edinburgh (2015) Research Data MANTRA. Available at: http://datalib.edina.ac.uk/mantra/