3. How to share data
On the face of it, sharing data can be as simple as making your database or spreadsheet available via a convenient website. Whilst this may make the data accessible, it may not be in the best interests of the research overall.
In considering the shareability of your work, the nature of the 'data' and its condition should be taken into account. In this video Dr Julian Haseldine (University of Hull) discusses his database, which is constantly in flux, and what he would do if he reached a point at which he thought it useful to share it.
There are three areas that merit attention when planning to share data.
- Issues to consider prior to sharing data
- Where to make the data shareable from
- Who can assist with sharing data
Each of these will be addressed in turn.
Issues to consider prior to sharing data
- Will all the data be shared, or a specific subset?
- Is the data stored in a sustainable way? Websites frequently get taken down for one reason or another; could the data be better stored elsewhere?
- Are there ethical or rights issues that need to be addressed prior to the data being shared?
- Is it in a format that others can easily make use of, both now and in the future? If the data is in a proprietary format, are the reasons for using this made clear?
- Is there any documentation or metadata that explains the data to other researchers to facilitate their interpretation? Are contact details made clear so others can raise queries about the data?
Each of these issues is addressed in sections of the previous module. If you are unsure of how to address any of them, re-visit these sections for further information.
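As an illustration of the format and documentation points above, the following is a minimal sketch of exporting data to an open format (CSV) with a small metadata file alongside it. The function name, the file names and the metadata fields are assumptions made for this example; a repository or funder may require a specific metadata schema instead.

```python
import csv
import json
from pathlib import Path


def export_with_metadata(rows, fieldnames, metadata, out_dir):
    """Write rows to a CSV file plus a JSON file that describes them.

    All names here are illustrative, not a prescribed standard.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # CSV is a plain-text format that needs no specialist software to read.
    data_path = out_dir / "data.csv"
    with data_path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)

    # The metadata file explains the data and says whom to contact with queries.
    meta_path = out_dir / "metadata.json"
    meta_path.write_text(
        json.dumps(metadata, indent=2, ensure_ascii=False),
        encoding="utf-8",
    )
    return data_path, meta_path
```

A call might pass a list of dictionaries as the rows and a metadata dictionary with entries such as a title, a description of each column and a contact address.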
Managing data has implicit costs associated with it (e.g., IT, staff time), and sharing data can extend these costs beyond the period of the research. How will these costs be managed?
Even if you are unsure of the precise costs involved, it is important to consider what actions will be required to enable the data to be shared, and discuss these with the relevant support services so they are aware of the requirements and can plan accordingly. Many costs can be built into regular running costs in this way, and the process will highlight any specific additional costs that may be incurred.
When sharing data it will be of benefit to be able to cite the data, both to make it easily accessible and to provide an identifiable link to it. Such a citation requires a persistent link to the data, using either an identifier or a link to a website describing the data. This identifier can be part of the metadata for the data, and may be assigned when using a repository to store the data for sharing.
In the first example, the DOI (or digital object identifier) is assigned by a national agency on behalf of DataCite. In the UK, this is the British Library. This identifier can be issued through your institution if it is registered for the service, or may be assigned by a subject repository as part of its deposit process.
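To show how a persistent identifier fits into a data citation, the sketch below builds a citation string that ends in a resolvable DOI link. The element order (creator, year, title, publisher, identifier) is one common pattern and is an assumption here, as are the function names; check the style your repository or publisher recommends. The DOI used in the usage note is a made-up placeholder, not a real dataset.

```python
def doi_url(doi):
    """Return the https://doi.org/ link that resolves the given DOI."""
    return "https://doi.org/" + doi.strip()


def format_data_citation(creators, year, title, publisher, doi):
    """Build a simple data citation ending in a persistent DOI link.

    The layout is illustrative; it is not an official citation standard.
    """
    names = "; ".join(creators)
    return f"{names} ({year}). {title}. {publisher}. {doi_url(doi)}"
```

For instance, format_data_citation(["Smith, J."], 2020, "Example Letters Database", "Example Repository", "10.1234/example") would give a citation ending in https://doi.org/10.1234/example.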
Imagine you are starting a research project from scratch and find some existing data that could help you. What characteristics would you like this data to have to support your own use of it?
- The data is accessible from a known and trusted source
- The data is in a format that can be re-used without modification or the acquisition of specialist software
- The data is well-described and can be easily interpreted
- The data has a clear statement of how any ethical or data protection issues have been addressed
- The data is cited and has clear provenance so queries can be followed up
- The data is clearly licensed to clarify how it can be used
Licensing data is addressed in further detail in a later section of this module.
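The characteristics listed above can also be treated as a checklist to run against your own data before sharing it. The following is a minimal sketch, assuming the information about a dataset is held in a simple dictionary; the field names are hypothetical and chosen only to mirror the list.

```python
# Hypothetical field names, each mapped to the characteristic it records.
REUSE_FIELDS = {
    "source": "accessible from a known and trusted source",
    "format": "re-usable without modification or specialist software",
    "description": "well-described and easily interpreted",
    "ethics_statement": "ethical or data protection issues addressed",
    "citation": "cited, with clear provenance",
    "licence": "clearly licensed",
}


def missing_reuse_information(record):
    """Return the checklist fields that are absent or empty in the record."""
    return [
        field
        for field in REUSE_FIELDS
        if not str(record.get(field) or "").strip()
    ]
```

An empty list returned by missing_reuse_information suggests that each characteristic has at least been recorded; it cannot judge whether what was recorded is adequate.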
When sharing data there are two roles involved: the one sharing the data and the one using the data. Whichever role you are in, always consider the perspective of the other.