Sharing Data

Site: Postgraduate online research training
Course: Module 3: During the Research
Book: Sharing Data
Date: Saturday, 5 December 2020, 11:35 AM

1. Introduction

In the past (and often still today) historians have tended to work alone, creating research data for themselves and making publicly available only the end results, in the form of a publication of one type or another. This is increasingly changing, especially with new rules emerging on Open Access. Now is the time to consider how you might share your data during or after your research projects.

It's something that you should seriously consider, as your data is valuable in its own right (and is increasingly seen as such by research funders, government, and higher education institutions). Why reinvent the wheel when someone else has already done the work? Why not build on each other's work? This can only be done through sharing data.

In essence we all gain from doing this. By the end of this section of the course you will be able to:

  • Recognise why data benefits from being shared
  • Appreciate the value of sharing data and how this can support your own research
  • Explain the options and requirements for sharing data

Throughout your research project you should consider your data as a commodity in its own right. It's not just a means to an end. It is a product of your research and therefore important.

2. Why share data?

The UK Data Archive describes data as a valuable resource. Data has been valuable for many years, but this has rarely been recognised. The value of research outputs has centred on publications, with data treated as the raw material that informs these. The costs of producing data and the effort involved are now being more widely acknowledged, though, and researchers are encouraged to share these resources to increase their value to the research community generally.

The reasons why data should be shared fall into three areas.

Better research

  • Demonstrates research integrity, as there is transparency and accountability in the production of the data being released
  • Encourages research enquiry and debate
  • Promotes innovation and potential new data uses
  • Encourages the improvement of research methods
  • Prevents research fraud

Better impact

  • Enables peer scrutiny of the research findings, validating the work carried out
  • Increases the visibility of the research
  • Provides credit for the creation of the data in its own right
  • Can lead to new collaborations
  • Produces a public record of the research

Better value

  • Avoids duplication of effort in data creation
  • Provides resources for use in teaching and learning
  • Meets funder requirements
  • Ensures data can be re-visited for future research
  • Maximises return on research investment
  • Prepares data well for preservation, as a by-product of preparing it for sharing

The developing trend to share data has been driven in part by funders, who make sharing a requirement in order to demonstrate a better return on investment. It has equally been informed by a wider trend toward openness in research, particularly where the research is publicly funded. In their response to a Royal Society investigation into opening up scientific information (Science as an Open Enterprise, 2012) the British Academy expanded the debate to all areas of research (see their 'response' here).

“We believe that in principle, and across all subjects, data collected and held by government and public bodies should be made available to other researchers in order that they can assess, test and challenge research findings, or conduct additional research using these data.”

In the same response the view was also taken that complete openness may not be appropriate. There are a number of reasons why data might not be shared, some stronger than others. Key reasons are:

Financial

  • The institution holding the data may wish to commercially exploit the data produced. Where there is a likelihood of commercialisation, this needs to be checked with the local Enterprise Office.

Confidentiality

  • The data may contain information about people who have not given their consent for the data to be shared.  Application of the local data protection policy will inform how this issue should be addressed.

Ownership/IPR

  • Sharing of data should only be undertaken if the researcher has the appropriate rights to share.

Other reasons, which reflect concerns more than specific barriers, are listed in the attached document from the UK Data Archive.

File: UKDA Reasons Not to Share exercise.pdf

 

Exercise 1

List your top five reasons for sharing your data, or the benefits you would expect from doing so. Then list any concerns you have about sharing your data, and any barriers you feel will prevent you from sharing.

It is important that, as part of any data management plan, you set out your plans to share or not share, and the reasons why you have taken this decision. This will demonstrate transparency in your research and avoid misinterpretation of your choice.

 

Exercise 2

Review your concerns/barriers against the UKDA list (see file below), which highlights how these can be addressed.

File: UKDA Reasons not to Share exercise 2.pdf

Research Councils UK has recognised specifically the value of sharing publicly-funded research data through its Common Principles on Data.  Whilst strongly supporting the sharing of data, the 5th principle additionally recognises the right of the researcher to retain exclusive access to the data for a period of time following its creation to support publication of research findings – recognition itself of the development effort.

2.1 The Impact of sharing data

The Archaeology Data Service (ADS) has conducted research into three data centres to determine the impact that sharing data can have on the person sharing the data, the person using it, and the discipline in general. The analysis looked at the Economic and Social Data Service (ESDS), the ADS itself, and the British Atmospheric Data Centre (BADC). Not all of the findings apply to historians, but they are nonetheless worth thinking about.

Quantitative economic analysis:

1. The value to users exceeds the investment made in data sharing and curation via the centres in all three cases.

2. Very significant increases in work efficiency are realised by users as a result of their use of the data centres.

3. By facilitating additional use, the data centres significantly increase the returns on investment in the creation/collection of the data hosted.

Qualitative analysis:

1. Academic users report that the centres are very or extremely important for their research and that there would be a major or severe impact on their work if they could not access them.

2. For depositors, having the data preserved for the long-term and its dissemination being targeted to the academic community are seen as the most beneficial aspects of depositing data with the centres.

For the full report see here: N. Beagrie and J.W. Houghton, The Value and Impact of Data Sharing and Curation: A synthesis of three recent studies of UK research data centres, Jisc (2014). 

3. How to share data

On the face of it, sharing data can be as simple as making your database or spreadsheet available via a convenient website. Whilst this may achieve the goal of making the data accessible, it may not necessarily be in the best interests of the research overall.

In considering the shareability of your work, the nature of the 'data' and its condition should be taken into account. In this video Dr Julian Haseldine (University of Hull) discusses his database, which is constantly in flux, and what he would do if he reached a point where he thought it useful to share it.

 

There are three areas that merit attention when planning to share data.

  • Issues to consider prior to sharing data
  • Where to share the data from
  • Who can assist with sharing data

Each of these will be addressed in turn.

 

Issues to consider prior to sharing data

  • Will all the data be shared, or a specific subset?
  • Is the data stored in a sustainable way?  Websites frequently get taken down for one reason or another; could the data be better stored elsewhere?
  • Are there ethical or rights issues that need to be addressed prior to the data being shared?
  • Is it in a format that others can easily make use of, both now and in the future?  If the data is in a proprietary format, are the reasons for using this made clear?
  • Is there any documentation or metadata that explains the data to other researchers to facilitate their interpretation?  Are contact details made clear so others can raise queries about the data?

Each of these issues is addressed in sections of the previous module.  If you are unsure of how to address any of them, re-visit these sections for further information.
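
As a minimal sketch of the format and documentation points in the list above, the following Python example writes a dataset to CSV (an open, non-proprietary format) and stores a short JSON description, including contact details, alongside it. The function name, file names, and metadata keys are all hypothetical and chosen only for illustration; a repository may well require its own metadata schema.

```python
import csv
import json
from pathlib import Path


def export_for_sharing(rows, fieldnames, description, out_dir):
    """Write rows to data.csv and a describing metadata.json in out_dir.

    All names used here (data.csv, metadata.json, the metadata keys)
    are illustrative assumptions, not a standard.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    # CSV is plain text, so it can be opened without proprietary software.
    data_path = out / "data.csv"
    with data_path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)

    # The description tells other researchers what the data is,
    # what each column means, and whom to contact with queries.
    metadata = {
        "title": description["title"],
        "creator": description["creator"],
        "contact": description["contact"],
        "columns": description["columns"],
        "record_count": len(rows),
        "data_file": data_path.name,
    }
    meta_path = out / "metadata.json"
    meta_path.write_text(json.dumps(metadata, indent=2), encoding="utf-8")
    return data_path, meta_path
```

Keeping the description in a separate plain-text file means it can travel with the data wherever the data is deposited.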

Managing data has implicit costs associated with it (e.g., IT, staff time), and sharing data can continue this beyond the period of the research.  How will these costs be managed?

Even if you are unsure of the precise costs involved, it is important to consider what actions will be required to enable the data to be shared, and discuss these with the relevant support services so they are aware of the requirements and can plan accordingly.  Many costs can be built into regular running costs in this way, and the process will highlight any specific additional costs that may be incurred.

When sharing data it will be of benefit to be able to cite the data, both to make it easily accessible and to provide an identifiable link to the data. Such a citation requires that there be a persistent link to the data using an identifier or a link to a website describing the data. This identifier can be part of the metadata for the data, and may be assigned when using a repository to store the data for sharing.

 

Examples:

Peter Fransen, Transport of Goods via Harbours and Railway in Funen, 1865-1920, Dansk Data Arkiv, 1996. 1 data file: DDA-2574, version: 1.0.0.

N. Klaer ed. 'South East Australian Trawl Records, 1937-1943' in M.G Barnard & J.H Nicholls (comp.) HMAP Data Pages 

In the first example, the DOI (or digital object identifier) is assigned by a national agency on behalf of DataCite.  In the UK, this is the British Library.   This identifier can be issued through your institution if they are registered for the service, or may be assigned by a subject repository as part of their deposit process.
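
To illustrate how an identifier held in the metadata can be turned into a citation with a persistent link, here is a minimal Python sketch. The record, the DOI "10.1234/example", and the citation layout are hypothetical and for illustration only; the layout loosely follows the creator, title, publisher, year pattern of the first example above and is not a prescribed style. The only external convention assumed is that a DOI can be resolved by prefixing it with https://doi.org/.

```python
from dataclasses import dataclass


@dataclass
class DatasetRecord:
    """Minimal descriptive metadata for a shared dataset (illustrative)."""
    creator: str
    title: str
    publisher: str
    year: int
    identifier: str  # a DOI, e.g. the hypothetical "10.1234/example"


def doi_url(doi: str) -> str:
    """Return the resolvable link for a DOI via the doi.org resolver."""
    return "https://doi.org/" + doi.strip()


def format_citation(record: DatasetRecord) -> str:
    """Build a citation string: creator, title, publisher, year, link."""
    return (
        f"{record.creator}, {record.title}, {record.publisher}, "
        f"{record.year}. {doi_url(record.identifier)}"
    )


# Hypothetical record, not a real dataset.
example = DatasetRecord(
    creator="A. Researcher",
    title="Example Historical Dataset",
    publisher="Example Data Archive",
    year=2020,
    identifier="10.1234/example",
)
print(format_citation(example))
```

Because the link is built from the identifier and not from the address of a particular website, the citation stays valid even if the data is moved, provided the identifier's registration is kept up to date.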

 

Exercise

Consider you are starting a research project from scratch, and find some existing data that could help you.  What characteristics would you like this data to have to support your own use of it?

Licensing data is addressed in further detail in a later section of this module.

 

REMEMBER

When sharing data there are two roles involved: the one sharing the data and the one using the data. Whichever role you are in, always consider the perspective of the other.

4. Where to share data

When choosing a potential location for sharing data, the following criteria can be applied:

  • Will the data be stored securely and sustainably?
  • Will the data be accessible by other history researchers?
  • Will the data be readily accessible via the Internet?
  • Will the data be identifiable using a persistent link?
  • Will the data be usable as supplementary material for associated publications?

The following locations can be used to share data.  

  • Deposit in an institutional repository
    • Many institutions have digital repositories, and many of these are being used to hold and share research data
  • Deposit in a specialist data centre or archive
    • The UK Data Archive has an established reputation for managing history datasets
  • Submitting to a journal to support a publication
    • Many journal publishers are now providing the scope to share data associated with publications
  • Dissemination via a project or institutional website
    • As mentioned at the start of this section, a simple option
  • Informal peer-to-peer exchange
    • An option commonly used in many areas of research

These options can be compared as follows:

 

Location                  Secure and     Accessible     Accessible   Identifiable  Supplementary
                          sustainable    to history     to the       via link      material use
                          storage        researchers    Internet

Institutional repository  ✓✓✓            ✓✓             ✓✓✓          ✓✓✓           ✓✓✓

Subject archive           ✓✓✓            ✓✓✓            ✓✓           ✓✓✓           ✓✓✓

Journal                   ✓✓             ✓✓✓            ✓✓✓          ✓✓            ✓✓✓

Website

✓✓

✓✓

Peer-to-peer

✓✓

✓✓

 

 

5. Who can assist with sharing data?

Sharing data is not, and should not be, the sole responsibility of the researcher, as others can help with many of the issues related to sharing data.

In the list below, pair the roles within an institution with the activities they can assist with when sharing data.  NB.  It is recognised that not all these roles may exist in all institutions in this form, and the list is intended as indicative, not prescriptive.

  

Role                                     Area of activity

Principal investigator                   Specifies the purpose and scope of sharing data from research
Research assistants                      Assist with organising and structuring the data so it is suitable for sharing
Research Office                          Guidance on sharing data overall
Research ethics committee                Guidance on ensuring data is suitable for sharing
Legal Office                             Guidance on appropriate licensing options
IT staff                                 Assist with storage, security and back-up of data
Library staff                            Assist with metadata and documentation
Repository staff (internal or external)  Assist with organisation and presentation of data for sharing, plus file format use

 

Exercise

List the support services you have at your institution that may be able to assist with sharing data. 

Follow-up: For History researchers, the UK Data Archive offers further advice and guidance on sharing data, and a trusted location where data can be archived and shared. See the UK Data Archive for more information.