Lesson plan 13: Data access

FAIR elements

Findable

The data access category should not influence the findability of data: all data should be findable irrespective of their access level. The key requirement is that the metadata are openly accessible, so that the data can be discovered.

F2. Data are described with rich metadata (defined by R1 below)

Accessible

Irrespective of the data access category selected, the metadata should state clearly how the data can be accessed, and the protocol used should be open, free and universally implementable. If data access is restricted, an authentication and authorisation procedure can be used.

A1. (Meta)data are retrievable by their identifier using a standardised communications protocol

A1.1. The protocol is open, free, and universally implementable

A1.2. The protocol allows for an authentication and authorisation procedure, where necessary

A2. Metadata are accessible, even when the data are no longer available
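As an illustration of A1 to A1.2, the sketch below builds (but does not send) two HTTP requests with Python's standard library. It assumes the identifier is a DOI resolved through doi.org with content negotiation, and that a restricted repository accepts a bearer token. The DOI is the fictional example used later in this lesson plan, and the repository URL and token are hypothetical.

```python
from urllib.request import Request

# Media type commonly used to request citation metadata for a DOI as JSON.
CSL_JSON = "application/vnd.citationstyles.csl+json"


def metadata_request(doi):
    """Request for the metadata of a DOI over plain HTTP(S) (A1, A1.1)."""
    return Request(f"https://doi.org/{doi}", headers={"Accept": CSL_JSON})


def data_request(url, token=None):
    """Request for the data files; a token is added only when access is
    restricted (A1.2). The bearer-token scheme is an assumption here."""
    headers = {}
    if token is not None:
        headers["Authorization"] = f"Bearer {token}"
    return Request(url, headers=headers)


# Fictional DOI from the example statements; it is not expected to resolve.
meta = metadata_request("10.15000/a789457")
print(meta.full_url, meta.get_header("Accept"))

# Hypothetical repository URL and token, for illustration only.
restricted = data_request("https://repository.example.org/files/data.csv", "my-token")
print(restricted.get_header("Authorization"))
```

The metadata request needs no credentials, which reflects A2 and the Findable principle: the description of a dataset stays openly retrievable even when the data themselves are restricted.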

Interoperable

Open data are easier to use as linked data in an interoperable way, especially if they are available through an API. However, interoperability may also require key identifiers to link separate datasets. If these identifiers can identify individual people, e.g. the point coordinates of a house or the social security number of a person, then access restrictions will be needed before such data can be linked.

I3. (Meta)data include qualified references to other (meta)data

Primary audience(s): Bachelor's, master's and PhD students

Learning outcomes:

  • Can state general requirements on data protection and access control

  • Understands the different access options that exist for data/digital resources

  • Understands the criteria that influence/define access conditions

  • Can apply strategies to decide which access level is suitable for their data

  • Can implement (alternative) research practices to achieve more open data

  • Recognises how access is important to make data FAIR (all 4 letters)

Summary of tasks/actions:

  1. Introduce your audience to the different access options that exist. Research data can be made available in data centres, in data repositories, via an API, or on the web, with a range of access options. While open access to data may be ideal, there can be genuine reasons why that is not possible. Data access categories (24, 25) can be:

    • Open access

    • Restricted access

    • Embargo

    • Closed access

    Open data can be defined as 'data that can be freely used, re-used and redistributed by anyone – subject only, at most, to the requirement to attribute and sharealike' (26). Restricted access can require a contractual use agreement or data sharing agreement to be signed before the data are released. Embargo means that access is closed temporarily and opens on a stated date. Closed access means that data are not accessible, except possibly to regulators.

  2. Explain the criteria that can influence access decisions (27):

    • Presence of personal information in the dataset which can be used to identify an individual

    • Sensitivity of information, where the release of the data can adversely affect

      • a person, e.g. information on political views, criminal activities;

      • biodiversity, e.g. the location of rare and endangered species;

      • a community, e.g. information that could be exploited for terrorism; and/or

      • commercial interests of a company.

    • Intellectual property, where early release of the data can adversely affect patents or valorisation routes

    • Confidentiality agreement, where access to and sharing of data is restricted to the contracting parties.

  3. Show how a suitable access level can be decided, for example, using a decision tree. Example: Data Sharing guidelines – WUR

  4. Explain that alternative research practices, or adaptations to research practices, could be used to enable more open data. Examples include the following:

    • Capture data in an anonymous way

    • Anonymise information in a dataset so individuals (people, animals, etc.) cannot be identified from the information they have contributed during the research

    • Gain permission from people to make data open, even if the data contain personal or sensitive information (informed consent)

    • Use citizen science and participatory research methods to co-create data that are then co-owned and can be released as open data
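The access categories from task 1, the criteria from task 2 and the anonymisation practice from task 4 can be combined in a small Python sketch. All names below (Dataset, decide_access, anonymise, the identifier field names) are hypothetical, and the order in which the criteria are checked is an assumption made for illustration; it is not the WUR decision tree or any other official guideline.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class Access(Enum):
    OPEN = "open access"
    RESTRICTED = "restricted access"
    EMBARGO = "embargo"
    CLOSED = "closed access"


# Hypothetical field names treated as direct identifiers in this sketch.
DIRECT_IDENTIFIERS = {"name", "social_security_number"}


@dataclass
class Dataset:
    records: list
    informed_consent: bool = False           # participants agreed to open release
    sensitive: bool = False                  # e.g. locations of endangered species
    confidentiality_agreement: bool = False  # sharing limited to contracting parties
    embargo_until: Optional[date] = None     # e.g. while a patent is pending


def has_personal_data(records):
    """True if any record still carries a direct identifier."""
    return any(record.keys() & DIRECT_IDENTIFIERS for record in records)


def anonymise(records):
    """Drop direct identifiers and coarsen exact age into ten-year bands."""
    cleaned = []
    for record in records:
        kept = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
        if "age" in kept:
            low = kept.pop("age") // 10 * 10
            kept["age_band"] = f"{low}-{low + 9}"
        cleaned.append(kept)
    return cleaned


def decide_access(dataset, today):
    """Illustrative ordering of the criteria; not an official decision tree."""
    if dataset.confidentiality_agreement:
        # Assumption: the agreement forbids sharing beyond the contracting parties.
        return Access.CLOSED
    if dataset.sensitive:
        return Access.RESTRICTED
    if has_personal_data(dataset.records) and not dataset.informed_consent:
        return Access.RESTRICTED
    if dataset.embargo_until is not None and today < dataset.embargo_until:
        return Access.EMBARGO
    return Access.OPEN


raw = [{"name": "A. Example", "age": 34, "answer": "yes"}]
today = date(2019, 1, 1)
print(decide_access(Dataset(records=raw), today).value)             # restricted access
print(decide_access(Dataset(records=anonymise(raw)), today).value)  # open access
```

The two printed results show the point of task 4: the same study moves from restricted to open access once identifiers are removed. Dropping direct identifiers and banding ages is only a first step, because combinations of the remaining fields can still identify a person, so a real dataset needs a proper disclosure risk assessment.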

Materials/Equipment

  • Computer/laptop

  • Internet/browser

References

Take-home tasks

Do one of these exercises on data access:


Lesson plan 13: Additional material – data availability statements

The list below provides some example data availability statements. Please note that data availability statements should be tailored to suit each publication, and checked against all funder and publisher requirements.

Statement type: Example statement

  • Openly available data: "All data underpinning this publication are openly available from the University of FAIR-Data Repository at http://doi.org/10.15000/a789457"

  • Embargoed data: "All data underpinning this publication will be available from the University of FAIR-Data Repository at http://doi.org/10.15002/a1234a56 from 01/02/2019 onwards, following the cessation of an embargo period."

  • Restricted data: "Due to ethical/commercial issues, data underpinning this publication cannot be made openly available. Further information about the data and conditions for access are available from the University of FAIR-Data Repository at http://doi.org/10.15000/a1234b56"

  • Partially restricted data: "Due to the sensitive nature of this research, only a subset of the participants consented to their anonymised data being retained and shared. Anonymised interview transcripts and survey results from participants who provided consent, other supporting data, and further details relating to the restricted data, are available from the University of FAIR-Data Repository at http://doi.org/10.15129/a1234b56"

  • Physical data: "Physical data supporting this publication are stored by the University of FAIR-Data. Details of the data and how they can be accessed are available from the University of FAIR-Data Repository at http://doi.org/10.15129/a1234b56"

  • Secondary data: "Pre-existing data underpinning this publication are openly available from UKDS at http://doi.org/10.12345/54321. Further information about data processing, and additional new supporting data, are available from the University of FAIR-Data Repository at http://doi.org/10.15129/a1234b56"

  • No new data created: "No new data were created during this study. Pre-existing data underpinning this publication were obtained from NPL and are subject to licence restrictions. Full details on how these data were obtained are given in the documentation available from the University of FAIR-Data Repository at http://doi.org/10.15129/a1234b56"

  • No data: "This work is entirely theoretical; there are no data underpinning this publication."
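The first three example statements follow a common pattern, which the Python sketch below uses to produce a first draft. The function name and parameters are hypothetical, the repository name is the fictional one from the examples, and the date is assumed to be written as day/month/year. Any generated text is only a starting point and still has to be tailored to the publication and checked against funder and publisher requirements.

```python
from datetime import date

REPOSITORY = "University of FAIR-Data Repository"  # fictional, from the examples


def availability_statement(doi_url, embargo_until=None, restricted_reason=None):
    """Draft a statement for open, embargoed or restricted data."""
    if restricted_reason is not None:
        return (
            f"Due to {restricted_reason}, data underpinning this publication "
            "cannot be made openly available. Further information about the "
            "data and conditions for access are available from the "
            f"{REPOSITORY} at {doi_url}"
        )
    if embargo_until is not None:
        return (
            "All data underpinning this publication will be available from "
            f"the {REPOSITORY} at {doi_url} from {embargo_until:%d/%m/%Y} "
            "onwards, following the cessation of an embargo period."
        )
    return (
        "All data underpinning this publication are openly available from "
        f"the {REPOSITORY} at {doi_url}"
    )


print(availability_statement("http://doi.org/10.15000/a789457"))
print(availability_statement("http://doi.org/10.15002/a1234a56",
                             embargo_until=date(2019, 2, 1)))
```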


(24) https://data.blogs.bristol.ac.uk/bootcampsd/repositories/

(25) https://www.cessda.eu/Training/Training-Resources/Library/Data-Management-Expert-Guide/6.-Archive-Publish/Publishing-with-CESSDA-archives/Access-categories

(26) https://opendatahandbook.org/guide/en/what-is-open-data/

(27) https://data.blogs.bristol.ac.uk/bootcampSD/what-counts/

