Lesson plan 6: Metadata
FAIR elements:
Findable
The first step in (re)using data is to find them. Metadata and data should be easy to find for both humans and computers. Machine-readable metadata are essential for automatic discovery of datasets and services, so this is an essential component of the FAIRification process.
F1. (Meta)data are assigned a globally unique and persistent identifier
F2. Data are described with rich metadata (defined by R1 below)
F3. Metadata clearly and explicitly include the identifier of the data they describe
F4. (Meta)data are registered or indexed in a searchable resource
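As a classroom illustration, F1 and F3 could be checked on a metadata record roughly as follows. This is a minimal sketch under stated assumptions: the record layout, the field names `identifier` and `describes`, the example DOIs, and the helper `findability_gaps` are all hypothetical and are not taken from any metadata standard.

```python
import re

# Hypothetical metadata record; field names and DOI values are illustrative only.
record = {
    "identifier": "https://doi.org/10.1234/example-metadata",  # PID of the metadata (F1)
    "describes": "https://doi.org/10.1234/example-data",       # PID of the data (F3)
    "title": "Example dataset",
}

# Loose pattern for a DOI written as an HTTPS URL (an assumption, not a full DOI validator).
DOI_URL = re.compile(r"^https://doi\.org/10\.\d{4,9}/\S+$")


def findability_gaps(rec):
    """Return a list of the F-principle checks that the record fails."""
    gaps = []
    if not DOI_URL.match(rec.get("identifier", "")):
        gaps.append("F1: metadata lack a persistent identifier")
    if not DOI_URL.match(rec.get("describes", "")):
        gaps.append("F3: metadata do not name the identifier of the data")
    return gaps


print(findability_gaps(record))            # complete record
print(findability_gaps({"title": "x"}))    # record with no identifiers
```

A check like this only tests the form of the identifiers; whether they are actually registered and resolvable (F4) has to be verified against the registry or repository itself.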
Accessible
Once users find the required data, they need to know how the data can be accessed, possibly including authentication and authorisation.
A1. (Meta)data are retrievable by their identifier using a standardised communications protocol
A1.1. The protocol is open, free, and universally implementable
A1.2. The protocol allows for an authentication and authorisation procedure, where necessary
A2. Metadata are accessible, even when the data are no longer available
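To make A1 concrete, the sketch below builds (but does not send) an HTTPS request that asks the DOI resolver for machine-readable metadata. HTTPS is one example of an open, free, and universally implementable protocol (A1.1). The DOI shown is illustrative and does not point to a real dataset, the helper `doi_request` is hypothetical, and the CSL JSON media type is assumed to be offered by the registration agency behind the DOI.

```python
from urllib.request import Request


def doi_request(doi, media_type="application/vnd.citationstyles.csl+json"):
    """Build an HTTP GET request asking the DOI resolver for metadata in the given format."""
    return Request(f"https://doi.org/{doi}", headers={"Accept": media_type})


# Illustrative DOI; replace it with the identifier of a real dataset before sending.
req = doi_request("10.1234/example-data")
print(req.get_method(), req.full_url)
print(req.get_header("Accept"))
```

Passing `req` to `urllib.request.urlopen` would perform the retrieval. Where access is restricted (A1.2), an `Authorization` header could be added to the same request in the same way as `Accept`.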
Interoperable
The data usually need to be integrated with other data. In addition, the data need to interoperate with applications or workflows for analysis, storage, and processing.
I2. (Meta)data use vocabularies that follow FAIR principles
I3. (Meta)data include qualified references to other (meta)data
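One way to illustrate I3 is to contrast a bare link with a reference that states how two resources are related. The sketch below borrows the property names `relatedIdentifier` and `relationType` and a few relation types from the DataCite Metadata Schema as an example vocabulary. The identifiers are illustrative, and the helper `is_qualified` is hypothetical.

```python
# A small subset of DataCite relation types, used here only as an example vocabulary.
RELATION_TYPES = {"IsDerivedFrom", "IsSupplementTo", "IsPartOf", "IsNewVersionOf", "References"}


def is_qualified(reference):
    """A reference is 'qualified' if it names both a target and a recognised relation type."""
    return bool(reference.get("relatedIdentifier")) and reference.get("relationType") in RELATION_TYPES


# Illustrative references; the DOIs do not point to real resources.
qualified = {"relatedIdentifier": "https://doi.org/10.1234/raw-data", "relationType": "IsDerivedFrom"}
unqualified = {"relatedIdentifier": "https://doi.org/10.1234/raw-data"}

print(is_qualified(qualified))
print(is_qualified(unqualified))
```

The second reference still links to the other dataset, but a machine cannot tell whether it is a source, a supplement, or an earlier version.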
Reusable
The ultimate goal of FAIR is to optimise the reuse of data. To achieve this, metadata and data should be well-described so that they can be replicated and/or combined in different settings.
R1. (Meta)data are richly described with a plurality of accurate and relevant attributes
R1.1. (Meta)data are released with a clear and accessible data usage license
R1.2. (Meta)data are associated with detailed provenance
R1.3. (Meta)data meet domain-relevant community standards
Primary audience(s): Bachelor's, master's, PhD degree students
Learning outcomes:
Can describe types of metadata
Can recognise metadata formats
Can identify metadata standards
Can use metadata standards to describe resources
Can explain what metadata registries are
Can search and find data and metadata standards in registries
Can articulate metadata of different types to describe a resource
Can write metadata in a relevant format
Can appraise the usefulness of metadata standards to describe a resource
Summary of tasks/actions:
Metadata are 'data about data'
Present and describe the different types of metadata (can present the whole list, or pick specific elements relevant to your audience).
Metadata are:
standardised
structured
machine- and human-readable
a subset of documentation
Documentation (descriptive and/or technical info)
Controlled vocabularies and ontologies
Persistent identifiers (PIDs)
Licences
Learn syntax of example metadata standards:
Dublin Core is a general standard that can describe any dataset at the project level; at the data level, discipline-specific standards apply, such as:
Data Documentation Initiative (DDI) – social science
Ecological Metadata Language (EML) – ecology
Flexible Image Transport System (FITS) – astronomy
Minimum information standards
Use metadata catalogues/registries and search for suitable standards
Metadata form the core of machine- and human-readable descriptions of data, be they technical information or annotations, and cover all aspects of the FAIR principles. Metadata is an umbrella term that includes file formats, ontologies, licences, and documentation in general. For each of the principles, metadata can be used at different granularities and levels of domain specificity; more general metadata add less value to the underlying data than domain-specific metadata.
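As an example of the Dublin Core syntax named in the summary of tasks, the following sketch writes a small project-level record as XML. It uses element names from the Dublin Core Metadata Element Set and its namespace `http://purl.org/dc/elements/1.1/`. The wrapper element `metadata`, the helper `dublin_core_record`, and all field values are hypothetical and chosen for illustration only.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields):
    """Wrap Dublin Core elements in a hypothetical <metadata> container element."""
    root = ET.Element("metadata")
    for name, value in fields.items():
        ET.SubElement(root, f"{{{DC_NS}}}{name}").text = value
    return root


# Illustrative values; the DOI does not point to a real dataset.
record = dublin_core_record({
    "title": "Example survey dataset",
    "creator": "Doe, Jane",
    "date": "2021-05-01",
    "identifier": "https://doi.org/10.1234/example-data",
    "rights": "https://creativecommons.org/licenses/by/4.0/",
})

print(ET.tostring(record, encoding="unicode"))
```

The same five fields are readable by a person and parseable by a program, which is the point of a structured, standardised format. A discipline-specific standard such as DDI or EML would add fields that Dublin Core does not cover.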
References:
Metadata for Machines workshops
General information: https://www.go-fair.org/how-to-go-fair/metadata-for-machines/
Example: Metadata for Machines workshops, including material. These were funded by the Dutch research foundation ZonMw in support of their COVID-19 research programme: https://osf.io/bhzf8/
FAIR Cookbook, recipes for hands-on FAIRifications in the Life Sciences.
FAIRsharing resource to discover (meta)data standards (and which repositories implement them)
Take-home tasks:
Create the metadata for a dataset
Search for standards in catalogues like:
How to create a metadata profile or template
Encode the data in a dataset using controlled vocabularies/ontologies
Jacob et al. Making experimental data tables in the life sciences more FAIR: a pragmatic approach. GigaScience, Volume 9, Issue 12, 2020.
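The take-home task on encoding data with controlled vocabularies could be approached along the following lines. This is a minimal sketch, not the method described by Jacob et al.: it maps free-text species labels in a hypothetical data table to NCBI Taxonomy term IRIs. The column names, the sample rows, and the helper `encode_species` are assumptions made for illustration.

```python
# Free-text labels mapped to NCBI Taxonomy terms (OBO Foundry IRIs).
SPECIES_TERMS = {
    "human": "http://purl.obolibrary.org/obo/NCBITaxon_9606",
    "homo sapiens": "http://purl.obolibrary.org/obo/NCBITaxon_9606",
    "mouse": "http://purl.obolibrary.org/obo/NCBITaxon_10090",
    "mus musculus": "http://purl.obolibrary.org/obo/NCBITaxon_10090",
}


def encode_species(rows):
    """Add a 'species_iri' column; labels without a known term map to None."""
    encoded = []
    for row in rows:
        label = row["species"].strip().lower()
        encoded.append({**row, "species_iri": SPECIES_TERMS.get(label)})
    return encoded


# Hypothetical data table with inconsistent free-text labels.
rows = [
    {"sample": "S1", "species": "Human"},
    {"sample": "S2", "species": " mus musculus"},
    {"sample": "S3", "species": "zebrafish"},
]

for row in encode_species(rows):
    print(row["sample"], row["species_iri"])
```

Rows that come back with `None` (here the zebrafish sample) are the ones that still need a term looked up in an ontology service or registry before the table can be considered fully encoded.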
Exercises: