Lesson plan 16: Data management and governance in industry and research
FAIR elements: All
Primary audience(s):
This lesson gives a concise overview of data management and governance (DMG) practices in research and industry. It is aimed at master's students and at professional audiences in vocational education and training, primarily those with a computer science or information science background.
Learning outcomes:
Understand the enterprise data management and governance process and main use cases according to the DAMA (Data Management Association) Data Management Body of Knowledge (DMBOK)
Understand the European data spaces concept and initiatives, and European policies and regulations, including the GDPR (General Data Protection Regulation)
Understand elements of the enterprise data management infrastructure and services: Data warehouses, cloud-based storage, data lakes
Understand data modelling processes, data models, data structures, and master data management
Understand FAIR principles in research data management and their applicability to industrial use cases
Understand data management maturity frameworks and best practices
Understand what a data management plan (DMP) is, and its purpose and benefits for a project or organisation
Apply the acquired knowledge in practice: create a DMP and assess organisational data security and compliance
Understand the key organisational roles in DMG: Chief Data Officer, Data Steward, Data Protection Officer and other roles
Delivery format:
This lesson can be delivered as lectures with practice sessions, as a tutorial, or as a self-paced self-study course.
Suggested time: 2 lecture sessions (1.5 hrs each) and 1 practice session (approx. 1.5 hrs).
Prerequisites:
Basic knowledge of computer software and applications.
Understanding of organisational processes (HR/staff, customers, products, shipments, orders, etc.) and data used or produced.
Basic understanding of SQL for the advanced course.
Lesson topics (Summary of tasks/actions):
The DMG course uses DAMA DMBOK as a general framework covering the majority of topics, extending them with data science and big data analytics platforms and enriching them with FAIR and industry best practices. The following main topics should be included in the course:
Introduction. Big data infrastructure and data management and governance. European data spaces: definitions, use cases. European policy on data governance, data protection, GDPR
Data management concepts. Data management frameworks: DAMA data management framework, the Amsterdam Information Model (AIM). Extensions for big data and data science
Enterprise data architecture. Data lifecycle management and service delivery model. Data management and data governance activities and roles
Data science professional profiles and organisational roles, skills management and capacity building
Data architecture, data modelling and design. Data types and data models. Metadata. SQL and NoSQL databases overview. Distributed systems: CAP theorem, ACID and BASE properties
Enterprise big data infrastructure and integration with enterprise IT infrastructure. Data warehouses. Distributed file systems and data storage
Big data storage and platforms. Cloud-based data storage services: data object storage, data blob storage, data lakes (services by AWS, Azure, GCP)
Trusted storage, blockchain-enabled data provenance
FAIR data principles and data stewardship, FAIR digital object and persistent identifier (PID)
Data repositories, Open Data services, public services
Data quality assessment. Data management maturity frameworks: DNV-GL data quality framework, DCC RISE, CMMI, etc.
Big data security and compliance. Data security and data protection. Security of outsourced data storage. Cloud security and compliance standards and cloud provider services assessment
Practice:
Hands-on practice including the following topics:
Data management plan design, templates and tools
Metadata and tools, metadata registries
Assessing an organisation's data security and compliance requirements
Advanced: Data modelling, relational data model creation
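As a possible starting point for the advanced practice topic, the sketch below builds a small relational model from the kinds of organisational entities named in the prerequisites (customers, products, orders). It is an illustration only: the table names, columns, and sample rows are hypothetical and are not part of the course materials. It uses Python's built-in sqlite3 module with an in-memory database.

```python
import sqlite3

# Hypothetical schema for illustration: an order belongs to one customer,
# and each order line links one order to one product.
SCHEMA = """
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE product (
    product_id  INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    unit_price  REAL NOT NULL CHECK (unit_price >= 0)
);
CREATE TABLE customer_order (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    ordered_on  TEXT NOT NULL
);
CREATE TABLE order_line (
    order_id    INTEGER NOT NULL REFERENCES customer_order(order_id),
    product_id  INTEGER NOT NULL REFERENCES product(product_id),
    quantity    INTEGER NOT NULL CHECK (quantity > 0),
    PRIMARY KEY (order_id, product_id)
);
"""


def build_demo_db():
    """Create an in-memory database with the schema and a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO customer VALUES (1, 'Acme Ltd')")
    conn.executemany(
        "INSERT INTO product VALUES (?, ?, ?)",
        [(1, "Widget", 2.5), (2, "Gadget", 10.0)],
    )
    conn.execute("INSERT INTO customer_order VALUES (1, 1, '2021-01-15')")
    conn.executemany(
        "INSERT INTO order_line VALUES (?, ?, ?)",
        [(1, 1, 4), (1, 2, 1)],
    )
    conn.commit()
    return conn


def order_totals(conn):
    """Return (order_id, customer name, order total) by joining all four tables."""
    query = """
        SELECT o.order_id, c.name, SUM(l.quantity * p.unit_price)
        FROM customer_order AS o
        JOIN customer   AS c ON c.customer_id = o.customer_id
        JOIN order_line AS l ON l.order_id = o.order_id
        JOIN product    AS p ON p.product_id = l.product_id
        GROUP BY o.order_id, c.name
        ORDER BY o.order_id
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    print(order_totals(build_demo_db()))
```

Learners could extend the sketch with further entities such as shipments or staff, and then discuss which attributes count as master data and which as transactional data.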
Materials/Equipment
Collection of DMP templates
Example metadata for research data and publications
Collection of links to enterprise data management and governance practices and recommendations
References
DAMA Data Management Body of Knowledge (DMBOK), DAMA International, 2017
GO FAIR Initiative [online] – https://www.go-fair.org/go-fair-initiative/
General Data Protection Regulation – https://eur-lex.europa.eu/eli/reg/2016/679/oj
DMP Templates – https://guides.lib.umich.edu/c.php?g=283277&p=2138498
Towards FAIR principles for research software – https://doi.org/10.3233/DS-190026
A European strategy for data COM(2020) 66 final, 19.02.2020 – https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX%3A52020DC0066
European Data Governance Act – https://ec.europa.eu/digital-single-market/en/european-data-governance
EU/Parliament Regulation on European data governance (Data Governance Act) SEC(2020) 405 final, Nov 2020 – https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX%3A52020PC0767
GAIA-X – A Federated Data Infrastructure for Europe – https://www.gaia-x.eu/
FAIR Cookbook, developed by Life Sciences academics and pharmas, 2021 – https://w3id.org/faircookbook
Take-home task
Organisational data management plan creation (using the provided template and/or online tools)
Last updated