Realization of FAIR Digital Objects
History of FAIR Digital Objects
The idea behind the FAIR Digital Object concept is based on a paper by Robert Kahn and Robert Wilensky, "A framework for distributed digital object services" [1], published in 2006. The paper presents the concept of an open infrastructure of repositories that store and provide access to digital objects. Each digital object is identified by a globally unique identifier and connected to metadata, and an invariant part of that metadata is handled as key-metadata.
With the foundation of the Research Data Alliance (RDA) in 2013, interest in this idea was reawakened and led to the formation of different interest and working groups, which worked collaboratively on refining the concept and sought consensus on different aspects of the FAIR Digital Object concept. In the following years, several recommendations were published by RDA Working Groups, e.g., on Data Types and Data Type Registries [2], on PID Information Types [3], and on PID Kernel Information [4].
Simultaneously, the Data Fabric Interest Group continuously reconciled the produced outputs against the final vision. In 2017, the Robust PID Testbed (RPID) [5] was started with three years of funding to implement the FAIR Digital Object concept as a whole for the first time and to prove its applicability to different use cases. As a result, two main conclusions could be drawn:
The FAIR Digital Object concept is ready for implementation.
Future implementations should happen on a long-term basis to ensure broad uptake and sustainability of results.
FAIR Digital Objects in HMC
With the start of HMC in 2019, such a long-term basis for implementing FAIR Digital Objects arose. Because it addresses six different research areas and has the potential to be implemented as a long-term platform, HMC is an ideal environment to lift FAIR Digital Objects to a new level.
Based on the tremendous work done by international research data management experts, and backed by the international consensus found on different aspects of FAIR Digital Objects, HMC could build on a strong foundation. However, establishing FAIR Digital Objects in a sustainable way required, and still requires, a lot of work in different directions.
As of early 2023, mandatory base services, e.g., the Typed PID Maker based on the PIT Service developed in the scope of the RPID testbed, have been put on a sustainable basis. Basic workflows and guidelines have been developed and described, e.g., in the FAIR DO Cookbook. Work has also started on supporting tools that make FAIR Digital Objects tangible for different stakeholders, e.g., FAIR-DOscope.
A major breakthrough was achieved by finding consensus among all HMC Hubs on a basic Kernel Information Profile for all Digital Objects created within the Helmholtz Association. The Helmholtz Kernel Information Profile has already proven its applicability in a demonstrator that represents arbitrary Zenodo records as FAIR Digital Objects by applying a semi-automatic mapping of existing metadata. Based on this profile, HMC can now start to bring FAIR Digital Objects into practice and apply them to a growing number of existing and new (digital) objects. This opens up new possibilities to interact with them in novel, machine-supported ways and at a larger scale than ever before.
Anatomy of a FAIR Digital Object
A FAIR Digital Object is identified by a Persistent Identifier (PID) resolving to a machine-readable PID Information Record which contains kernel metadata, i.e., metadata that can be used by machines to make decisions on relevance or interpretability of the contents represented by a particular FAIR Digital Object.
Kernel metadata is assumed to be stored as key-value pairs in the PID Information Record, where each key is a PID referring to the machine-readable type definition for the associated value. Which kernel metadata entries are supported for a certain FAIR Digital Object, as well as their types, is defined by a Kernel Information Profile (KIP). A KIP is a special kind of type that is also identified by a PID, which refers to a machine-readable representation of the KIP.
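The relationship between a PID Information Record and its KIP can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the API of the Typed PID Maker or any other service: all PIDs with the `example/` prefix, the class names, and the `validate_record` helper are hypothetical, and the split into mandatory and optional attributes is assumed here rather than taken from the text above.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class KernelInformationProfile:
    """A KIP: identified by its own PID, it defines which typed keys a record may hold."""
    pid: str
    mandatory: frozenset              # assumption: type PIDs that must be present
    optional: frozenset = frozenset()  # assumption: type PIDs that may be present


@dataclass
class PidRecord:
    """A PID Information Record: key-value pairs whose keys are type PIDs."""
    pid: str
    profile_pid: str
    entries: dict = field(default_factory=dict)  # type PID -> value


def validate_record(record, profile):
    """Return a list of problems; an empty list means the record conforms."""
    problems = []
    if record.profile_pid != profile.pid:
        problems.append(
            f"record declares profile {record.profile_pid}, expected {profile.pid}"
        )
    keys = set(record.entries)
    for missing in sorted(profile.mandatory - keys):
        problems.append(f"missing mandatory attribute {missing}")
    for unknown in sorted(keys - profile.mandatory - profile.optional):
        problems.append(f"attribute {unknown} is not defined by the profile")
    return problems


# Hypothetical type PIDs; real ones would resolve to machine-readable type definitions.
TYPE_LOCATION = "example/type-location"
TYPE_DATE_CREATED = "example/type-date-created"
TYPE_LICENSE = "example/type-license"

PROFILE = KernelInformationProfile(
    pid="example/kip",
    mandatory=frozenset({TYPE_LOCATION, TYPE_DATE_CREATED}),
    optional=frozenset({TYPE_LICENSE}),
)

record = PidRecord(
    pid="example/object-1",
    profile_pid=PROFILE.pid,
    entries={
        TYPE_LOCATION: "https://example.org/data/object-1",
        TYPE_DATE_CREATED: "2023-01-15",
    },
)

print(validate_record(record, PROFILE))
```

Because every key is itself a PID, a client that does not know an attribute in advance can, in principle, resolve the key to find out how to interpret the value. The sketch leaves that resolution step out.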
Following these basic principles, the Helmholtz Kernel Information Profile presented in the following figure has been agreed on within HMC.
The kernel information that is depicted here is only a subset of all kernel information available for the Helmholtz Kernel Information Profile. In addition, the figure shows Domain-Extensions, which may extend the base KIP by additional domain-specific properties allowing machine-actionability on a different level.
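One way to picture a domain extension is as a set of additional typed attributes layered on top of the base profile. The following sketch assumes exactly that and nothing more: how extensions are actually modeled is not specified here, and the `example/` type PIDs and both helper functions are hypothetical.

```python
def extend_profile(base_types, extension_types):
    """Combine base and domain-specific attribute types into one extended profile."""
    overlap = base_types & extension_types
    if overlap:
        # An extension adds properties; it should not redefine base ones.
        raise ValueError(f"extension redefines base attributes: {sorted(overlap)}")
    return base_types | extension_types


def split_entries(entries, base_types):
    """Partition record entries into the base part and the domain-specific part."""
    base = {key: value for key, value in entries.items() if key in base_types}
    domain = {key: value for key, value in entries.items() if key not in base_types}
    return base, domain


# Hypothetical type PIDs for illustration only.
BASE_TYPES = frozenset({"example/type-location", "example/type-date-created"})
DOMAIN_TYPES = frozenset({"example/type-instrument"})

extended = extend_profile(BASE_TYPES, DOMAIN_TYPES)

entries = {
    "example/type-location": "https://example.org/data/scan-1",
    "example/type-date-created": "2023-01-15",
    "example/type-instrument": "X-ray CT scanner",
}
base_part, domain_part = split_entries(entries, BASE_TYPES)
print(sorted(extended))
print(domain_part)
```

A generic client can work with `base_part` alone, while a domain-aware client can additionally act on `domain_part`.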
In recent years, besides the "traditional" approach of utilizing PIDs and machine-readable types, alternative ways of realizing FAIR Digital Objects based on Linked Data principles have also been discussed. Their advantage is that one can build on well-established Web technologies while using existing ontologies for the semantic description of digital content. However, these approaches always require adaptation of existing infrastructures, which makes broad adoption difficult. In contrast, the "traditional" approach can be applied in two ways: in a non-invasive way on top of existing infrastructures, referring to existing digital content, or in an integrated way, where the infrastructure fills PID Information Records on its own.
Sample FAIR Digital Objects
Below you will find a collection of existing FAIR Digital Objects created in different contexts. The table lists only the top-level FAIR Digital Objects, which refer to a larger number of related FAIR Digital Objects, e.g., holding additional metadata.
You may go through them and see their different representations, but you may also use them to evaluate how to interact with them using novel tools.
Title | Context | Source | FAIR-DO Links |
X-ray computed tomography dataset of a walnut: scan (Reconstruction, 5 Children) | Image reconstruction, X-ray CT | | |
X-ray computed tomography dataset of a walnut: scan (Acquisition, 5 Children) | Image acquisition, X-ray CT | | |
Thermal Bridges on Building Rooftops (6 Children) | Drone images | | |
NFFA-EUROPE - SEM Dataset (2 Children) | SEM images, machine learning | | |
FAIR Digital Object Demonstrators 2021 | Publication | | |