FAIR Data Point

What is a FAIR Data Point

A FAIR Data Point (sometimes abbreviated to FDP) realises a vision shared by a group of authors of the original FAIR paper: that (meta)data can be presented on the web using existing standards, without the need for APIs.

A FAIR Data Point ultimately stores information about data sets, which is the definition of metadata. Just as the web server brought the power of publishing text to anyone in the early 1990s, a FAIR Data Point aims to give anyone the power to put their own data on the web.

The system is called a FAIR Data Point because it handles many of the issues that must be addressed to make data FAIR, in particular the metadata needed for Findability and Reusability, and a uniform, open way of Accessing the data. The FAIR Data Point also addresses the Interoperability of the metadata it stores, but it leaves the Interoperability of the data itself to the data provider.

Components

The FAIR Data Point as we have implemented it has three components.

The first and most important component is the FAIR Data Point “API” specification. It is based entirely on semantic metadata standards (mostly the Data Catalog Vocabulary (DCAT) and Dublin Core, combined with the Linked Data Platform (LDP)) and on the REST philosophy. This combination targets the highest possible technical interoperability with a relatively simple implementation. The FAIR Data Point protocol comes with a tool that can be used to test the compliance of an implementation.
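To make the role of DCAT and Dublin Core more concrete, the following minimal sketch builds a Turtle description of a catalogue that contains one data set. The base URL, the path layout and the function name are assumptions made for illustration; they are not taken from the FAIR Data Point specification, which should be consulted for the properties it actually requires.

```python
# Hypothetical base URL of a FAIR Data Point (an assumption for this sketch).
BASE = "https://fdp.example.org"


def catalog_turtle(catalog_id: str, catalog_title: str,
                   dataset_id: str, dataset_title: str) -> str:
    """Return a Turtle document describing a catalogue with one data set.

    Only standard DCAT and Dublin Core terms are used. Titles are inserted
    as-is, so they must not contain double quotes in this simple sketch.
    """
    catalog_uri = f"{BASE}/catalog/{catalog_id}"   # hypothetical path layout
    dataset_uri = f"{BASE}/dataset/{dataset_id}"   # hypothetical path layout
    lines = [
        "@prefix dcat: <http://www.w3.org/ns/dcat#> .",
        "@prefix dcterms: <http://purl.org/dc/terms/> .",
        "",
        f"<{catalog_uri}> a dcat:Catalog ;",
        f'    dcterms:title "{catalog_title}"@en ;',
        f"    dcat:dataset <{dataset_uri}> .",
        "",
        f"<{dataset_uri}> a dcat:Dataset ;",
        f'    dcterms:title "{dataset_title}"@en .',
    ]
    return "\n".join(lines) + "\n"


print(catalog_turtle("c1", "Example catalogue", "d1", "Example data set"))
```

Because the description is plain RDF, any client that understands these vocabularies can read it without knowing anything specific about the server that published it.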

The second component is a reference implementation of the metadata registration service: a service implementing the API specification. It contains an authentication system to allow maintainers to define and update metadata. Read-only access to the data is public.

The third component is a client of the API: a web front end that can be used to add and edit the information in the metadata registration service, or to query it. The editor generates forms using Data Shapes (DASH) and contains a simple validation of data types based on the Shapes Constraint Language (SHACL). It facilitates the creation of metadata profiles as well as filling them in, but this is meant to support the FAIR Data Point, not to replace complete metadata profile development systems like CEDAR.
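The kind of check such a validation performs can be illustrated in plain Python. The sketch below is not SHACL and does not reflect how the editor is implemented; it is a simplified stand-in that tests two things a shape typically expresses, namely whether a required value is present and whether a value has the expected data type. The profile, its property names and the function name are hypothetical.

```python
from datetime import date

# Hypothetical metadata profile: property name -> (expected type, required?).
PROFILE = {
    "title": (str, True),
    "issued": (date, False),
    "version": (str, True),
}


def validate(record: dict, profile: dict = PROFILE) -> list:
    """Return a list of human-readable problems found in a metadata record."""
    errors = []
    for name, (expected, required) in profile.items():
        if name not in record:
            # A missing value is only a problem when the property is required.
            if required:
                errors.append(f"{name}: missing required value")
            continue
        value = record[name]
        if not isinstance(value, expected):
            errors.append(
                f"{name}: expected {expected.__name__}, "
                f"got {type(value).__name__}"
            )
    return errors


print(validate({"title": "Example data set", "version": "1.0"}))
print(validate({"title": 5}))
```

A real profile would be written as SHACL shapes and evaluated by a SHACL engine; the point here is only that form input can be checked against a declarative description before it is stored.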

The priorities for the development of all three components together are set using input from an Advisory Board.

Examples of use

Several projects are under way that will use FAIR Data Points to make data sets known to other researchers. Some examples:

The VODAN project installs FAIR Data Points in several different (initially African) countries, and uses these to collect information on COVID-19 patients.
The FAIR Data Train uses FAIR Data Points in FAIR Data Stations as the metadata provisioning components. This is a generalisation of the concepts first built into the Personal Health Train for distributed analysis of data.
Dutch academic hospitals will be implementing FAIR Data Points to collect COVID-19 data too, with the primary aim of reducing the maintenance burden of several COVID-19 data portals.
Health-RI is implementing their health data catalogue using FAIR Data Points.
The Genomic Data Infrastructure (GDI) project is exchanging metadata with a central catalogue using FDP protocols.

Networking FAIR Data Points together

The FAIR Data Point protocol contains a component to notify a client of updates to its data. This ping system allows for the creation of networks of FAIR Data Points that can be queried as a single unit. You can see a first instance of this at work in the FAIR Data Point HOME Server. This functionality will be extended further.
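As an illustration of what such a ping could look like on the wire, the sketch below prepares an HTTP POST request in which a FAIR Data Point announces its own address to a listening service. The receiving URL, the JSON field name `clientUrl` and the function name are assumptions made for this example; the protocol specification defines the actual message format. The request is only constructed here, not sent.

```python
import json
import urllib.request


def build_ping(index_url: str, client_url: str) -> urllib.request.Request:
    """Prepare a POST request announcing `client_url` to `index_url`.

    The payload shape ({"clientUrl": ...}) is an assumption for this sketch.
    """
    body = json.dumps({"clientUrl": client_url}).encode("utf-8")
    return urllib.request.Request(
        index_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical addresses of the receiving service and of the FAIR Data Point.
request = build_ping("https://index.example.org/", "https://fdp.example.org")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the prepared request to `urllib.request.urlopen` would send it. A receiver that collects such pings knows which FAIR Data Points exist and when they last reported in, which is what makes it possible to query them as one network.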

Protocol implementations

There are several FAIR Data Point implementations in existence:

The reference implementation as described above.
The MOLGENIS software supports the FAIR Data Point protocol.
Castor currently supports a previous version.
The Netherlands eScience Center have their own implementation.
The SURF Data Repository implements the protocol.
RD-Nexus implements the protocol.
LOVD implements the protocol.

There are also different implementations of the metadata harvesting protocol:

The reference implementation as described above.
A plugin for CKAN was created to harvest FAIR Data Points.