The implementation case aims at fulfilling requirements for curation, cataloguing and provenance.
The targeted usages are:
The catalogue is used for discovery (finding items of interest), contextualisation (determining relevance and quality) and access (connecting users, datasets, software and resources to achieve the user's end objective).
Items described in catalogues include datasets, systems and resources for observation and processing, observation events and results (e.g. samples), documents, persons and research objects.
Provenance and curation functions rely on the catalogue as a back-end repository, for both input and output.
Provenance relates to contextualisation. It provides functions that write, update and read the catalogue, complementing discovery and access with services that determine the relevance and quality of the items described in catalogues.
Since provenance is well covered in other implementation cases (mostly IC_2, but also IC_6 and IC_9), the current implementation case will collaborate with them on requirements and fulfil them so as to demonstrate a couple of provenance functions: two functions related to dataset provenance, to be listed (Barbara).
Curation relates to the data management processes required to keep digital assets (datasets, software) available: media migration to ensure physical readability, redundant copies to ensure availability, appropriate security and privacy measures to ensure reliability, and appropriate catalogue content maintenance to ensure discovery, contextualisation and access to these digital assets.
The current implementation case fulfils requirements for a couple of curation functions: (a) automated media migration of datasets to ensure continued availability and readability; (b) discovery of a curated dataset along with the appropriate curated software and operating environment.
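Curation function (a) can be illustrated with a minimal sketch of a migration step that copies a dataset file to new media and checks that the copy is byte-identical. This is not the implementation case's tool: the helper names `sha256_of` and `migrate_dataset`, the choice of SHA-256 and the shape of the returned record are all assumptions made for illustration.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def migrate_dataset(source: Path, target_dir: Path) -> dict:
    """Copy one dataset file to new media and verify the copy (hypothetical helper)."""
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / source.name
    checksum_before = sha256_of(source)
    shutil.copy2(source, target)  # copy2 also preserves file timestamps
    checksum_after = sha256_of(target)
    if checksum_before != checksum_after:
        target.unlink()  # do not keep a corrupted copy
        raise IOError(f"fixity check failed for {source}")
    # Assumed record shape: something like this could be written back
    # to the catalogue as a curation event.
    return {"file": source.name, "sha256": checksum_after, "location": str(target)}
```

The returned record is the link to the catalogue: storing the checksum and the new location alongside the dataset description is one way the curation function could use the catalogue as output.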
| Background | Contact Person | Organization | Contact email |
|---|---|---|---|
| _<Choose one of the following roles: RI-ICT, RI-Domain, ICT, e-Infrastructure>_ | <Full name> | <Organization of the contact person> | <Email> |
| RI (Use Case proposer, Agile Group leader) | Thomas Loubrieu | IFREMER | |
| RI | Keith Jeffery | | |
| RI | Christian Pichot, Andre Chanzy | INRA (ANAEE) | |
| ICT | Marco Rorro, Giovanni Morelli (the persons who managed CKAN for EUDAT would be perfect here) | EUDAT | |
| RI | Damien Boulanger | IAGOS | |
| RI | Maggie Hellstrom | ICOS | |
| RI | Barbara Magagna, Johannes Peterseil | LTER | Barbara.magagna@umweltbundesamt.at, Johannes.peterseil@umweltbundesamt.at |
Implementation case
Conditions:
Implications:
To be relevant in the ENVRIPLUS context, the implemented functions must be validated by at least 2 RIs, preferably in 2 different spheres (bio, liquid, solid, gas):
The connected behaviours are:
Data acquisition community:
Data Service provision community:
Catalogue
The catalogue aims to provide functions that cut across RIs, to edit and discover the following items:
Action 1: Persons and documents will be described and federated in pre-existing e-infrastructures, to be defined (e.g. ORCID, …), so as to fulfil the requirements of the provenance and curation functions.
Action 2: Dataset descriptions will be federated by harvesting the dataset catalogue of each RI (in whatever 'standard' metadata format) into a single entry point, so as to fulfil the requirements of the provenance and curation functions. The metadata format of this entry point is to be defined, chosen among DC, DCAT, INSPIRE/ISO19115, GeoNetwork, CKAN and CERIF.
Action 3: Edition and discovery functions for observation systems, events and results (including collected samples) will be implemented by a combination of RI-specific tools and federated tools (e.g. for edition), so as to fulfil the requirements of the provenance and curation functions.
The main challenge is the involvement of the RIs, from the definition of the functions to the adoption of the solution.
In the context of the 3 above actions:
As in AGILE, the steps can be iterative, with a new iteration for each new requirement identified or RI supported.
E-infrastructures that manage catalogues of persons and documents already exist; they are available through standard interfaces and cut across RIs.
Catalogues of datasets are generally provided by the RIs and their content is available through standard interfaces. Some off-the-shelf tools and standards are available to implement the catalogue of datasets (DC, DCAT, INSPIRE/ISO19115, GeoNetwork, CKAN, CERIF). ENVRIPLUS needs to federate them by using the richest available 'standard' and providing mappings to the others.
Catalogues of observation systems, events or samples may exist in the RIs. They are seldom or never accessible through standard interfaces. Some RIs lack proper tools to manage this information, which is nevertheless critical for the quality and traceability of scientific results.
Documents and persons
E-infrastructures that manage catalogues of persons and documents already exist.
The implementation case will define a list of official, sustainable person and document repositories that RIs should use to describe their resources, and define mappings to and from the ENVRIPLUS catalogue metadata standard (once chosen).
Expected result in October 2016
Datasets
The implementation case will identify catalogues of datasets in the RIs and analyse their machine-to-machine interfaces for harvesting purposes. A single tool will harvest them centrally. Their metadata will then need to be converted from the local RI format to that of the ENVRIPLUS central catalogue, as described above.
Expected result in October 2017
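The conversion step in the dataset action can be sketched as a small crosswalk applied to one harvested record. The sketch below assumes, purely for illustration, that an RI exposes Dublin Core records in the `oai_dc` XML form and that the central catalogue uses DCAT/DCTERMS property names; the central format has not been chosen yet, and the sample record, the `DC_TO_DCAT` table and the `convert_record` helper are all hypothetical.

```python
import xml.etree.ElementTree as ET

DC_NS = "{http://purl.org/dc/elements/1.1/}"

# Assumed crosswalk from Dublin Core elements to DCAT/DCTERMS property names.
DC_TO_DCAT = {
    "title": "dct:title",
    "creator": "dct:creator",
    "identifier": "dct:identifier",
    "description": "dct:description",
    "subject": "dcat:keyword",
}

# Hypothetical harvested record, standing in for the response of an RI catalogue.
SAMPLE_RECORD = """
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Example temperature profiles</dc:title>
  <dc:creator>Example RI data centre</dc:creator>
  <dc:identifier>example-ri:dataset-001</dc:identifier>
  <dc:subject>temperature</dc:subject>
  <dc:subject>ocean</dc:subject>
</oai_dc:dc>
"""


def convert_record(xml_text: str, source_ri: str) -> dict:
    """Map one Dublin Core record to the assumed central property names."""
    root = ET.fromstring(xml_text.strip())
    record = {"source_ri": source_ri}
    for element in root:
        if not element.tag.startswith(DC_NS):
            continue  # ignore elements outside the Dublin Core namespace
        target = DC_TO_DCAT.get(element.tag[len(DC_NS):])
        text = (element.text or "").strip()
        if target is None or not text:
            continue  # unmapped or empty elements are dropped in this sketch
        record.setdefault(target, []).append(text)
    return record


converted = convert_record(SAMPLE_RECORD, "example-ri")
```

Keeping the crosswalk as a plain table per source format means each additional RI format only needs a new table, which fits the idea of mapping every local format to the richest available standard.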
Observation systems, events or samples
An integrated system will show observation systems, events and collected samples from 2 or 3 RIs in the liquid (EMSO, ARGO), solid (EPOS) and gas (ICOS) spheres.
Tools will be provided to easily edit the descriptions for RIs that do not yet have their own system.
As before, this will require mapping the metadata describing systems, events and samples at each RI to the common metadata standard of ENVRIPLUS.
Expected result in October 2018
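The mapping of systems, events and samples to a common standard can be pictured as a shared record type plus one field-renaming table per RI. The following is only a sketch under assumptions: the `CatalogueItem` fields, the `from_local` helper and the local field names in the example are hypothetical and do not reflect any RI's actual schema or the ENVRIPLUS standard, which is still to be chosen.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogueItem:
    """Assumed common description of a system, event or sample."""
    identifier: str
    kind: str  # "system", "event" or "sample"
    source_ri: str
    label: str
    related_to: list = field(default_factory=list)  # identifiers of linked items


def from_local(local: dict, mapping: dict, kind: str, source_ri: str) -> CatalogueItem:
    """Rename local fields to the common ones using a per-RI mapping table."""
    common = {mapping[key]: value for key, value in local.items() if key in mapping}
    return CatalogueItem(kind=kind, source_ri=source_ri, **common)


# Hypothetical local description and mapping table for one RI.
local_platform = {"platform_code": "P-01", "name": "Example float", "battery": "ok"}
platform_mapping = {"platform_code": "identifier", "name": "label"}

item = from_local(local_platform, platform_mapping, kind="system", source_ri="example-ri")
```

Local fields with no entry in the mapping table (here `battery`) are simply left out, so an RI can start with a minimal mapping and extend it over later iterations.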
Documents and persons
Number of RIs actually using the chosen person and document e-infrastructure to identify their resources.
Datasets
Number of RIs whose dataset descriptions are available in the federated system.
Number of users of the federated dataset catalogue (inside or outside the RI).
Observation systems, events or samples
Number of observation systems whose events and results are actually available in the federated catalogue.
Number of users of the catalogues in support of activities in the RIs.
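As an illustration of how the dataset indicators above could be derived, the sketch below counts distinct RIs and distinct users from plain Python data. The record and access-log shapes (`source_ri`, `user` keys) and the function name are assumptions; the real figures would come from whatever the federated catalogue and its access logs actually expose.

```python
def federation_indicators(records: list, access_log: list) -> dict:
    """Compute two dataset indicators from assumed record and log shapes."""
    # RIs with at least one dataset description in the federated catalogue.
    ris_with_datasets = {record["source_ri"] for record in records}
    # Distinct users seen in the catalogue access log.
    distinct_users = {entry["user"] for entry in access_log}
    return {
        "ri_with_datasets": len(ris_with_datasets),
        "catalogue_users": len(distinct_users),
    }


# Hypothetical inputs for illustration only.
example_records = [
    {"source_ri": "ri-a", "identifier": "ds-1"},
    {"source_ri": "ri-a", "identifier": "ds-2"},
    {"source_ri": "ri-b", "identifier": "ds-3"},
]
example_log = [{"user": "u1"}, {"user": "u2"}, {"user": "u1"}]

indicators = federation_indicators(example_records, example_log)
```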