...
Reductionism aside, however, the key performance indicator used by most RIs is researcher productivity. Can researchers use the RI to efficiently locate the data they need? Do they have access to all the support available for processing the data and conducting their experiments? Can they replicate the cited results of their peers using the facilities provided? This raises yet another question: how does the service provided to researchers translate to requirements on data placement and infrastructure availability?
Good provenance is fundamental to optimisation: in order to anticipate how data will be used by the community, and what infrastructure should be conscripted to provide access to those data, it is necessary to understand as much about the data as possible. Provenance is required to answer who, what, where, when, why and how regarding the origins of data, and the role of an optimised RI is to know the answers for who, what, where, when, why and how regarding the future use of data. Ensuring that these questions can be asked and answered becomes more challenging the greater the heterogeneity of the data being handled by the RI.
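As an illustration only, the six provenance questions could be held in a simple record against which an RI checks which questions remain open for a given dataset. The sketch below is a minimal example under stated assumptions: the `ProvenanceRecord` class, the `unanswered` helper and the sample values are all hypothetical and are not taken from any RI's actual data model.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class ProvenanceRecord:
    """Hypothetical record answering the six provenance questions for one dataset."""
    who: Optional[str] = None    # originator of the data
    what: Optional[str] = None   # what was observed or produced
    where: Optional[str] = None  # location of acquisition
    when: Optional[str] = None   # time of acquisition
    why: Optional[str] = None    # purpose of the acquisition
    how: Optional[str] = None    # method or instrument used


def unanswered(record: ProvenanceRecord) -> List[str]:
    """Return the questions the record leaves open, in declaration order."""
    return [f.name for f in fields(record) if not getattr(record, f.name)]


# Illustrative values only; no real dataset is described here.
record = ProvenanceRecord(
    who="example-institute",
    what="temperature profile",
    when="2016-03-01",
)
missing = unanswered(record)
```

A check of this kind becomes harder to keep uniform as the data grow more heterogeneous, since each kind of dataset may answer the same six questions in a different form.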
Quality checking of datasets is also important, in order to ensure that the data is fit for purpose.
Streamlining the acquisition of data from data providers is important to many RIs, both to maximise the range and timeliness of datasets then made available to researchers, and to increase data security (by ensuring that it is properly curated with minimal delay).
The efficient exploitation of RI infrastructure, in terms of data transportation, placement, and the serving of computing resources, requires knowledge about the RI and its assets. This knowledge is usually embedded in the technical experts assigned to manage the infrastructure; however, the ability of infrastructure services to acquire the knowledge to manage themselves (even if only to the extent of provisioning resources on cloud infrastructure to support static resources) would allow for greater flexibility and agility in RI composition.
The following RIs contributed to developing optimisation requirements:
Euro-Argo: This RI is interested in: providing full contextual information for all of its datasets (who, what, where, when, why, how); local replication of datasets (to make processing more efficient); and cloud replication of the Copernicus marine service in-situ data in order to make it more accessible to the marine research community.
...