
Introduction


Environmental science now relies on the acquisition of great quantities of data from a range of sources. That data might be consolidated into a few very large datasets or dispersed across many smaller ones; it may be ingested in batch or accumulated over a prolonged period. To use this wealth of data effectively, it is important that the data is both optimally distributed across a research infrastructure's data stores and carefully characterised to permit easy retrieval based on a range of parameters. It is also important that experiments conducted on the data can be easily compartmentalised, so that individual processing tasks can be parallelised and executed close to the data itself, optimising the use of resources and providing swift results for investigators.

We are concerned here with the gathering and scrutiny of requirements for optimisation. More pragmatically, we are concerned with how we might develop generically applicable methods by which to optimise the research output of environmental science research infrastructures, based on the needs and ambitions of the infrastructures surveyed in the early part of the ENVRI+ project.

Perhaps more so than the other topics, optimisation requirements are driven by the specific requirements of those other topics, particularly processing, since the intention is to address specific technical challenges in need of refined solutions, albeit implemented in a way that can be generalised to more than one infrastructure. For each part of an infrastructure in need of improvement, we must consider:

  • What does it mean for this part to be optimal?
  • How is optimality measured—do relevant metrics already exist as standard?
  • How is optimality achieved—is it simply a matter of more resources, better machines, or is there need for a fundamental rethink of approach?
  • What can and cannot be sacrificed for the sake of 'optimality'? For example, it may be undesirable to sacrifice ease-of-use for a modest increase in the speed at which experiments can be executed.

More specifically, we want to focus on certain practical and broadly applicable technical concerns:

  • What bottlenecks exist in the functionality of (for example) storage, access and delivery of data, data processing, and workflow management?
  • What are the current peak volumes for data access, storage and delivery for parts of the infrastructure?
  • What is the (computational) complexity of different data processing workflows?
  • What are the specific quality (of service, of experience) requirements for data handling, especially for real-time data handling?
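As a minimal illustration of how such bottleneck and peak-volume questions might be approached empirically, the sketch below times a single processing stage and reports its throughput in records per second. The `process` callable and the input records are hypothetical stand-ins for a real pipeline component, not part of any infrastructure surveyed here.

```python
import time

def measure_throughput(process, records):
    """Time a processing stage over a batch of records.

    `process` is a hypothetical stand-in for one pipeline stage;
    returns the observed rate in records per second.
    """
    start = time.perf_counter()
    for record in records:
        process(record)
    elapsed = time.perf_counter() - start
    # Guard against a timer resolution of zero on very fast runs.
    return len(records) / elapsed if elapsed > 0 else float("inf")

# Example with a trivial stand-in "processing" step.
rate = measure_throughput(lambda r: sum(r), [[1, 2, 3]] * 10000)
```

Collecting such figures per stage under representative load is one simple way to locate the slowest component before deciding whether more resources or a rethink of the approach is warranted.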

Optimisation requirements gathering is coordinated with the help of go-betweens.

Overview and summary of optimisation requirements

...