

Short description: Marine CC (Task 8.3)
Type of community: Competence Centre
Community contact: Thierry Carval
Interviewer: Gergely Sipos
Date of interview: between Jan-July 2018

Ambition

Ocean experts are now converging on estimates of integrated indicators such as global warming. However, these indicators are based on the interpolation of unevenly distributed observations and do not describe climate change consistently. To better understand ocean circulation and the climate machinery, data scientists need direct access to the original observations, which are otherwise diluted in spatial syntheses.

Original observations are published by Research Infrastructures (Argo, EMSO, ICOS, …) and data aggregators (SeaDataNet, Copernicus Marine, …).

The long-term ambition of the Marine Competence Centre is to push ocean observations onto the EOSC infrastructure for data analytics.


User stories

Instruction

Requirements are based on a user story, which is an informal, natural-language description of one or more features of a software system. User stories are often written from the perspective of an end user or user of a system. Depending on the community, user stories may be written by various stakeholders, including clients, users, managers or development team members. They facilitate sensemaking and communication; that is, they help software teams organize their understanding of the system and its context. Please do not confuse a user story with a system requirement: a user story is an informal description of a feature, whereas a requirement is a formal description of a need (see the section later).

User stories may follow one of several formats or templates. The most common would be:

"As a <role>, I want <capability> so that <receive benefit>"

"In order to <receive benefit> as a <role>, I want <goal/desire>"

"As <persona>, I want <what?> so that <why?>" where a persona is a fictional stakeholder (e.g. a user). A persona may include a name, a picture, characteristics, behaviours, attitudes, and a goal that the product should help them achieve.

Example:

“As provider of the Climate gateway I want to empower researchers from academia to interact with datasets stored in the Climate Catalogue, and bring their own applications to analyse this data on remote cloud servers offered via EGI.”


The Marine community produces diverse types of data, typically time series. The community wishes to store these data in files and make the files easy for researchers to browse and access. To maximise ease of use, the files should be made available through a Dropbox-like system that makes the relevant data files visible to each user in a ‘personal folder’. Users should be able to define patterns describing the data they are interested in (location, time period, provider network, etc.), and the system should perform pattern matching to decide whether to make a particular incoming file (or set of files) visible to a given user. Such pattern matching can become CPU-intensive when scaling up to many users and many files with complex data records. Depending on the community, the data source can be a single instrument (site) or multiple collection/production sites. In the latter case, data originating from multiple locations should be converted to common formats and described with metadata in a coherent fashion.
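The pattern matching described above can be illustrated with a minimal Python sketch. It is not the Marine CC implementation: the `FileMetadata` and `Subscription` classes, their field names and the `visible_to` helper are all hypothetical, and the sketch assumes that each incoming file can be summarised by a provider network, a single position and a time span.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class FileMetadata:
    """Hypothetical summary of one incoming data file."""
    network: str       # provider network, e.g. "Euro-Argo"
    lat: float         # degrees north
    lon: float         # degrees east
    start: datetime    # first observation in the file
    end: datetime      # last observation in the file


@dataclass(frozen=True)
class Subscription:
    """Hypothetical user pattern; a criterion left as None is not applied."""
    networks: Optional[frozenset] = None
    lat_range: Optional[tuple] = None   # (south, north)
    lon_range: Optional[tuple] = None   # (west, east), no antimeridian wrap
    period: Optional[tuple] = None      # (start, end)

    def matches(self, meta: FileMetadata) -> bool:
        if self.networks is not None and meta.network not in self.networks:
            return False
        if self.lat_range is not None:
            south, north = self.lat_range
            if not south <= meta.lat <= north:
                return False
        if self.lon_range is not None:
            west, east = self.lon_range
            if not west <= meta.lon <= east:
                return False
        if self.period is not None:
            # Keep files whose time span overlaps the requested period.
            if meta.end < self.period[0] or meta.start > self.period[1]:
                return False
        return True


def visible_to(subscriptions: dict, meta: FileMetadata) -> list:
    """Return the sorted user names whose subscription matches the file."""
    return sorted(user for user, sub in subscriptions.items()
                  if sub.matches(meta))
```

In this naive form every incoming file is checked against every subscription, so the cost grows with the number of users multiplied by the number of files, which is one way to read the CPU concern raised above.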

The Marine CC is testing the following (see figure below):

  • a combination of B2FIND, B2SAFE and B2STAGE for data management (storage and transfer)
  • a combination of Jupyter, B2ACCESS and EGI Cloud for user exposure (data subscription and access)


No. | User story
US1 | A data provider should be able to link its data production instruments into the 'back-end' of the Marine CC setup and become a data provider for the CC users.
US2 | A scientist should be able to browse the connected data source networks (e.g. Euro-Argo, EMSO, SeaDataNet, etc.) and define preferences for the data records of interest. The system should make matching records visible in the scientist's personal access folder.
US3 | A user should be able to access a personal data access folder via a Jupyter system and perform data analytics on the data.
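One way to picture the personal access folder of US2 and US3 is a directory per user that holds links to the matching files rather than copies. The sketch below assumes a POSIX file system and uses symbolic links; this mechanism, the `expose_file` function and the directory layout are illustrative assumptions, not something the Marine CC has specified.

```python
import os
from pathlib import Path


def expose_file(data_file: Path, personal_root: Path, users: list) -> list:
    """Link data_file into the personal folder of each user.

    Hypothetical layout: <personal_root>/<user>/<file name>.
    Returns the link paths, one per user, in the order given.
    """
    links = []
    for user in users:
        folder = personal_root / user
        folder.mkdir(parents=True, exist_ok=True)
        link = folder / data_file.name
        # Skip creation if the link is already there, so the call can be repeated.
        if not link.is_symlink() and not link.exists():
            os.symlink(data_file.resolve(), link)
        links.append(link)
    return links
```

A Jupyter session started in `<personal_root>/<user>` would then see only the files linked for that user, while the data itself stays in a single shared archive.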


Use cases

Instruction

A use case is a list of actions or event steps typically defining the interactions between a role (known in the Unified Modeling Language as an actor) and a system to achieve a goal.

Include in this section any diagrams that could facilitate the understanding of the use cases and their relationships.


Step | Description of action | Dependency on 3rd party services (EOSC-hub or other)
UC1 | Data discovery and subsetting-subscription service on Argo observations. |
UC2 | DIVA (data-interpolating variational analysis) on Argo float oxygen data, running in a Jupyter notebook. |
... | The data scientist manages a workspace within JupyterHub: saving and sharing notebooks, and running code on the datasets pushed to EOSC by Research Infrastructures (such as Argo) and on individual datasets. |


Architecture & EOSC-hub technologies considered/assessed


B2SAFE: daily synchronization of Argo data from Ifremer to B2SAFE


B2DROP: input channel for data scientists' individual datasets


B2ACCESS: user (data scientist) identification service


JupyterHub: data analytics platform for the datasets (example: DIVA analysis in a Jupyter notebook reading Argo data)


Data subscription web GUI and API to query:


  • Cassandra: the NoSQL database for high-performance queries on data
  • Elasticsearch: the search engine for high-performance queries on metadata
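To make the metadata side of the subscription API more concrete, the sketch below builds a request body in the Elasticsearch Query DSL from subscription criteria. The `build_metadata_query` function and the field names `network`, `time` and `location` are hypothetical; the sketch further assumes that `network` is indexed as a keyword field, `time` as a date field and `location` as a `geo_point` field. It only constructs the JSON body and does not contact a cluster.

```python
from typing import Optional


def build_metadata_query(network: Optional[str] = None,
                         time_range: Optional[tuple] = None,
                         bbox: Optional[tuple] = None) -> dict:
    """Build an Elasticsearch Query DSL body from subscription criteria.

    time_range is (start, end) as ISO 8601 strings.
    bbox is (south, west, north, east) in degrees.
    All field names are assumptions made for this illustration.
    """
    filters = []
    if network is not None:
        # Exact match on the provider network.
        filters.append({"term": {"network": network}})
    if time_range is not None:
        start, end = time_range
        filters.append({"range": {"time": {"gte": start, "lte": end}}})
    if bbox is not None:
        south, west, north, east = bbox
        filters.append({"geo_bounding_box": {"location": {
            "top_left": {"lat": north, "lon": west},
            "bottom_right": {"lat": south, "lon": east},
        }}})
    # Filter clauses do not contribute to scoring; they only select documents.
    return {"query": {"bool": {"filter": filters}}}
```

Under the split listed above, a query of this kind would identify the matching metadata entries, and the corresponding data records would then be read from Cassandra; that second step is not sketched here.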



Requirements for EOSC-hub

Technical Requirements


Requirement ID | EOSC-hub service | GAP (Yes/No) + description | Requirement description | Source Use Case
Example | EOSC-hub AAI | Yes: EOSC-hub AAI doesn’t support the Marine IdP | EOSC-hub AAI should accept Marine IDs | UC1
RQ1 | <Gap service> | Yes: …. | |
RQ2 | Cloud Compute | No | Create VMs via a gateway | UC2


Capacity Requirements


EOSC-hub services | Amount of requested resources | Time period












Validation plan

....
