Context of processing / community support in SIOS

Summary of SIOS requirements for processing / community support


Detailed requirements

1. Data processing desiderata: input

 i.   What data are to be processed? What are their:

  • Typologies
  • Volume
  • Velocity
  • Variety

 [can’t answer]

ii.   How is the data made available to the analytics phase? By file, by web (stream/protocol), etc.

 [can’t answer]

iii.  Please provide concrete examples of data.

 [can’t answer]

2. Data processing desiderata: analytics

i.   Computing needs quantification:

  • How many processes do you need to execute?
  • How much time does each process take/should take?

 [can’t answer]

 ii.  Process implementation:

  • What do you use in terms of:

○      Programming languages?

○      Platform?

○      Specific software requirements?

[can’t answer]

  • What standards need to be supported (e.g. WPS) for each of the above?

[Answers here]

  • Is there a possibility to inject proprietary/user defined algorithms/processes for each of the above?

[Answers here]

  • Do you use a sandbox to test and tune the algorithm/process for each of the above?

[Answers here]

 iii.  Do you use batch or interactive processing?

[Answers here]

iv.  Do you use a monitoring console?

[Answers here]

v.   Do you use black box or workflow processing?

  • Do you reuse sub-processes across processes?

[Answers here]

vi.   Please provide concrete examples of processes to be supported/currently in use;

[Answers here]

3.     Data processing desiderata: output

i.   What data are produced? Please provide:

  • Typologies
  • Volume
  • Velocity
  • Variety

[Answers here]

ii.   How are analytics outcomes made available?

[Answers here]

4.     Statistical questions

i.    Is the data collected with a distinct question/hypothesis in mind? Or is simply something being measured?

[Answers here]

ii.   Will questions/hypotheses be generated or refined (broadened or narrowed in scope) after the data has been collected? (N.B. Such activity would not be good statistical practice)

[Answers here]

5. Statistical data

i.   Does the question involve analysing the responses of a single set of data (univariate) to other predictor variables or are there multiple response data (bi or multivariate data)?

[Answers here]

ii.   Is the data continuous or discrete?

[Answers here]

iii.  Is the data bounded in some form (i.e. what is the possible range of the data)?

iv.  Typically, about how many data points are there?

[Answers here]

6.  Statistical data analysis

i.   Is it desired to work within a statistics or data mining paradigm? (N.B. the two can and indeed should overlap!)

[Answers here]

ii.   Is it desired that there is some sort of outlier/anomaly assessment?

[Answers here]

iii.   Are you interested in a statistical approach which rejects null hypotheses (frequentist) or generates probable belief in a hypothesis (Bayesian approach) or do you have no real preference?

 [Answers here]

B.1      Community support

We define Community Support as a subsystem concerned with managing, controlling and tracking users' activities within an RI and with supporting all users in carrying out their roles in their communities. It covers many miscellaneous aspects of RI operations, including (non-exhaustively) authentication, authorization and accounting, the use of virtual organizations, training and helpdesk activities.

B.1.1.   Requirements for the Community Support Subsystem:

 i.    How many communities do you support: users, developers or others? These communities may require different support mechanisms.

This was never checked.

ii.    What are the required functionalities of your Community Support Subsystem?

[can’t answer]

iii.  What are the non-functional requirements, e.g., privacy, licensing, performance?

[can’t answer]

iv.  What standards do you use, e.g., related to data, metadata, web services?

[ISO 19115, ISO 19139]

v.   What community software/services/applications do you use?

[can’t answer]

B.1.2.   Training Requirements

i.    What is your community training plan?

The SIOS communities are very diverse, and many organisations run their own training activities, aimed at students or scientists. For example, the University Centre in Svalbard (UNIS) has its own high-quality training programme for new students on field security, i.e. how to operate safely and in accordance with environmental regulations, which is open to all students and scientists.

 

ii.   Does your community consider adopting e-Infrastructure solutions (e.g., Cloud, Grid, HPC, cluster computing)?

WP15 can develop and deliver training about methodologies, infrastructures, tools and services for those who want to build environments for big data. The content and training can benefit those tool developers who want to create new environments for scientists, for the users of ENVRIPLUS Research Infrastructures. The topics covered in this area include:

  • Methodologies, tools and e-infrastructures for high-throughput, high-performance and cloud computing
  • Application porting and integration approaches to clouds and grids
  • Approaches, tools and online services for data storage, organisation, transfer and processing
  • Workflows and pipelines - organising and sharing multi-stage simulations at community level
  • Scientific gateways - integrate applications, data and services into web-based portals
  • Developing PaaS systems for application developers
  • Developing SaaS systems for scientific end users

WP15 plans to develop and deliver training about building e-infrastructures and federated infrastructures for scientific communities. The content and training can benefit the IT operators of RIs, those who need to build and operate IT infrastructures to support environmental sciences data, applications, tools and environments. The topics covered in this area include:

  • Deploying clusters and desktop infrastructures for high-throughput or high-performance computing
  • Deploying virtualisation and hypervisor technologies to build IaaS clouds
  • Federating cloud systems into multi-organisational and multi-national Virtual Organisations (including connecting clouds to monitoring, accounting, user management and resource allocation systems)

Facilitate harmonisation of e-infrastructure training content and events among the European e-infrastructures to maximise benefits for ENVRIPLUS RIs.

[It is possible]                                            

iii.  Is your community interested in training courses that introduce state-of-the-art e-Infrastructure technology?

[probably yes]

Formalities (who & when)

Go-between: Yin Chen
RI representative: Jon Borre Orbek, Angelo Viola, Vito Vitale
Period of requirements collection: Aug 2015 - Jan 2016
Status: