ENVRI
Services for the Environmental Community
Analysis of Common Requirements
For ENVRI Research Infrastructures
| Document identifier: | D3.3 Analysis of Common Requirements for ENVRI Research Infrastructures |
| Date: | 30/04/2013 |
| Activity: | WP3 |
| Lead Partner: | CU |
| Document Status: | FINAL |
| Dissemination Level: | PUBLIC |
| Document Link: | <link to the website> |
ABSTRACT
The objective of ENVRI WP3 task T3.2 is to examine the design of the 6 ESFRI environmental Research Infrastructures (RIs) (ICOS, EURO-Argo, EISCAT-3D, LifeWatch, EPOS, and EMSO) in order to identify their common computational characteristics and to develop an understanding of their specific requirements through observation.
Throughout the study, a standard model, Open Distributed Processing (ODP), is used to interpret the design of the research infrastructures and to place their requirements into the ODP framework for analysis. This document reports the initial results from the study. First, from the aspect of the ODP Engineering Viewpoint, the architectural characteristics of the RIs have been examined, and 5 common sub-systems have been identified: data acquisition, curation, access, processing and community support. Secondly, from the aspect of the ODP Computational Viewpoint, we looked at each of the 6 RIs in detail and identified the common functions and embedded computations they provide. Matrices have been used for comparison, and definitions of the functionalities have been provided. Finally, from the aspect of the ODP Enterprise Viewpoint, we have identified 4 common communities and derived the community roles.
The contribution of this work to the environmental science research infrastructures is threefold:
Copyright © Members of the ENVRI Collaboration, 2011. See www.ENVRI.eu for details of the ENVRI project and the collaboration. ENVRI (“Common Operations of Environmental Research Infrastructures”) is a project co-funded by the European Commission as a Coordination and Support Action within the 7th Framework Programme. ENVRI began in October 2011 and will run for 3 years. This work is licensed under the Creative Commons Attribution-Noncommercial 3.0 License. To view a copy of this license, visit http://creativecommons.org/licenses/by/3.0/ or send a letter to Creative Commons, 171 Second Street, Suite 300, San Francisco, California, 94105, USA. The work must be attributed by attaching the following reference to the copied elements: “Copyright © Members of the ENVRI Collaboration, 2011. See www.ENVRI.eu for details of the ENVRI project and the collaboration”. Using this document in a way and/or for purposes not foreseen in the license requires the prior written permission of the copyright holders. The information contained in this document represents the views of the copyright holders as of the date such views are published.
| | Name | Partner/Activity | Date |
| From | | | |
| Reviewed by | Moderator: Reviewers: | | |
| Approved by | | | |

| Issue | Date | Comment | Author/Partner |
| 1.0 | 17/02/13 | First draft for internal review | Yin Chen (CU), Alex Hardisty (CU), Alun Preece (CU), Paul Martine (UEDIN), Malcolm Atkinson (UEDIN), Herbert Schentz (EAA), Barbara Magagna (EAA), Zhiming Zhao (UoV) |
| 2.0 | 30/04/13 | Internally reviewed version to be approved by project management and submitted to the Commission. | Yin Chen (CU) |
| 3.0 | | | |
| 4.0 | | | |
| 5.0 | | | |
| 6.0 | | | |
This document is a formal deliverable for the European Commission, applicable to all members of the ENVRI project, beneficiaries and Joint Research Unit members, as well as its collaborating projects.
Amendments, comments and suggestions should be sent to the authors.
A complete project glossary is provided at the following page: http://www.ENVRI.eu/glossary.
Frontier environmental research increasingly depends on a wide range of data and advanced capabilities to process and analyse them. The ENVRI project, “Common Operations of Environmental Research infrastructures”, is a collaboration in the ESFRI Environment Cluster, with support from ICT experts, to develop common e-science components and services for their facilities. The results will speed up the construction of these infrastructures and will allow scientists to use the data and software from each facility to enable multi-disciplinary science.
The focus is on developing common capabilities, including software and services, for the environmental e-infrastructure communities. While the ENVRI infrastructures are very diverse, they face common challenges including data capture from distributed sensors, metadata standardisation, management of high-volume data, workflow execution and data visualisation. The common standards, deployable services and tools developed will be adopted by each infrastructure as it progresses through its construction phase.
Two use cases, led by the most mature infrastructures, will focus the development work on separate requirements and solutions for data pre-processing of primary data and post-processing toward publishing.
The project will be based on a common reference model created by capturing the semantic resources of each ESFRI-ENV infrastructure. This model and the development driven by the test-bed deployments result in ready-to-use systems which can be integrated into the environmental research infrastructures.
The project puts emphasis on synergy between advanced developments, not only among the infrastructure facilities, but also with ICT providers and related e-science initiatives. These links will facilitate system deployment and the training of future researchers, and ensure that the inter-disciplinary capabilities established here remain sustainable beyond the lifetime of the project.
This report presents the initial findings from study T3.2, the analysis of the requirements of data processing. Open Distributed Processing (ODP) is used as the framework for the analysis. First, from the aspect of the ODP engineering viewpoint, the physical structuring mechanisms of the 6 ENVRI research infrastructures are analysed and 5 common sub-systems are identified: data acquisition, data curation, data access, data processing, and user community support. Secondly, from the aspect of the computational viewpoint, a set of operations and embedded computations commonly provided by the infrastructures is identified. Finally, from the aspect of the ODP enterprise viewpoint, 4 common communities are identified: data acquisition, data management, data service provision, and data user. The roles, behaviours, and policies for each community are described.
TABLE OF CONTENTS
1 Introduction
2 Common Architectural Characteristics
3 Common Functions and Operations
3.1 Analysis of EISCAT-3D
3.2 Analysis of Euro-Argo
3.3 Analysis of ICOS
3.4 Analysis of EMSO
3.5 Analysis of EPOS
3.6 Analysis of LifeWatch
3.7 The Common Functions and Embedded Computations
4 Common Communities
4.1 Common Communities
4.2 Common Community Roles
5 Conclusion
6 Acknowledgements
7 References
The objective of this study is to analyse and develop an understanding of the specific requirements of each ESFRI Environmental (ENV) Research Infrastructure (RI) with respect to common short-term priorities. The ENVRI background papers and T3.1 Assessment of the State of the Art provide useful surveys and evaluations of ENV RIs; common requirements emerge. This study intends to make a further step towards a common model for ENVRI. The study will not produce a common model -- however, it will serve as an input to such a model.
In this study, we use a standard approach, Open Distributed Processing (ODP), to interpret the design of 6 representative environmental research infrastructures (ICOS [1], EPOS [2], EMSO [3], EISCAT-3D [4], LifeWatch [5], and Euro-Argo [6]), and place their requirements into the ODP framework for further analysis. ODP is an ISO/IEC standard [1-4], which provides an overall conceptual framework for building distributed systems. It defines five specific viewpoints, which are abstractions that yield specifications of the whole system related to particular sets of concerns. The five viewpoints are the enterprise, information, computational, engineering, and technology viewpoints [5].
The added value of using ODP to analyse the requirements for ENVRI is threefold:
The analysis presented in this report is based on partial knowledge of a snapshot of the current state of the ENV RIs. This is because the documentation of the RIs is often incomplete and inconsistent, and the designs evolve over time and are subject to change. For example, the EISCAT-3D design study finished in 2009 and submitted the final design as Deliverable 11.1 to the EU Commission. Its succeeding project, the EISCAT-3D Preparatory Phase (EISCAT-3D PP), started in 2010; it examines the feasibility of the design and prepares for implementation starting in 2014. During the EISCAT-3D PP, many parts of the design are likely to be re-evaluated and re-designed, e.g., because they prove infeasible to implement. This is a common issue for most, if not all, of the other RIs. The ENVRI investigation should therefore be based on the existing knowledge of the designs provided by the RIs, while keeping up to date with any new developments.
One key issue is how to denote "known unknowns" in the ODP context. We should tolerate schemas and descriptions with many of these at first, and progressively push the unknowns towards detail or boundaries later.
The rest of the report presents initial findings from the study. The analysis covers 3 ODP viewpoints: Enterprise, Computational, and Engineering. Because most of the ENV RIs provide documentation which describes the architectural features of their infrastructure, it is straightforward to start with the Engineering Viewpoint, where we identify the common physical structuring mechanisms of the system infrastructures. Secondly, we identify common functions and embedded computations provided by the ENV RIs. This, in essence, is to analyse the RIs from the aspect of the ODP Computational Viewpoint. Finally, we look at the real-world systems from the aspect of the ODP Enterprise Viewpoint, and identify the common communities and their roles. Matrices are used to visualise the results of the comparison, where columns are the names of ENV RIs and rows are ODP elements.
In ODP, the purpose of the Engineering Viewpoint is to identify and specify the structuring mechanisms for distributed interactions and the functional elements. It concerns the architectural features of an infrastructure.
The structures of the studied RIs can be divided into sub-systems based on the functions and locations of computational elements. For the purposes of this document, each sub-system is defined as a set of capabilities that collectively are defined by a set of interfaces with corresponding operations that can be invoked by other sub-systems. An interface in ODP is an abstraction of the behaviour of an object that consists of a subset of the interactions of that object together with a set of constraints on when they may occur. Sub-systems are disjoint from each other.
Five common sub-systems are identified: data acquisition, data curation, data access, data processing, and community support. The order of these sub-systems is irrelevant.
The data acquisition sub-system collects raw data from sensor arrays, various instruments, or human observers, and brings the measurements (data streams) into the system. Note that ENVRI is concerned with the computational aspects of an infrastructure; thus, by definition, the data acquisition sub-system starts from the point at which sensor signals are converted into digital values and received by the system. Many related activities, including the definition of data acquisition protocols, the design and deployment of sensor instruments, and the configuration and calibration of devices, are crucial for data acquisition but beyond the scope of the ENVRI investigation. The data acquisition sub-system is typically operated at observatories or stations. Data in the acquisition sub-system are normally non-reproducible: the so-called raw data or primary data. Consistent time-stamps are assigned to each data object. In some cases the raw data are generated by a simulation model, in which case they may be reproducible in the sense that they can be regenerated. The (real-time) data streams are sometimes stored temporarily (e.g., in computer clusters) and then sampled, filtered or processed (e.g., based on applied quality control criteria). Control software is often provided to allow the execution and monitoring of data flows. The data collected by the data acquisition sub-system are transmitted to the data curation sub-system, to be maintained and archived there.
The data curation sub-system facilitates quality control and preservation of scientific data. It is typically operated at a data centre. Data handled by the curation sub-system are often reproducible in the sense that they can be re-processed. Operations such as data quality verification, data identification, annotation, cataloguing, and long-term preservation are often provided. Various data products are generated and made available to users through the data access sub-system. There is usually an emphasis on non-functional requirements for a data curation sub-system, including the need to satisfy performance criteria for availability, reliability, utility, throughput, responsiveness, security and scalability.
The data access sub-system enables discovery and retrieval of data housed in data resources managed by a data curation sub-system. Data access sub-systems often provide facilities such as data portals, as well as services to present or deliver the data products. Search facilities, including both query-based and navigation-based searching tools, allow users or services to discover interesting data products. Discovery based on metadata or semantic linkages is most common. Data handled by the access sub-system can be either structurally and semantically homogeneous or heterogeneous. When supporting heterogeneous data, different types of data (often pulled from a variety of distributed data resources) may be converted into uniform representations with uniform semantics which can be resolved by a data discovery and access service. Services allowing harvesting of metadata and/or data, services enhancing performance through compression and packaging methods, and encoding services for secure data transfer are often part of the data access sub-system. Data access can be open or controlled (e.g., enforced by authentication and authorisation policies). It is notable that a data access sub-system usually does not provide "write" operations for end users, although such operations may be provided for an administrator of a data resource.
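The access rules described above (metadata-based discovery, open or controlled reading, and "write" operations reserved for administrators) can be illustrated with a minimal sketch. The class `DataResource`, its method names and the metadata keys are hypothetical and are not taken from any of the RI designs.

```python
class DataResource:
    """Hypothetical data resource exposed through a data access sub-system."""

    def __init__(self, open_access=True):
        self.open_access = open_access
        self._records = {}        # identifier -> (metadata dict, payload)
        self._authorised = set()  # users allowed to read under controlled access

    def authorise(self, user):
        """Grant read permission to a user (only relevant for controlled access)."""
        self._authorised.add(user)

    def search(self, **criteria):
        """Query-based discovery: return identifiers whose metadata match all criteria."""
        return sorted(
            identifier
            for identifier, (metadata, _) in self._records.items()
            if all(metadata.get(key) == value for key, value in criteria.items())
        )

    def read(self, user, identifier):
        """Retrieve a data product, enforcing the access policy."""
        if not self.open_access and user not in self._authorised:
            raise PermissionError(f"{user} is not authorised to read {identifier}")
        return self._records[identifier][1]

    def write(self, user, identifier, metadata, payload, *, is_admin=False):
        """Only administrators may write; end users get read-only access."""
        if not is_admin:
            raise PermissionError(f"{user} may not write to this resource")
        self._records[identifier] = (metadata, payload)


resource = DataResource(open_access=False)
resource.write("admin", "ds-1", {"parameter": "CO2"}, [400.1, 401.3], is_admin=True)
resource.authorise("alice")
print(resource.search(parameter="CO2"))
print(resource.read("alice", "ds-1"))
```

A real access sub-system would delegate authentication and authorisation to dedicated services; the in-memory set of authorised users only stands in for such a policy.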
The data processing sub-system aggregates the data from various resources and provides computational capabilities and capacities for conducting data analysis and scientific experiments. Data handled by the data processing sub-system are typically derived and recombined via the data access sub-system. A data processing sub-system normally offers operations for statistical and/or mining functions for analysis, facilities for conducting scientific experiments, modelling/simulation, and scientific visualisation. Performance requirements for processing scientific data tend to be concerned mostly with scalability, which may also need to be addressed at the infrastructure level, for example by making use of Grid or Cloud technology. In this case, functionalities to interact with the physical infrastructure should be provided.
Finally, the community support sub-system manages, controls and tracks users' activities and supports users in conducting their roles in communities. Data handled by a community support sub-system are typically user-generated data, control and communications. A community support sub-system normally provides interactive visualisation and Authentication, Authorisation and Accounting (AAA), as well as support for managing virtual organisations. Community support is orthogonal to, and cuts across, the other 4 sub-systems.
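As an illustration of how data move between the sub-systems described above, the following sketch models four of them as small classes whose operations are invoked by one another. All class, method and field names are hypothetical; community support, being cross-cutting, is left out for brevity.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone


@dataclass
class DataObject:
    """Hypothetical time-stamped record passed between sub-systems."""
    value: float
    timestamp: datetime
    annotations: dict = field(default_factory=dict)


class Curation:
    """Stands in for a data centre: identifies, annotates and preserves data."""

    def __init__(self):
        self._archive = {}

    def ingest(self, obj):
        identifier = f"obj-{len(self._archive) + 1}"
        obj.annotations["quality_checked"] = True
        self._archive[identifier] = obj
        return identifier

    def get(self, identifier):
        return self._archive[identifier]

    def identifiers(self):
        return list(self._archive)


class Acquisition:
    """Turns digitised readings into data objects and hands them to curation."""

    def __init__(self, curation):
        self._curation = curation

    def collect(self, reading, when):
        return self._curation.ingest(DataObject(reading, when))


class Access:
    """Read-only discovery and retrieval over curated data."""

    def __init__(self, curation):
        self._curation = curation

    def discover(self, start, end):
        return [i for i in self._curation.identifiers()
                if start <= self._curation.get(i).timestamp <= end]

    def retrieve(self, identifier):
        return self._curation.get(identifier)


class Processing:
    """Analyses data obtained through the access sub-system."""

    def __init__(self, access):
        self._access = access

    def mean(self, identifiers):
        values = [self._access.retrieve(i).value for i in identifiers]
        return sum(values) / len(values)


curation = Curation()
acquisition = Acquisition(curation)
access = Access(curation)
processing = Processing(access)

t0 = datetime(2013, 1, 1, tzinfo=timezone.utc)
acquisition.collect(2.0, t0)
acquisition.collect(4.0, t0 + timedelta(hours=1))
acquisition.collect(9.0, t0 + timedelta(days=2))

found = access.discover(t0, t0 + timedelta(days=1))
print(found, processing.mean(found))
```

Each class exposes only the operations that the next sub-system needs, which mirrors the definition of a sub-system as a set of capabilities reached through interfaces.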
There may be other ways to group the functional elements; the grouping above is one possible solution. The main purpose of the classification is to identify the common structural characteristics of the environmental research infrastructures. As shown in Figure 2.1 below, the five sub-systems map well to the architectures of the RIs studied.
As shown in Table 2.1, different RIs emphasise the design and implementation of different sub-systems. At the time of writing this report, RIs such as ICOS, EISCAT-3D, Euro-Argo, and EMSO mainly focus on data acquisition, curation and access. They are typical large-scale observatory systems. Other RIs, such as EPOS and LifeWatch, are built on existing systems, have limited control over data resources, and focus more on data access and processing. They are comprehensive integration infrastructures for domain data and computations. It is worth mentioning that generic computational RIs, such as EUDAT and EGI, are general-purpose large-scale infrastructures for data management and processing; EUDAT tends to focus more on the functionalities related to the data curation sub-system, and EGI tends to focus more on the functionalities related to the data processing sub-system. Both EUDAT and EGI provide generic operations and services which can be used in various domains of research within either infrastructure.
Figure 2.1: Common Sub-Systems
Table 2.1: The Correlations Between the Design and Implementation Emphasis of ENV RIs and the Five Common Sub-systems
| Sub-system | EISCAT-3D | Euro-Argo | ICOS | EMSO | EPOS | LifeWatch | EUDAT | EGI |
| Acquisition | Yes | Yes | Yes | Yes | | | | |
| Curation | Yes | Yes | Yes | Yes | | | Yes | |
| Access | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Processing | | | | | Yes | Yes | | Yes |
| Community Support | | | | | | Partial | Partial | Partial |
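The comparison in Table 2.1 can also be held as a small data structure, which makes questions such as "which sub-system is emphasised by every RI?" easy to answer. The sketch below encodes the table cells as they appear above; the function names are hypothetical.

```python
# Column order of Table 2.1.
RIS = ["EISCAT-3D", "Euro-Argo", "ICOS", "EMSO", "EPOS", "LifeWatch", "EUDAT", "EGI"]

# One row per sub-system; an empty string means the cell is blank in the table.
EMPHASIS = {
    "Acquisition": ["Yes", "Yes", "Yes", "Yes", "", "", "", ""],
    "Curation": ["Yes", "Yes", "Yes", "Yes", "", "", "Yes", ""],
    "Access": ["Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes"],
    "Processing": ["", "", "", "", "Yes", "Yes", "", "Yes"],
    "Community Support": ["", "", "", "", "", "Partial", "Partial", "Partial"],
}


def emphasised_by(sub_system):
    """RIs with a non-blank cell for the given sub-system."""
    return [ri for ri, cell in zip(RIS, EMPHASIS[sub_system]) if cell]


def common_to_all():
    """Sub-systems marked 'Yes' for every RI in the table."""
    return [name for name, cells in EMPHASIS.items()
            if all(cell == "Yes" for cell in cells)]


def profile(ri):
    """Sub-systems with a non-blank cell for the given RI."""
    column = RIS.index(ri)
    return [name for name, cells in EMPHASIS.items() if cells[column]]


print(common_to_all())
print(profile("EPOS"))
```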
For the same sub-system, different RIs provide different facilities. In the following, we examine each RI and identify the common computational functions in each sub-system.
The ODP Computational Viewpoint focuses on the functionality of an infrastructure and the services it offers.
Dividing the structures of RIs into sub-systems helps to break down the complexity in analysis. Within each sub-system , we use a data-oriented approach, which follows the life-cycle of data -- e.g., creation, transmission, transformation, modification, processing, and visualisation -- to identify key functions and embedded computations.
The analysis is based on the materials or information from the following sources:
The objective of EISCAT-3D [4] is to design and construct a new-generation incoherent-scatter research radar which provides a long-term upper atmospheric science capability for studies of the atmosphere and near-Earth space.
The system design of EISCAT-3D explored many different areas, including the construction of antennas and arrays, signal processing, the network and the data distribution system. The ENVRI investigation is scoped to the data acquisition, processing and archiving aspects of EISCAT-3D.
The design of EISCAT-3D data archiving and distribution can be summarised as follows: there is a two-stage system for handling data. The beam-formed sample-level data, together with data from the interferometry system and some high-volume data from supporting instruments, are streamed to a large ring buffer designed to hold up to a few days of data, after which these low-level data will be over-written. The ring buffer allows the low-level data to be stored for long enough to be optimally processed, in terms of subsequent auto-correlation and integration in time and range. (The latency time of the buffer must therefore be long enough to allow multiple processing strategies to be applied before the low-level data are over-written.) The final optimally-derived data products (which are typically at least an order of magnitude smaller) are then transferred to the permanent data archive. At the same time, a second copy of the incoherent scatter data is separately passed through a default signal processing strategy in order to produce the quick-look data needed for control of experiments [6]. EISCAT-3D will provide visualisation means to present its data products and system status.
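The two-stage handling described above can be sketched in a few lines. This is only an illustration under simplifying assumptions: the buffer capacity, the block sizes and the use of block averaging as a stand-in for auto-correlation and integration are all hypothetical and are not taken from the EISCAT-3D design.

```python
from collections import deque


class RingBuffer:
    """Fixed-capacity buffer; the oldest samples are over-written when it is full."""

    def __init__(self, capacity):
        self._samples = deque(maxlen=capacity)

    def write(self, sample):
        self._samples.append(sample)

    def snapshot(self):
        return list(self._samples)


def integrate(samples, block):
    """Average consecutive full blocks of samples (stand-in for the real signal processing)."""
    return [sum(samples[i:i + block]) / block
            for i in range(0, len(samples) - block + 1, block)]


incoming = [float(n) for n in range(12)]  # low-level samples arriving from the radar

# Stage 1: stream low-level data into the ring buffer (capacity 8, so 4 are over-written).
buffer = RingBuffer(capacity=8)
for sample in incoming:
    buffer.write(sample)

# Stage 2: derive smaller data products from what is still buffered and archive them.
archive = integrate(buffer.snapshot(), block=4)

# In parallel, a second copy goes through a default strategy to give quick-look data.
quick_look = integrate(list(incoming), block=6)

print(buffer.snapshot())
print(archive, quick_look)
```

The capacity of the buffer bounds how long a processing strategy may take before the samples it needs are lost, which is the latency constraint noted in the design summary.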
We consider a group of functions that support EISCAT-3D in collecting raw data as a data acquisition sub-system; a group of functions that support data storage and archiving as a data curation sub-system; and a group of functions that support data discovery and delivery to end users as a data access sub-system.
In the data acquisition sub-system, EISCAT-3D collects 3 types of data [6]: 1) incoherent scatter data; 2) interferometric data; and 3) data from supporting instruments (which are not among the main EISCAT-3D data products). We analyse the life-cycles of incoherent scatter data and the interferometric data as follows:
Reference [7] provides the full list of functions provided by this sub-system. The main requirement is for control software that provides the following computational functions:
Here, the names of the functions are given as abstractions of the requirements described in the EISCAT-3D design documents (or related information materials). The original requirements (from the EISCAT-3D design documents) are used as examples to illustrate the meanings of the functions. The same principle applies to the rest of the analysis. More formal definitions of the functions will be provided in sub-section 3.7.
In the data curation sub-system, EISCAT-3D will archive data delivered from the ring buffer for the long term. The Data Preservation and Distribution component of the overall EISCAT-3D system essentially acts as an ingest facility, with provision for adaptive storage, data location management and the ability to reprocess data within certain time limitations. The data handled by this sub-system are typically time-associated data in file formats. The key functions include:
EISCAT-3D plans the following functions for the data access sub-system:
EISCAT-3D will provide facilities for data processing. The focus is on visualisation, and the functions to be provided include [9]:
To support its community, EISCAT-3D will provide the following functionality:
EISCAT-3D introduces significant data-handling challenges, in particular how to cope with large-scale data in real time. The requirements are documented in detail and solutions are proposed. However, the system is not yet implemented, and there are many uncertainties in its realisation. Next, we look at a more mature system, EURO-Argo.
Argo is a global ocean observing system comprising a large network of robotic floats distributed across the world's oceans, together with supporting infrastructure. It is a unique system to monitor heat and salt transport and storage, ocean circulation and global overturning changes, and to understand the ability of the ocean to absorb excess carbon dioxide from the atmosphere.
EURO-Argo [6] is the European contribution to Argo as a European Infrastructure. The objectives of the new Euro-Argo Research Infrastructure include 1) to provide, deploy and operate an array of around 800 floats contributing to the global array (a European contribution of ¼ of the global array); 2) to provide enhanced coverage in the European regional seas; and 3) to provide quality-controlled data and access to the data sets and data products to the research (climate and oceanography) and operational oceanography (e.g. GMES Marine Core Service) communities [10].
The robotic floats are operated as follows [11]: after being released, floats dive to a programmable depth (currently 1000 metres), drifting freely in currents. Every 10 days, a float dives to 2000 metres, then rises to the surface to send data by satellite link. More than 200 cycles can be performed during the float's 4-year lifespan. The data collected by Argo include heat, salt transport/storage, ocean circulation and global overturning changes in order to understand (amongst other things) the ocean's absorption of excess carbon dioxide.
The life-cycle of Argo data is as follows [12]: the 11 national Data Assembly Centres (DACs) receive data from satellite operators, decode them and perform quality control (according to a set of 19 real-time automatic tests). Erroneous data are flagged, corrected if possible, and then passed to the 2 Global Data Assembly Centres (GDACs) and to the World Meteorological Organization Global Telecommunication System (GTS). The 2 GDACs, located at Coriolis (France) and USGODAE (USA), collect data from the 11 DACs and provide a unique access point both in real time (within 24-48 hours after transmission) and in delayed mode (6-12 months after transmission). Data are available in NetCDF format via FTP and the internet. The 2 GDACs synchronise every day. GDACs also deliver data to several Argo Regional Centres (ARCs), where expertise on specific geographical ocean regions is used to provide comprehensive data sets (including non-Argo data). Data from the GDACs will be archived for the long term at the data centre located at NODC (US).
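The flag-and-forward step performed by a DAC can be sketched as follows. The two checks and their thresholds are illustrative placeholders only, not the 19 real-time tests used by Argo, and the names `Profile`, `CHECKS`, `quality_control` and `dac_process` are hypothetical. Correction of erroneous values is omitted.

```python
from dataclasses import dataclass, field


@dataclass
class Profile:
    """Hypothetical decoded float message."""
    float_id: str
    temperature_c: float
    pressure_dbar: float
    flags: list = field(default_factory=list)


# Illustrative checks only; names and limits are assumptions for this sketch.
CHECKS = {
    "temperature_range": lambda p: -2.5 <= p.temperature_c <= 40.0,
    "pressure_non_negative": lambda p: p.pressure_dbar >= 0.0,
}


def quality_control(profile):
    """Record the name of every failed check on the profile."""
    profile.flags = [name for name, check in CHECKS.items() if not check(profile)]
    return profile


def dac_process(profiles, gdacs, gts):
    """Flag each profile, then pass it to every GDAC and to the GTS."""
    for profile in profiles:
        checked = quality_control(profile)
        for gdac in gdacs:
            gdac.append(checked)
        gts.append(checked)


coriolis, usgodae, gts = [], [], []
dac_process(
    [Profile("f1", 12.0, 1000.0), Profile("f2", 99.0, 5.0)],
    gdacs=[coriolis, usgodae],
    gts=gts,
)
print([(p.float_id, p.flags) for p in coriolis])
```

Flagged profiles are still forwarded, which follows the description above: erroneous data are marked rather than discarded, so downstream users can decide how to treat them.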
The architecture of Euro-Argo is depicted in Figure 2.1 (D). We consider a group of functions that supports the 11 DACs in collecting the raw data from the floats and standardising the collection process as a data acquisition sub-system; a group of functions that supports the 2 GDACs in checking the data quality and archiving the data as a data curation sub-system; and a group of functions that supports data distribution and access as a data access sub-system. EURO-Argo provides limited functions for data processing and community support. It links with external systems, such as MyOcean [7] (an ocean monitoring and forecasting system which provides products and services for all marine applications) and SeaDataNet [8] (a Pan-European Infrastructure for Ocean & Marine Data Management), to provide such functionalities.
In the data acquisition sub-system, EURO-Argo includes the following functions [11]:
In the data curation sub-system, the 2 GDACs keep the master copies of the Argo global dataset (metadata, profiles, trajectories and technical information). The data are archived for the long term at the NODC centre in the US. The key functions provided include:
In the data access sub-system, EURO-Argo provides the following functions:
The EURO-Argo system is relatively mature and capable of supporting the whole life-cycle of Argo data from acquisition to preservation. However, only necessary and basic operations are provided. Many processes have not yet been automated, and comprehensive functionalities, in particular for data access and data processing, have not yet been considered. Compared to EURO-Argo, ICOS offers more implementation experience in such functions.
ICOS [1], the Integrated Carbon Observing System, is a world-class research infrastructure to quantify and understand greenhouse gas fluxes. The objectives of the ICOS community are to monitor greenhouse gases (GHG) over the long term through atmospheric, ecosystem and ocean networks.
As shown in Figure 2.1 (A), the ICOS distributed infrastructure consists of the following elements [13]:
We consider a group of computational functions that facilitates the collection of greenhouse gas (GHG) observations from the hundred-plus stations of the ICOS atmospheric, ecosystem and ocean networks as a data acquisition sub-system. The design [13] describes the requirements for control software which offers the following functions:
We consider a group of functions that supports the three ICOS thematic centres, ATC, ETC and OTC, in receiving observations from stations, checking data quality and archiving data as the data curation sub-system. ICOS has designed or implemented the following functions in this sub-system:
A group of functions which supports the publication of and access to ICOS data products is considered as a data access sub-system. ICOS plans a Carbon Portal to distribute its data products. For example, the ATC has implemented the following functions in its web portal [9]:
The Carbon Portal will also support discovery and integration of the data from its 3 thematic centres, and the following functionalities are planned [13]:
We consider a group of computational functions that supports the analysis and mining of ICOS data as a data processing sub-system. ICOS will provide the following functions through its Carbon Portal [13]:
ICOS plans the following tools/functions, to be provided by the Carbon Portal, to support the ICOS user community. This group of functions can be considered as a community support sub-system.
Besides the atmospheric thematic centre, ICOS is constructing similar information systems for the other two thematic centres. However, in the long term, ICOS faces the challenges of aggregating and integrating data across the 3 thematic centres and of conducting scientific analyses and experiments on the integrated data. In such areas, EMSO is a step ahead.
EMSO [3], the European Multidisciplinary Seafloor Observatory, is a European network of sea floor observatories for the long-term monitoring of environmental processes related to ecosystems, climate change and geo-hazards. The objectives of the EMSO community are to ensure the technological and scientific framework for the investigation of the environmental processes related to the interaction between the geosphere, biosphere, and hydrosphere, and for sustainable management through long-term monitoring, including real-time data transmission.
EMSO observatories will include a common set of sensors for basic measurements and further sensors for specific purposes defined by users. The common set of instruments comprises seismometers, hydrophones for geophysics, magnetometers, gravity meters, CTD (Conductivity, Temperature, and Depth), current meters, chemical sensors, pressure sensors, and hydrophones for bio-acoustic monitoring. Additionally, laboratory studies are performed on material collected at these sites by sampling devices (e.g., water samplers, sediment cores, traps etc.). The following activities are carried out at EMSO individual observatories [ 15 ] . They are likely supported by computational facilities in order to:
EMSO data collected in experiments at 11 regional sites are stored locally, organised in catalogues or relational databases, and managed by the institutions involved. Some of the EMSO observatories' data from some distributed sites are harvested and archived for the long term at 3 data archives: Ifremer (EUROSITES [10]), UniHB (PANGAEA [11]) and INGV (MOIST [12]). A central archive hosting web-service access to all the databases is planned for the near future. We consider a group of functions provided by MOIST, PANGAEA and EUROSITES that support data quality control and preservation as a data curation sub-system. Key functions include but are not limited to:
The PANGAEA data library and publisher retrieves data from EMSO resources and makes them publicly accessible. We consider a group of functions that facilitates the publication of and access to EMSO data as a data access sub-system. This sub-system includes the following functions:
MOIST provides the following tools/software:
We consider the group of functions that support EMSO users in conducting various tasks as a community support sub-system. We identified the following functions:
EMSO provides advanced technology for data publication and citation through the PANGAEA system. EMSO also offers capabilities for data access, standardisation/harmonisation and visualisation via the MOIST data infrastructure. Presently (in Dec. 2012), data from 3 regional sites are integrated in MOIST, and data from one regional site are integrated in PANGAEA, which additionally offers data from several related or preparatory studies for other EMSO sites. In addition, Ifremer offers access to data from all EUROSITES sites that are shared with EMSO. EMSO has integrated all its operational sites within a common data portal. As a next step, EMSO plans to continue harmonising its vocabularies and terminologies according to SEADATANET standards and aims to offer access to data in a common NetCDF format that is compliant with SEADATANET. Furthermore, EMSO plans to improve standardised access to real-time data via SOS.
Next, we look at EPOS, which places special emphasis on the integration and interoperability problem and tackles it with a new infrastructure design.
EPOS [2], the European Plate Observing System, is a research infrastructure and e-Science for data and observatories on earthquakes, volcanoes, surface dynamics and tectonics. The objective of the EPOS community is to integrate the existing research infrastructures (RIs) in solid Earth science in order to increase the accessibility and usability of multidisciplinary data from seismic and geodetic monitoring networks, volcano observatories, laboratory experiments and computational simulations. EPOS aims to enhance worldwide interoperability in Earth science by establishing a leading integrated European infrastructure and services.
Since only the seismic network of EPOS was relatively mature at the time of writing (Dec. 2012), the following analysis is limited to the requirements of this discipline.
EPOS focuses on the integration and interoperability of existing earth science systems. It does not actively design or implement functionalities for data acquisition and curation; such functionalities are already available in the existing systems. For example, real-time seismic waveform data from more than 500 broadband stations in Europe are collected by the Virtual European Broadband Seismograph Network (VEBSN), using seismic data acquisition systems such as Antelope, SeisComP/SeedLink and SCREAM [16]. A number of data centres, such as ORFEUS and EMSC, are responsible for data quality control and archiving. Data are archived using archive protocols (e.g., ArcLink and mseed2dmc). All data are openly available to the research community through a variety of means, such as web services, direct access and interactive tools. In the long term, the data will be preserved via EUDAT nodes using grid data technology such as iRODS, which stores and replicates the data and also provides unique and persistent identifiers (PIDs) for data granules through a federated handle system [17].
We consider a group of computational functions provided by VEBSN to support data collection as a data acquisition sub-system . The key functions include:
We consider a group of computational functions provided by ORFEUS to support data quality control and archiving as a data curation sub-system . The key functions include:
Non-functional requirements emphasise performance aspects, including security, consistency, productivity, responsibility, reliability, accessibility, availability, scalability and load balancing.
EIDA [21] serves in the EPOS data infrastructure as a consortium of waveform data centres that share a common agreement on data formats, metadata, transfer protocols and interfaces within the consortium. We consider the group of functions provided by the EIDA data centres that support data exchange and discovery as a data access sub-system. The technical architecture consists of the ArcLink middleware, which is installed at each node of the consortium. Each node synchronises its network, station, location and channel metadata every day [21]. On top of ArcLink, each node has built its own infrastructure to exchange waveform data within the consortium over TCP/IP [21]. This amounts to peer-to-peer communication. Federated security is being planned for the individual institutions within EPOS, so that each institute can maintain its own security infrastructure. A single sign-on process is nevertheless desired, probably using some combination of X.509 certificates, Shibboleth and LDAP to provide an apparently seamless AAI (Authentication and Authorisation Infrastructure) [17].
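The daily metadata synchronisation between consortium nodes can be pictured with a minimal sketch. It does not use ArcLink or any real EIDA interface: the `Node` class, the `sync_from` and `daily_sync` names and the sample network and station codes are all hypothetical, and the inventory is reduced to a set of (network, station, location, channel) tuples.

```python
from dataclasses import dataclass, field

# Hypothetical inventory key: (network, station, location, channel).
ChannelKey = tuple[str, str, str, str]


@dataclass
class Node:
    """Illustrative stand-in for one data centre in the consortium."""

    name: str
    # Channels whose metadata this node maintains itself.
    own: set[ChannelKey] = field(default_factory=set)
    # Consortium-wide view built by synchronisation: channel -> holding node.
    routing: dict[ChannelKey, str] = field(default_factory=dict)

    def sync_from(self, peers: list["Node"]) -> None:
        """Rebuild the routing view from this node and all of its peers."""
        self.routing = {key: self.name for key in self.own}
        for peer in peers:
            for key in peer.own:
                self.routing[key] = peer.name


def daily_sync(nodes: list[Node]) -> None:
    """One synchronisation round: every node pulls from every other node."""
    for node in nodes:
        node.sync_from([peer for peer in nodes if peer is not node])


# Made-up codes, used only to exercise the sketch.
node_a = Node("node-a", own={("XX", "STA1", "", "BHZ")})
node_b = Node("node-b", own={("YY", "STA2", "00", "HHZ")})
daily_sync([node_a, node_b])
print(node_a.routing[("YY", "STA2", "00", "HHZ")])  # node-b
```

After one round every node holds the same routing view, so a request arriving at any node can be forwarded to the node that holds the waveform data. There is no central coordinator in this sketch, which is what makes the exchange peer-to-peer.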
To summarise, the functions and embedded computations provided by the data access sub-system include:
For data processing, EPOS data centres such as ORFEUS have a long-standing tradition of data analysis and mining. ORFEUS maintains a repository [23] of software and tools of specific interest to the seismological community, with an emphasis on free software. The required functions for data processing mainly cover the following areas:
The functionality provided by EPOS to support its community can be considered as a community support sub-system. Such functionality includes but is not limited to:
EPOS is designing a new-generation earth science system by applying advanced e-Science technologies to existing, well-developed (seismic and other earth science) systems. However, the project is still in its early stages and the design work is not yet complete. In the next section, we look at LifeWatch, which addresses challenges similar to those of EPOS in many areas and has provided some solutions through its own design study.
LifeWatch [5] is an e-science and technology infrastructure for biodiversity and ecosystem research that supports the scientific community and other users in the public, commercial and policy sectors. The main objective of LifeWatch is to put in place the infrastructure and information systems needed to provide an analytical platform for the use of both existing and new data on biodiversity. Unlike an observatory system such as EISCAT-3D or EURO-Argo, LifeWatch is a comprehensive integration infrastructure for domain-specific scientific data and computation. The emphasis is on a network of services that gives collaborative groups of researchers secure access, across multiple organisations, to biodiversity and related data and to relevant analytical and modelling tools [25].
The guidelines for the specification and implementation of the LifeWatch ICT infrastructure are given by the LifeWatch Reference Model [25], which is built on the ORCHESTRA Reference Model, an architectural framework for distributed processing and geospatial computing, which is itself based on ODP.
The LifeWatch Reference Model describes the LifeWatch architecture, which consists of 4 functional domains. As shown in Figure 2.1 (B), they are the Resource Layer, the Infrastructure Layer, the Composition Layer and the User Layer [25]. The Resource Layer contains the data from sites and collections, together with catalogue services, analysis tools and processing resources that already exist in external networks. The Infrastructure Layer provides mechanisms for uniform access to and integration of the heterogeneous resources in the Resource Layer; its functional components are implemented as services. The Composition Layer provides the tools for intelligent selection and orchestration of services, including workflows, semantic metadata for the discovery of components, and the storage of additional attributes such as provenance and version information. The User Layer provides domain-specific presentation environments and tools for community collaboration, in the form of a generic portal with extended domain-specific and application-specific portlets.
With the data acquisition sub-system absent, the 4 functional domains defined in LifeWatch correspond approximately to the other 4 common sub-systems identified in ENVRI. The mapping is provided in Table 3.1.
Table 3.1: LifeWatch Functional Domains via ENVRI Common Sub-Systems
| LifeWatch Functional Domains | ENVRI Common Sub-systems |
| Resource | Data Curation |
| Infrastructure (Data Access & Discovery & Semantic Mediation) | Data Access |
| Infrastructure (Data Process & Analysis & Modelling) + Composition | Data Processing |
| User | Community Support |
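The mapping in Table 3.1 can also be written down as a simple lookup, which makes the missing data acquisition sub-system explicit. This is only an illustrative sketch: the dictionary keys are abbreviated labels chosen here, the "Infrastructure + Composition" row is split into separate entries for convenience, and `uncovered` is a hypothetical helper.

```python
# Table 3.1 as a lookup from LifeWatch functional domain to ENVRI sub-system.
# Keys are shortened labels invented for this sketch.
LIFEWATCH_TO_ENVRI = {
    "Resource": "Data Curation",
    "Infrastructure (access, discovery, semantic mediation)": "Data Access",
    "Infrastructure (processing, analysis, modelling)": "Data Processing",
    "Composition": "Data Processing",
    "User": "Community Support",
}

# The 5 common sub-systems identified in ENVRI.
ENVRI_SUBSYSTEMS = [
    "Data Acquisition",
    "Data Curation",
    "Data Access",
    "Data Processing",
    "Community Support",
]


def uncovered(mapping: dict[str, str], subsystems: list[str]) -> list[str]:
    """Return the sub-systems that no LifeWatch domain maps onto."""
    covered = set(mapping.values())
    return [name for name in subsystems if name not in covered]


print(uncovered(LIFEWATCH_TO_ENVRI, ENVRI_SUBSYSTEMS))  # ['Data Acquisition']
```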
[25] provides a list of services to be provided by LifeWatch. Unfortunately, the mapping between these services and the 4 LifeWatch architectural layers is missing from the document. For the purpose of analysis, we examine the specification of each service and assign it to the appropriate sub-system.
In the data curation sub-system, LifeWatch considers data, processing tools and instruments that are managed by multiple organisations; in general, LifeWatch cannot dictate their location or configuration. The main function provided from within LifeWatch is a group of source integration services (Service Delegation), which encapsulate external resources for use by the infrastructure [25].
In the data access sub-system, LifeWatch provides the following functions for data discovery and access, in particular across heterogeneous resources [25]:
The services provided by the data processing sub-system include [25]:
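The idea of encapsulating an external resource behind a source integration service can be sketched as a thin wrapper. This is not the LifeWatch service specification: `SourceIntegrationService`, `ExternalSpeciesCatalogue`, `CatalogueDelegate` and all their methods are hypothetical names, and the external catalogue is simulated in memory.

```python
from abc import ABC, abstractmethod


class SourceIntegrationService(ABC):
    """Hypothetical uniform interface that the infrastructure programs against."""

    @abstractmethod
    def describe(self) -> dict[str, str]:
        """Return basic facts about the wrapped resource."""

    @abstractmethod
    def fetch(self, query: str) -> list[dict[str, str]]:
        """Return records from the wrapped resource that match the query."""


class ExternalSpeciesCatalogue:
    """Stand-in for a resource run by another organisation (API invented here)."""

    def __init__(self, records: list[dict[str, str]]) -> None:
        self._records = records

    def lookup_by_name(self, fragment: str) -> list[dict[str, str]]:
        return [r for r in self._records if fragment.lower() in r["name"].lower()]


class CatalogueDelegate(SourceIntegrationService):
    """Wraps the external catalogue without changing where or how it runs."""

    def __init__(self, catalogue: ExternalSpeciesCatalogue, owner: str) -> None:
        self._catalogue = catalogue
        self._owner = owner

    def describe(self) -> dict[str, str]:
        return {"owner": self._owner, "kind": "catalogue"}

    def fetch(self, query: str) -> list[dict[str, str]]:
        # Translate the uniform call into the resource's own operation.
        return self._catalogue.lookup_by_name(query)


catalogue = ExternalSpeciesCatalogue(
    [{"name": "Parus major"}, {"name": "Passer domesticus"}]
)
service = CatalogueDelegate(catalogue, owner="example museum")
print(service.fetch("parus"))  # [{'name': 'Parus major'}]
```

The wrapper leaves the location and configuration of the external resource untouched, which matches the constraint that LifeWatch cannot dictate either.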
Finally, the community support sub-system provides the following functionalities [ 25 ] :
LifeWatch investigated the possibility of integrating various state-of-the-art standardised technologies to provide generic services and operations in support of biodiversity research. On the other hand, this results in a highly abstract design, which is likely to make interpretation and realisation difficult.
To summarise the above observations, Table 3.2 lists the functions and embedded computations provided by the existing research infrastructures. Each function is defined as an interface, which encapsulates a set of required operations or services that act upon an object. Recall that an object in ODP is a model of a real-world entity, characterised by its behaviour and its state. Interactions between objects occur at their interfaces.
The value domain used in the table is defined as follows:
V ={ Yes, No, Unknown, Not Applicable, In consideration }, where
Table 3.2 : The Common Functions and Embedded Computations
| A | Data Acquisition Sub-System |
| No | Functions | Definitions | ICOS | EPOS | EMSO | EISCAT-3D | LifeWatch | EURO-Argo |
| A.1 | Instrument Integration | An interface that provides operations to create, edit and delete a sensor. | Yes | Unknown | Yes | No | Not Applicable | Yes |
| A.2 | Instrument Configuration | An interface that provides operations to set up a sensor or a sensor network. | Yes | Unknown | Yes | Yes | In Consideration | Yes |
| A.3 | Instrument Calibration | An interface that provides operations to control and record the process of aligning or testing a sensor against dependable standards or specified verification processes. | Yes | Unknown | Yes | Yes | In Consideration | Yes |
| A.4 | Instrument Access | An interface that provides operations to read and/or update the state of a sensor. | Yes | Unknown | Unknown | Yes | In Consideration | Unknown |
| A.5 | Configuration Logging | An interface that provides operations to collect configuration information or (run-time) messages from a sensor (or a sensor network) and write them to log files or specified media, for use in routine troubleshooting and incident handling. | Yes | Unknown | Unknown | Unknown | In Consideration | Unknown |
| A.6 | Instrument Monitoring | An interface that provides operations to check the state of a sensor or a sensor network, either periodically or when triggered by events. | Yes | Unknown | Yes | Yes | In Consideration | Yes |
| A.7 | (Parameter) Visualisation | An interface that provides operations to output the values of parameters and measured variables to a display device. | Yes | Unknown | Unknown | Yes | Not Applicable | Unknown |
| A.8 | (Real-Time) (Parameter/Data) Visualisation | A specialisation of (Parameter) Visualisation which is subject to a real-time constraint. | Unknown | Unknown | Unknown | Yes | Not Applicable | Unknown |
| A.9 | Process Control | An interface that provides operations to receive input status, apply a set of logic statements or control algorithms, and generate a set of analogue and digital outputs to change the logic states of devices. | Yes | Unknown | Unknown | Yes | Not Applicable | No |
| A.10 | Data Collection | An interface that provides operations to obtain digital values from a sensor instrument, associating consistent timestamps and necessary metadata. | Yes | Yes | Yes | Yes | Not Applicable | Yes |
| A.11 | (Real-Time) Data Collection | A specialisation of Data Collection which is subject to a real-time constraint. | Yes | Yes | Unknown | Yes | Not Applicable | Yes |
| A.12 | Data Sampling | An interface that provides operations to select a subset of individuals from within a statistical population to estimate characteristics of the whole population. | No | Unknown | Unknown | Yes | Not Applicable | No |
| A.13 | Noise Reduction | An interface that provides operations to remove noise from scientific data. | Yes | Unknown | Unknown | Yes | Not Applicable | Yes |
| A.14 | Data Transmission | An interface that provides operations to transfer data over a communication channel using specified network protocols. | Yes | Yes | Yes | Yes | Not Applicable | Yes |
| A.15 | (Real-Time) Data Transmission | A specialisation of Data Transmission which handles data streams using specified real-time transport protocols. | Yes | Yes | Unknown | Yes | Not Applicable | Yes |
| A.16 | Data Transmission Monitoring | An interface that provides operations to check and report the status of the data transfer process against specified performance criteria. | Yes | Unknown | No | No | Not Applicable | No |
| B | Data Curation Sub-System |
| No | Functions | Definitions | ICOS | EPOS | EMSO | EISCAT-3D | LifeWatch | EURO-Argo |
| B.1 | Data Quality Checking | An interface that provides operations to detect and correct (or remove) corrupt, inconsistent or inaccurate records from data sets. | Yes | Yes | Unknown | Yes | Not Applicable | Yes |
| B.2 | Data Quality Verification | An interface that provides operations to support manual quality checking. | Yes | Unknown | Unknown | Unknown | Not Applicable | Yes |
| B.3 | Data Identification | An interface that provides operations to assign (global) unique identifiers to data contents. | Yes | Yes | Yes | Unknown | Not Applicable | Unknown |
| B.4 | Data Cataloguing | An interface that provides operations to associate a data object with one or more metadata objects which contain data descriptions. | Unknown | Yes | Yes | Unknown | Not Applicable | Unknown |
| B.5 | Data Product Generation | An interface that provides operations to process data against requirement specifications and standardised formats and descriptions. | Yes | Yes | Yes | Yes | Not Applicable | Yes |
| B.6 | Data Versioning | An interface that provides operations to assign a new version to each state change of data, to add and update metadata descriptions for each version, and to select, access or delete a version of data. | Yes | Unknown | Unknown | Unknown | Not Applicable | Unknown |
| B.7 | Workflow Enactment | An interface that provides operations or services to interpret predefined process descriptions and to control the instantiation of processes and the sequencing of activities, adding work items to the work lists and invoking application tools as necessary. | No | Yes | Unknown | Yes | Not Applicable | No |
| B.8 | Data Storage & Preservation | An interface that provides operations to deposit (over the long term) the data and metadata or other supplementary data and methods according to specified policies, and to make them accessible on request. | Yes | Yes | Yes | Yes | Not Applicable | Yes |
| B.9 | Data Replication | An interface that provides operations to create, delete and maintain the consistency of copies of a data set on multiple storage devices. | No | Yes | Unknown | Yes | Not Applicable | Yes |
| B.10 | Replica Synchronisation | An interface that provides operations to export a packet of data from one replica, transport it to one or more other replicas, and import and apply the changes in the packet to an existing replica. | No | Unknown | No | Unknown | Not Applicable | Yes |
| C | Data Access Sub-System |
| No | Functions | Definitions | ICOS | EPOS | EMSO | EISCAT-3D | LifeWatch | EURO-Argo |
| C.1 | Access Control | An interface that provides operations to approve or disapprove access requests based on specified access policies. | Unknown | Yes | Unknown | Yes | Unknown | Unknown |
| C.2 | Resources Annotation | An interface that provides operations to create, change or delete a note in any form of text, and to associate it with a computational object. | No | No | No | No | Yes | No |
| C.3 | (Data) Annotation | A specialisation of Resource Annotation which associates an annotation with a data object. | Yes | Yes | Yes | No | Yes | No |
| C.4 | Metadata Harvesting | An interface that provides operations to (regularly) collect metadata (in agreed formats) from different sources. | Unknown | Unknown | Yes | No | Unknown | No |
| C.5 | Resource Registration | An interface that provides operations to create an entry in a resource registry and insert a resource object, or a reference to a resource object, in specified representations and semantics. | | | | | | |
| C.6 | (Metadata) Registration | A specialisation of Resource Registration which registers a metadata object in a metadata registry. | Unknown | Yes | Yes | No | Unknown | No |
| C.7 | (Identifier) Registration | A specialisation of Resource Registration which registers an identifier object in an identifier registry. | Unknown | Unknown | Yes | No | Unknown | No |
| C.8 | (Sensor) Registration | A specialisation of Resource Registration which registers a sensor object in a sensor registry. | Unknown | Unknown | Yes | No | Yes | No |
| C.9 | Data Conversion | An interface that provides operations to convert data from one format to another. | Yes | Yes | Yes | Yes | Yes | Yes |
| C.10 | Data Compression | An interface that provides operations to encode information using fewer bits by identifying and eliminating statistical redundancy. | No | No | No | No | Yes | No |
| C.11 | Data Publication | An interface that provides operations to provide clean, well-annotated, anonymity-preserving datasets in a suitable format and, following specified data-publication and sharing policies, to make the datasets accessible to the public, to those who agree to certain conditions of use, or to individuals who meet certain professional criteria. | Yes | Unknown | Yes | Unknown | Yes | Yes |
| C.12 | Data Citation | An interface that provides operations to assign an accurate, consistent and standardised reference to a data object, which can be cited in scientific publications. | No | Unknown | Yes | No | Unknown | No |
| C.13 | Semantic Harmonisation | An interface that provides operations to unify similar data (knowledge) models based on the consensus of collaborating domain experts, in order to achieve better data (knowledge) reuse and semantic interoperability. | No | Yes | Yes | No | Yes | No |
| C.14 | Data Discovery and Access | An interface that provides operations to retrieve requested data from a data resource by using suitable search technology. | Yes | Yes | Yes | Yes | Yes | Unknown |
| C.15 | Data Visualisation | An interface that provides operations to display visual representations of data. | Yes | Yes | Yes | Yes | Yes | Yes |
| D | Data Processing Sub-System |
| No | Functions | Definitions | ICOS | EPOS | EMSO | EISCAT-3D | LifeWatch | EURO-Argo |
| D.1 | Data Assimilation | An interface that provides operations to combine observational data with output from a numerical model to produce an optimal estimate of the evolving state of the system. | Yes | Unknown | Unknown | Unknown | Unknown | Not Applicable |
| D.2 | Data Analysis | An interface that provides operations to inspect, clean and transform data, and to provide data models, with the goal of highlighting useful information, suggesting conclusions and supporting decision making. | Yes | Yes | Yes | Yes | Yes | Not Applicable |
| D.3 | Data Mining | An interface that provides operations to support the discovery of patterns in large data sets. | Yes | Unknown | No | No | Yes | Not Applicable |
| D.4 | Data Extraction | An interface that provides operations to retrieve data out of (unstructured) data sources, including web pages, emails, documents, PDFs, scanned text, mainframe reports and spool files. | Yes | Unknown | Unknown | Yes | Yes | Not Applicable |
| D.5 | Scientific Modelling and Simulation | An interface that provides operations to support the generation of abstract, conceptual, graphical or mathematical models, and to run an instance of the model. | Yes | Yes | Unknown | Unknown | Yes | Not Applicable |
| D.6 | (Scientific) Workflow Enactment | A specialisation of Workflow Enactment which supports the composition and execution of a series of computational or data manipulation steps (a workflow) in a scientific application. Important processes should be recorded for provenance purposes. | No | Unknown | No | No | Yes | Not Applicable |
| D.7 | (Scientific) Visualisation | An interface that provides operations to graphically illustrate scientific data to enable scientists to understand, illustrate and gain insight from their data. | Unknown | Yes | Yes | Yes | Yes | Not Applicable |
| D.8 | Service Naming | An interface that provides operations to encapsulate the implemented naming policy for service instances in a service network. | No | Unknown | No | No | Yes | Not Applicable |
| D.9 | Data Processing | An interface that provides operations to initiate a calculation and manage the outputs to be returned to the client. | No | Unknown | No | No | Yes | Not Applicable |
| D.10 | Data Processing Monitoring | An interface that provides operations to check the state of a running service instance. | No | Unknown | No | No | Yes | Not Applicable |
| E | Community Support Sub-System |
| No | Functions | Definitions | ICOS | EPOS | EMSO | EISCAT-3D | LifeWatch | EURO-Argo |
| E.1 | Authentication | An interface that provides operations to verify a credential of a user. | Yes | Yes | Unknown | Yes | Yes | Unknown |
| E.2 | Authorisation | An interface that provides operations to specify access rights to resources. | Yes | Yes | Yes | Yes | Yes | Yes |
| E.3 | Accounting | An interface that provides operations to measure the resources a user consumes during access, for the purposes of capacity and trend analysis and cost allocation. | No | Unknown | Yes | No | Unknown | No |
| E.4 | (User) Registration | A specialisation of Resource Registration which registers a user in a user registry. | No | Unknown | Unknown | No | Yes | Unknown |
| E.5 | Instant Messaging | An interface that provides operations for the quick transmission of text-based messages from sender to receiver. | No | Unknown | No | No | Yes | No |
| E.6 | (Interactive) Visualisation | An interface that provides operations to enable users to control some aspect of the visual representation of information. | No | Yes | Yes | Yes | Yes | No |
| E.7 | Event Notification | An interface that provides operations to deliver messages triggered by predefined events. | No | Yes | Yes | No | Yes | No |
Note that the italicised text in the table marks a specialisation of a more general function.
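A comparison matrix of this kind lends itself to simple automated queries. The sketch below encodes the value domain as an enumeration and checks four rows copied from Table 3.2. The `Support` enum, the helper names and the rule used for "common" (every RI for which the function is applicable answers Yes) are assumptions made for illustration, not criteria stated in this report.

```python
from enum import Enum


class Support(Enum):
    """Value domain of Table 3.2 (spelling follows the table cells)."""

    YES = "Yes"
    NO = "No"
    UNKNOWN = "Unknown"
    NOT_APPLICABLE = "Not Applicable"
    IN_CONSIDERATION = "In Consideration"


# Column order of Table 3.2.
RIS = ["ICOS", "EPOS", "EMSO", "EISCAT-3D", "LifeWatch", "EURO-Argo"]

# Four rows copied from Table 3.2; the remaining rows are omitted for brevity.
ROWS = {
    "A.10 Data Collection": ["Yes", "Yes", "Yes", "Yes", "Not Applicable", "Yes"],
    "A.16 Data Transmission Monitoring": ["Yes", "Unknown", "No", "No", "Not Applicable", "No"],
    "C.9 Data Conversion": ["Yes", "Yes", "Yes", "Yes", "Yes", "Yes"],
    "E.3 Accounting": ["No", "Unknown", "Yes", "No", "Unknown", "No"],
}


def providers(row: list[str]) -> list[str]:
    """Names of the RIs whose cell in this row is Yes."""
    return [ri for ri, cell in zip(RIS, row) if Support(cell) is Support.YES]


def is_common(row: list[str]) -> bool:
    """Assumed rule: all RIs to which the function applies answer Yes."""
    applicable = [Support(c) for c in row if Support(c) is not Support.NOT_APPLICABLE]
    return bool(applicable) and all(value is Support.YES for value in applicable)


common = [name for name, row in ROWS.items() if is_common(row)]
print(common)  # ['A.10 Data Collection', 'C.9 Data Conversion']
```

Constructing `Support(cell)` raises a `ValueError` for any cell outside the value domain, so the same code doubles as a consistency check on the table entries.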
For the purpose of definition, most operations are defined as activities performed on an individual object. Sometimes bulk operations are required to handle a collection of objects for performance reasons. When required, bulk functions can be added on without fundamentally changing the concept of a function.
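The relationship between a single-object operation and its bulk add-on can be sketched using Data Identification (B.3) as an example. The `DataIdentification` class and its method names are hypothetical, and random UUIDs merely stand in for whatever identifier scheme an RI would actually use.

```python
import uuid
from typing import Iterable


class DataIdentification:
    """Illustrative interface for function B.3 (not taken from any RI)."""

    def __init__(self) -> None:
        self._ids: dict[str, str] = {}

    def assign(self, data_name: str) -> str:
        """Single-object operation: give one data object an identifier."""
        if data_name not in self._ids:
            # Placeholder scheme: a random UUID in hexadecimal form.
            self._ids[data_name] = uuid.uuid4().hex
        return self._ids[data_name]

    def assign_bulk(self, data_names: Iterable[str]) -> dict[str, str]:
        """Bulk add-on: the same semantics applied to a collection."""
        return {name: self.assign(name) for name in data_names}


service = DataIdentification()
identifiers = service.assign_bulk(["profile-001", "profile-002", "profile-001"])
print(len(identifiers))  # 2
```

Because the bulk operation is defined in terms of the single-object one, adding it does not alter what the function means. A real implementation might batch the underlying requests for performance while keeping this contract.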
The consistency and completeness of the above list of functions will be examined in a follow-on task, T3.3, where we will use ODP activity diagrams to model each process and to identify missing functional elements. For example, an Authorisation function in a community support sub-system, which defines the access policies, implies an Access Control function in a data access sub-system to enforce those policies. An Access Control function will therefore be added wherever the evaluation finds it necessary.
The above analysis gives us a better understanding of the common architectural characteristics and functionalities by examining the ENV RIs from the Engineering and Computational Viewpoints. In this section, we look at the ENV RIs from the ODP Enterprise Viewpoint.
The Enterprise Viewpoint is concerned with the organisational and social context and with scientific processes. It captures the purpose, scope and policies of a system. To do so, the system is represented by one or more enterprise objects within a community, and by the roles in which these objects are involved. Using these concepts, we identify below the common communities of the ENV RIs and their community roles.
We distinguish 4 communities: Data Acquisition, Data Management, Data Service Provision, and Data User. The division into 4 communities is based on their main objectives and activities.
We can now examine which roles each community may have. Table 4.1 lists common community roles, which are either identified from explicit descriptions in the documentation of the ENV RIs or derived from the computational functions provided by the ENV RIs.
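The dependency between the two functions can be illustrated with a minimal sketch in which Authorisation (E.2) records the policies and Access Control (C.1) consults them for each request. The class names, method names and the (role, action, resource) policy shape are assumptions for illustration only.

```python
class Authorisation:
    """Community support side: specifies access rights to resources."""

    def __init__(self) -> None:
        # Each policy is a hypothetical (role, action, resource) triple.
        self._policies: set[tuple[str, str, str]] = set()

    def grant(self, role: str, action: str, resource: str) -> None:
        self._policies.add((role, action, resource))

    def permits(self, role: str, action: str, resource: str) -> bool:
        return (role, action, resource) in self._policies


class AccessControl:
    """Data access side: approves or rejects requests using the policies."""

    def __init__(self, authorisation: Authorisation) -> None:
        self._authorisation = authorisation

    def check(self, role: str, action: str, resource: str) -> bool:
        # Enforcement only; the policies themselves are defined elsewhere.
        return self._authorisation.permits(role, action, resource)


policies = Authorisation()
policies.grant("external scientist", "read", "published dataset")
gate = AccessControl(policies)
print(gate.check("external scientist", "read", "published dataset"))   # True
print(gate.check("external scientist", "delete", "published dataset"))  # False
```

`AccessControl` cannot be constructed without an `Authorisation` object, which mirrors the implication described above: defining access policies is only useful if some function enforces them.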
Table 4.1 : The Common Roles
| RA | Data Acquisition Community |
| No | Roles | ICOS | EPOS | EMSO | LifeWatch | EISCAT-3D | EURO-Argo |
| RA.1 | Ecosystem and environmental resource managers | Unknown | Unknown | Unknown | Yes | Unknown | Unknown |
| RA.2 | Conservation managers | Unknown | Unknown | Unknown | Yes | Yes | Yes |
| RA.3 | Designer for measurements and monitoring models | Yes | Unknown | Yes | No | Yes | No |
| RA.4 | Technician for the development and deployment of the sensor and sensor network | Yes | Unknown | Yes | No | Yes | Yes |
| RA.5 | Technician for the operation and maintenance of the sensor and sensor network | Yes | Yes | Yes | No | Yes | Yes |
| RA.6 | Observer/Measurer/Data collector | Yes | Yes | Yes | No | Yes | Yes |
| RA.7 | Research scientists in data quality control | Yes | Unknown | Unknown | No | Unknown | Yes |
| RB | Data Management Community |
| No | Roles | ICOS | EPOS | EMSO | LifeWatch | EISCAT-3D | EURO-Argo |
| RB.1 | Storage Manager | Yes | Yes | Yes | No | Yes | Yes |
| RB.2 | Curator/Data Manager | Yes | Yes | Yes | No | Yes | Yes |
| RB.3 | Data Publisher | Yes | Yes | Yes | Yes | No | No |
| RB.4 | External Data Provider | Yes | Unknown | Yes | No | No | No |
| RC | Data Service Provision Community |
| No | Roles | ICOS | EPOS | EMSO | LifeWatch | EISCAT-3D | EURO-Argo |
| RC.1 | Data Provider | Yes | Yes | Yes | Yes | No | No |
| RC.2 | Data Service Provider | Yes | Yes | Yes | Yes | No | Yes |
| RC.3 | Other RIs and networks with interests in overlapping domains | Yes | Yes | Yes | Yes | Yes | Yes |
| RD | Data User Community |
| No | Roles | ICOS | EPOS | EMSO | LifeWatch | EISCAT-3D | EURO-Argo |
| RD.1 | Internal scientist/researcher who performs in-house experiments/analyses | Yes | Unknown | Yes | Yes | Yes | No |
| RD.2 | External scientist/researcher | Yes | Yes | Yes | Yes | No | Yes |
| RD.3 | Technologist/engineer | Yes | Yes | Yes | Yes | Yes | No |
| RD.4 | Education/trainee | Yes | Yes | Yes | Yes | No | No |
| RD.5 | Policy/decision maker | Yes | Yes | Yes | Yes | Yes | Yes |
| RD.6 | Private sector (Industry investor/consultant) | Yes | Yes | Yes | Yes | No | No |
| RD.7 | General public/media/citizen (scientists) | Yes | Yes | Yes | Yes | Yes | Yes |
Owing to a lack of resources, the analysis here is very brief. We leave the unfilled cells for future exploration.
The goal of this investigation was to identify the common requirements of the ENV RIs. Throughout the study, ODP has been used as the analysis framework, serving as a uniform platform for interpretation and discussion to ensure a unified understanding. First, from the ODP Engineering Viewpoint, the architectural characteristics of the RIs have been examined, and 5 common sub-systems have been identified: data acquisition, curation, access, processing and community support. Secondly, from the ODP Computational Viewpoint, we looked at each of the 6 RIs in detail and identified the common functions and embedded computations they provide. Matrices have been used for comparison, and definitions of the functionalities have been provided. Finally, from the ODP Enterprise Viewpoint, we have identified 4 common communities and derived the community roles.
The results of this study can serve as input to a design or implementation model. In the light of this analysis, common services can be provided that are widely applicable to various environmental research infrastructures.
There are several elements which could be extended in future work:
Much help and many resources were obtained from the following people and projects, to whom we extend our thanks:
[9] B. Gustavsson, "Visualization for EISCAT 3D," EISCAT-3D Deliverable, 2006.
[10] Euro-Argo website: http://www.euro-argo.eu/About-us, Retrieved Dec. 2012.
[11] "Argo Data Management," in EURO-Argo 1st User workshop, Presentation, Southampton, 2008.
[15] EMSO website: http://www.esonet-noe.org/Gallery/Movies/Deep-sea-observatories-internet-in-the-ocean, Retrieved Dec. 2012.
[16] "Orfeus FDSN Report 2004," Report, 2004.
[17] P. Martin, "EPSO Answer," ENVRI wiki notes, unpublished, 2012.
[18] Orfeus website: Data Quality: http://www.orfeus-eu.org/Data-info/dataquality.html, Retrieved Dec. 2012.
[19] P. Martin, "EPSO Use Case," ENVRI wiki notes, unpublished, 2012.
[20] K. Jeffery and T. L. Hoffmann, "Report on EPOS e-Infrastructure Requirements," Report, 2011.
[21] M. B. de Bianchi and J. Saul, "EIDA: European Integrated Data Archives," Presentation, 2012.
[22] D. Bailo and T. L. Hoffmann, "D6.2 Annex 2: EPOS use cases," Report, 2011.
[23] Orfeus website: Software: http://www.orfeus-eu.org/Software/softwarelib.html, Retrieved 2012.
[2] EPOS: http://www.epos-eu.org/
[4] EISCAT-3D: http://www.eiscat3d.se/
[5] LifeWatch: http://www.lifewatch.eu/
[6] Euro-Argo: http://www.euro-argo.eu/
[7] MyOcean: http://www.myocean.eu.org/
[8] SeaDataNet: http://www.seadatanet.org/
[9] ICOS web portal: https://icos-atc-demo.lsce.ipsl.fr
[10] EUROSITES: http://www.eurosites.info/about.php
[11] PANGAEA: http://www.pangaea.de/
[12] MOIST: http://moist.rm.ingv.it/
[13] EMSO common data catalogue: http://dataportals.pangaea.de/emso
[14] panFMP: www.panfmp.org
[15] PANGAEA data curation and management: http://wiki.pangaea.de/wiki/Project_data_management
[16] PANGAEA Pan2Applic: http://wiki.pangaea.de/wiki/Pan2Applic
[17] PANGAEA PanTool: http://wiki.pangaea.de/wiki/PanTool
[18] PANGAEA Split2Events: http://wiki.pangaea.de/wiki/Split2Events
[19] PANGAEA PanPlot: http://wiki.pangaea.de/wiki/PanPlot
[20] PANGAEA PanMap: http://wiki.pangaea.de/wiki/PanMap
[21] PANGAEA GIS: http://wiki.pangaea.de/wiki/GIS
[22] PANGAEA data import tool: http://wiki.pangaea.de/wiki/Import
[23] PANGAEA advanced metadata discovery: http://www.pangaea.de/
[24] PANGAEA data citation: http://wiki.pangaea.de/wiki/Citation
[25] PANGAEA data policy: http://wiki.pangaea.de/wiki/Data_policy
[26] PANGAEA data quality control: http://wiki.pangaea.de/wiki/Data_policy#Quality_assurance
[28] PANGAEA online metadata & data submission service: http://www.pangaea.de/submit/
[29] PANGAEA online curation editor: http://wiki.pangaea.de/wiki/4D
[30] EMSC Earthquake Notification Service: http://www.emsc-csem.org/Earthquake/seismicity/real_time.php