Members of the ENVRI+ community are sometimes confused by the term e-Infrastructure. What are e-Infrastructures? In the framework of the Joint Information Systems Committee (JISC) e-infrastructure programme, e-Infrastructures are defined in terms of integration of networks, grids, data centers and collaborative environments, and are intended to include supporting operation centers, service registries, credential delegation services, certificate authorities, training and help-desk services.
The European Strategy Forum on Research Infrastructures (ESFRI) presented the first European roadmap for new, large-scale Research Infrastructures[1]. These are modeled as layered hardware and software systems which support sharing of a wide spectrum of resources, spanning from networks, storage, computing resources, and system-level middleware software, to structured information within collections, archives, and databases. The e-Infrastructure Reflection Group (e-IRG) has proposed a similar vision. In particular, it envisions e-Infrastructures where the principles of global collaboration and shared resources are intended to encompass the sharing needs of all research activities.[2]
There is a long tradition in Europe of developing e-Infrastructures and of connecting them into continent-wide e-Infrastructures, so that researchers from different countries can work together using the same computers. Important pan-European large-scale e-Infrastructures include EGI, EUDAT, PRACE, GÉANT, and OpenAIRE. Each has its own focus area: EGI provides pan-European federated computing and storage resources; PRACE federates pan-European High Performance Computing (HPC) resources; EUDAT is a so-called data infrastructure, a digital infrastructure promoting data sharing and consumption. It is one type of e-Infrastructure, but it focuses on providing services and technology to support the life-cycle of data. GÉANT is the pan-European data network for the research and education community, interconnecting national research and education networks (NRENs) across Europe. OpenAIRE is a network of Open Access repositories, archives and journals that support Open Access policies.
These e-Infrastructures provide generic IT resources and service solutions supporting various European scientific research activities. The benefits of adopting and making good use of these resources for a scientific community and a research infrastructure include, but are not limited to:
ENVRI+ is already collaborating with pan-European e-Infrastructures such as EGI and EUDAT. In WP9, EGI will provide computing and storage resources for deploying services developed by the ENVRI+ JRA WPs. EUDAT services have also been chosen by some of the Research Infrastructures for data management. However, interoperable access to these e-Infrastructures remains a challenging issue. In this sense, ENVRI+ is in a good position to provide real use cases and requirements to influence the future implementations of these e-Infrastructures.
This section gives an overview of current e-infrastructure for European academic research, along with some of the currently anticipated developments and innovations going forward. The focus is on pan-European infrastructure, reflecting the scale of the Research Infrastructures (RI) represented in ENVRI+. In general all of the current European scale e-infrastructures seek to include partners in all European Member States, thereby providing a one-stop-shop for continental scale interactions while at the same time providing access to local/regional activities in the individual Member States. At a European level, the e-infrastructure is often presented as a layered model, with the layers representing:
We follow this view and focus on the first 3 layers here, as layer 4 is probably best represented by ENVRI+ and its member projects directly.
The review of this topic will be organised by in consultation with the following volunteers: . They will partition the exploration and gathering of information and collaborate on the analysis and formulation of the initial report. Record details of the major steps in the change history table below. For further details of the complete procedure see item 4 on the Getting Started page.
Note: Do not record editorial / typographical changes. Only record significant changes of content.
| Date | Name | Institution | Nature of the information added / changed |
|---|---|---|---|
The model for research and education networking in Europe is of a single national entity per country (the National Research and Education Network – NREN) connecting to a common pan-European backbone infrastructure. In combination these networks provide a powerful tool for international collaborative research projects – particularly those with demanding data transport requirements. NRENs are able to connect individual sites to their high-bandwidth infrastructures or arrange point-to-point services for bilateral collaborations. GÉANT provides a single point of contact to coordinate the design, implementation and management of network solutions across the NREN and GÉANT domains.
The GÉANT network (like the majority of NRENs) has a hybrid structure – operating a dark-fibre network and transmission equipment wherever possible and leasing wavelengths from local suppliers in more challenging regions. This structure allows the operation of both IP and point-to-point services on a common footprint. Since 2013, GÉANT has migrated to a new generation of both transmission and routing equipment platforms. The resulting network offers a significant increase in available bandwidth along with an improved range of network services. GÉANT’s pre-provisioned capacity on each of the core network trunks (covering western and central Europe) is around 500 Gbps, and an advanced routing/switching platform delivers IP, VPN and point-to-point services with greater flexibility to all European NRENs.
The GÉANT project provides more than just a physical network infrastructure. Its service development and research activities directly address the needs of the R&E community, both by providing advanced international services on the NREN and GÉANT backbones and by developing software and middleware to target network-related issues from campus to global environments. The GÉANT backbone currently offers:
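To give a feel for what a trunk capacity of around 500 Gbps means for data-intensive collaborations, the following minimal Python sketch estimates an idealised transfer time for a dataset. The function name, the 1 PB example dataset and the `efficiency` parameter are illustrative assumptions, not GÉANT figures; real transfers share the trunk with other traffic and are usually limited by end systems and protocols.

```python
def transfer_time_hours(dataset_bytes: float, link_gbps: float,
                        efficiency: float = 1.0) -> float:
    """Idealised time, in hours, to move a dataset over a link.

    dataset_bytes: dataset size in bytes (decimal units, 1 PB = 1e15 bytes).
    link_gbps:     nominal link capacity in gigabits per second.
    efficiency:    assumed fraction of the capacity actually usable (0 < e <= 1).
    """
    if link_gbps <= 0:
        raise ValueError("link_gbps must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in the interval (0, 1]")
    bits = dataset_bytes * 8                        # bytes -> bits
    usable_bps = link_gbps * 1e9 * efficiency       # Gbps -> usable bits per second
    return bits / usable_bps / 3600                 # seconds -> hours


PETABYTE = 1e15  # bytes, decimal definition

if __name__ == "__main__":
    # Hypothetical example: 1 PB over a fully available 500 Gbps trunk.
    print(f"{transfer_time_hours(PETABYTE, 500):.2f} hours")
```

Under these assumptions a petabyte takes a little under four and a half hours at full line rate; halving the assumed efficiency doubles the estimate.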
Services under development in GÉANT include:
For full details of GÉANT services see http://www.geant.org/Services.
GÉANT operates an infrastructure connecting NRENs in the vast majority of countries across Europe. These NRENs each have extensive national infrastructure and provide connections to universities, research centres and other not-for-profit institutions.
Seven new NRENs from Eastern Europe joined GÉANT in 2013 and will be working to improve their international interconnection.
A full list of GÉANT NRENs can be found at http://www.geant.org/About/Membership/Pages/MAandGAreps.aspx.
In addition to its pan-European reach, the GÉANT network has extensive links to networks in other world regions including North America, Latin America, the Caribbean, North Africa and the Middle East, Southern and Eastern Africa, the South Caucasus, Central Asia and the Asia-Pacific Region. There is also ongoing work to connect to Western and Central Africa.
A full list of countries that interconnect with GÉANT can be found at http://www.geant.org/Networks/Global_networking/Pages/Home.aspx.
PRACE (http://www.prace-ri.eu) provides high-end computing resources for leading European science. The largest 3-5 PRACE systems are generally referred to as “Tier-0”. These systems are in general significantly larger than other European computer systems accessible to researchers. The resources are accessible to applicants with successful proposals submitted in response to Calls for Proposals. The "Guide for Applicants to Tier-0 Resources" on the PRACE website (http://www.prace-ri.eu/HPC-access) provides detailed information on preparing applications and the peer review process that follows the submission. Post-award obligations include a final report and acknowledgement of PRACE support. PRACE publishes twice-yearly Calls for Proposals, in February and in September. Preparatory access proposals, allowing users to develop software or test out novel ideas, are accepted at any time, with access granted on a quarterly basis.
The first phase of PRACE ended in mid-2015. PRACE is now in the second phase, during which prototypes for the three most promising solutions will be built. Phase three, during which a pre-commercial small-scale product will be developed, is expected to start in early 2016.
In addition to providing access to very large Tier-0 HPC resources, PRACE also pools some national level (Tier-1) resources and makes them available through specific calls.
PRACE implementation projects include a range of activities that are likely to be interesting for the biological and medical science communities: training courses, software development, technology tracking and access to prototype resources. Three implementation projects have already been carried out (PRACE 1IP-3IP) and the fourth (PRACE 4IP) was funded in March 2015. PRACE 4IP aims to contribute to biomedical application development, training needs and data-intensive computing requirements, to name a few examples.
It is important to note that the explosion in the data generation capacity of scientific equipment and sensors is creating a new class of researchers who have different demands in terms of their use of computing power and of how and where their data is stored. Traditionally, users needed PRACE to develop tools to generate data, for modelling and simulations, which had to be kept to compare with other models. In contrast, the new type of user wants to analyse data generated elsewhere and tends not to have a strong background in computing. It is important to understand these users’ requirements, in particular concerning how the data will be used, preserved and stored in the long term.
The European Grid Infrastructure (EGI; http://www.egi.eu/) is a collaboration of computing resource providers that delivers integrated computing services to European researchers. It is a publicly funded e-infrastructure that gives European scientists access to more than 500,000 logical CPUs, more than 290 PB of disk space, 180 PB of tape space, and data. EGI currently has a user base of about 32,000, and more than 1.5 million compute jobs are processed every day.
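As a rough sense of scale, the figures quoted above can be turned into average rates. The short Python sketch below does only that arithmetic; the function names are illustrative, and because the source figures are stated as "more than", the results are order-of-magnitude averages rather than measured throughput.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def jobs_per_second(jobs_per_day: float) -> float:
    """Average job rate per second implied by a daily job count."""
    return jobs_per_day / SECONDS_PER_DAY


def jobs_per_cpu_per_day(jobs_per_day: float, logical_cpus: float) -> float:
    """Average number of jobs per logical CPU per day."""
    if logical_cpus <= 0:
        raise ValueError("logical_cpus must be positive")
    return jobs_per_day / logical_cpus


if __name__ == "__main__":
    # Figures taken from the text: about 1.5 million jobs/day, 500,000 logical CPUs.
    print(f"{jobs_per_second(1_500_000):.1f} jobs per second on average")
    print(f"{jobs_per_cpu_per_day(1_500_000, 500_000):.1f} jobs per logical CPU per day")
```

With the quoted numbers this works out to roughly 17 jobs per second, or about three jobs per logical CPU per day, averaged over the whole infrastructure.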
EGI supports computing (including closely coupled parallel computing normally associated with HPC), compute workload management services, data access and transfer, data catalogues, storage resource management, and other core services such as user authentication, authorisation and information discovery that enable other activities to flourish. Resources are provided by over 350 resource centres that are distributed across 52 countries in Europe, the Asia-Pacific region, Canada, and Latin America. User communities gain access to EGI services by partnering with EGI, either directly through federating their own resource centres, or indirectly by accessing national or regional resource centres that already support their communities.
Existing high-level services:
High-level services under development:
Project positioning with respect to similar initiatives
EGI has matured its portfolio of solutions that help accelerate data-intensive research. The most relevant developments in EGI for ENVRI+ infrastructures are:
1. Launch of EGI Federated Cloud
After nearly two years of development the EGI community opened the ‘EGI Federated Cloud’ as a production infrastructure in May 2014. The new infrastructure (http://go.egi.eu/cloud) is based on open standards and offers unprecedented versatility and cloud services tailored for European researchers. It is a connected grid of institutional clouds built around open standards. With the EGI Federated Cloud, researchers and research communities can:
Since its launch, the EGI Federated Cloud has attracted more than 35 use cases from various scientific projects, research teams and communities. Among these there are several applications from environmental sciences:
2. Simplifying access to EGI for the ‘long tail of science’
While processes to gain access to EGI are well established across the NGIs for entire user communities, individual researchers and small research teams sometimes struggle to access compute and storage resources from the network of NGIs for the implementation of ‘big data applications’. Recognising the need for simpler and harmonised access for individual researchers and small research groups, i.e. the ‘long tail of science’, the EGI community started to design and prototype a new platform in October 2014. The platform will provide integrated services from the NGIs to those researchers and small research teams who work with large data but have limited or no expertise in using distributed systems. The platform will lower the barrier to accessing grid and cloud infrastructure via a centrally operated access management portal and an open set of virtual research environments designed for the most frequent use cases. The project defines security policies and implements new security services that enable personalised, secure and yet simple access to e-infrastructure resources via the virtual research environments for individual users. The platform will authenticate users via the eduGAIN federation and other username–password based mechanisms, complementing the long-established certificate-based access mechanisms. The prototype system's launch date is December 2015 (https://access.egi.eu/start).
3. End of EGI-InSPIRE, start of EGI-Engage
For nearly its first five years, EGI was supported by the ‘EGI-Integrated Sustainable Pan-European Infrastructure for Research in Europe’ (EGI-InSPIRE) FP7 project. EGI-InSPIRE came to an end in December 2014. A new initiative, EGI-Engage, was funded by the European Commission under the H2020 framework programme. EGI-Engage was launched in March 2015 with a total budget of 8.7 million euros for 2.5 years.
One of the main objectives of EGI-Engage is to expand the capabilities of EGI (e.g. cloud and data services) and the spectrum of its user base by engaging with large Research Infrastructures (RIs), the long tail of science, and industry/SMEs. The key engagement instrument for this is a network of eight Competence Centres, in which National Grid Initiatives (NGIs), user communities, and technology and service providers join forces to collect requirements, integrate community-specific applications into state-of-the-art services, foster interoperability across e-infrastructures, and evolve services through a user-centric development model. The Competence Centres will provide state-of-the-art services, training, technical user support and application co-development to specific scientific domains. The following science communities have dedicated Competence Centres in EGI-Engage:
1) Earth-science research (EPOS)
2) EISCAT 3D
3) Life-science research (ELIXIR)
4) Biodiversity and ecosystem research (LifeWatch)
5) Biobanking and medical research (Biobanking and Biomolecular Research Infrastructure, BBMRI-ERIC)
6) Structural biology and brain imaging research (MoBrain supporting WeNMR and Integrating Structural Biology – INSTRUCT)
7) Arts and Humanities (DARIAH)
8) Disaster mitigation
The stimulus provided by publicly funded research organisations (CERN, EMBL and ESA) via the Helix Nebula initiative has led to the creation of its first product: the Helix Nebula Marketplace (HNX, http://hnx.helix-nebula.eu/) with considerable investment by the commercial cloud service providers and the active engagement of the publicly funded European e-infrastructures GÉANT and EGI.
HNX is still in its early phase but has already shown the value of such public/private partnerships where, driven by the common procurement needs of research organisations, Europe’s IT industry has shown it is willing to invest. The demand-side has continued its deployment testing and procurement activities with the Helix Nebula Marketplace (HNX) product during 2014. With approximately 30% of its membership being SMEs, the Helix Nebula initiative is providing a channel by which innovative cloud service companies can work with major IT companies and public research organisations. Helix Nebula has published an update to the strategic plan (http://www.helix-nebula.eu/publications/deliverables/d92-strategic-plan-scientific-cloudcomputing-infrastructure-europe-three) that launched the initiative in 2011 and a roadmap for future developments (http://www.helix-nebula.eu/publications/deliverables/d91-roadmap-of-future-developments). The intention is to spread the scope of the Helix Nebula initiative to become a forum between the supply-side and the demand-side where issues of common interest (such as procurement models, contractual frameworks, service platforms etc.) can be addressed.
The rate of adoption of open source cloud software stacks, in particular OpenStack, means they are rapidly becoming de facto standards in both the enterprise and public sector domains. EGI has continued to develop the prototype EGI Federated Cloud, which has been tested with several user communities and integrated into HNX. The integration has been tested with flagship applications from CERN and ESA. The experience gained from this work has shown that integrating publicly funded e-infrastructures with commercial services to provide a combined platform on which to build new services has a clear value for users. Two developments would provide the basis for further integration:
The provision of services via supplier-funded resources is the foundation of national and European e-infrastructures. The concept of a marketplace with the ability for users to choose from a range of services and suppliers can offer a practical implementation of the concept of an e-infrastructure commons. The exchange approach coupled with the pay-per-use model, as championed by Helix Nebula, is being considered by a number of e-infrastructures.
Through the work of initiatives such as Helix Nebula it has become clear that it is essential to separate the roles of end-user (the researcher making use of the services) and customer (the organisation sponsoring the consumption of the services by the end-user) and ensure that services, both commercial and publicly funded, are offered free at the point of use.
Publicly funded e-infrastructures are investigating a new brokering role to facilitate access to commercial services for their user base. A key attraction of the brokerage role is that it offers a new revenue stream. While these business model innovations should be encouraged, it is important that competition between brokers does not lead to renewed fragmentation of the e-infrastructure commons. An aspect of brokerage which is often underestimated by the publicly funded e-infrastructures is the necessary financial engagement and liability. Experienced financial brokers from utility markets could help ensure the good governance of the exchange and take on board some of the financial risks from users and suppliers to accelerate the expansion of the market.
EUDAT is a pan-European data infrastructure initiative. EUDAT brings together a large consortium of 33 partners, including research communities, national data and high performance computing (HPC) centres, technology providers, and funding agencies from 14 countries. EUDAT aims to build a sustainable cross-disciplinary and cross-national data infrastructure that provides a set of shared services for accessing and preserving research data.
EUDAT develops solutions for data-coupled computing, including big data frameworks and workflow systems for initiating computing tasks on datasets located in the EUDAT infrastructure. The EUDAT B2STAGE library allows data to be staged to HPC computing environments, and it is being developed further to add support for the Hadoop and Spark big data systems.
Currently, EUDAT is working with more than 30 scientific communities and has built a suite of five integrated services to assist them in resolving their grand challenges. In the Life Science domain, EUDAT is currently working with research communities such as ELIXIR, BBMRI, ECRIN, DiXa, and VPH. Covering both access and deposit, from informal data sharing to long-term archiving, and addressing identification, discoverability and computability of both long-tail and big data, EUDAT services aim to address the full lifecycle of research data.
The current suite of EUDAT B2 services is:
The EUDAT project operates in a European landscape of developing or already existing data infrastructures. These research infrastructures have already developed solutions and tools for managing their data. The goal of EUDAT is not to replace these infrastructures, but to support and enrich them by providing a strong data infrastructure component and generic services on which they can rely to build up their data strategy. EUDAT’s vision is to enable European researchers and practitioners from any research discipline to preserve, find, access, and process data in a trusted environment, as part of a Collaborative Data Infrastructure (CDI) conceived as a network of collaborating, cooperating centres, combining the richness of numerous community-specific data repositories with the permanence and persistence of some of Europe’s largest scientific data centres. At the heart of the CDI is a network of distributed storage systems hosted at the major scientific data centres. Between them, these centres manage more than 100 PB of high-performance, online disk in support of European research, plus an even greater amount of near-line tape storage. EUDAT’s strength lies in the connections between these centres, the resilience resulting from the geographically distributed network, and its ability to store research data right alongside some of the most powerful supercomputers in Europe.
According to the CDI model, two categories of users can be established:
Together, EUDAT and OpenAIRE are driving international cooperation in tackling issues around large-scale data infrastructures through the recently formed international Research Data Alliance (RDA). The RDA is an international collaboration including participants from all around the world. In addition to EUDAT and OpenAIRE, the EC and NSF are directly represented in RDA. In Europe, the work of the RDA is supported by the iCORDI RDA Europe project (coordinator Hilary Hanahoe, Trust-IT, UK; CSC, Finland). The RDA aims to accelerate and facilitate research data sharing and exchange. The work of the RDA is primarily undertaken through its working groups. Participation in working groups, starting new working groups, and attendance at the twice-yearly plenary meetings are open to all.
EGI developed its ‘Open Data Commons’ vision inspired by the emerging open access policy in the European Research Area. The goal of open access is to ensure that research results are made available free of charge to end users and are reusable. Research results thus become a shared community resource (i.e., a commons). For this to happen, researchers need to change their own behaviours, and they need to be supported with services that simplify the sharing of research results, their discovery and reuse. In the EGI-Engage project (starting in March 2015) EGI will develop the concept of a federated open research data platform, an innovative solution that enables users to publish data, link to open access repositories, and integrate easily with processing capabilities (e.g. EGI Federated Cloud). Furthermore, the federated cloud infrastructure, including existing publicly funded institutional clouds and expanding to commercial clouds, will evolve to offer IaaS, PaaS and SaaS for specific communities, the long tail of research and the industrial/SME sector. In collaboration with other e-infrastructures, services will be tailored to meet the needs of the long tail of research, and their evolution will be driven by the requirements of the RIs on the ESFRI roadmap that participate in the EGI-Engage project through Competence Centres.
The three main objectives of OpenAIRE are to: