Complete ACTRIS report on Processing available at: https://envriplus.manageprojects.com/projects/requirements/notebooks/470/pages/36/comments/389/attachments/611/download
Inputs
The input data are numeric values stored as matrices in NetCDF files. Each file is around 100 KB (per station and per component). Currently, the data rate is defined as:
30 stations, 3 measures, 100 KB per file, per week → 30 * 100 KB * 3 per week.
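The data rate above can be worked out with a few lines of Python. This is only a sketch: the helper name `weekly_volume_kb` is hypothetical, and it assumes that each station delivers one file per measure per week and that 1 MB = 1000 KB, neither of which is stated explicitly on this page.

```python
KB_PER_MB = 1000  # assumption: decimal units; the page does not say which convention applies


def weekly_volume_kb(stations: int, products: int, file_kb: float) -> float:
    """Total KB per week, assuming one file per station and per product each week."""
    return stations * products * file_kb


# Figures quoted above: 30 stations, 3 measures, 100 KB per file.
input_kb_per_week = weekly_volume_kb(stations=30, products=3, file_kb=100)
input_mb_per_week = input_kb_per_week / KB_PER_MB
input_mb_per_year = input_mb_per_week * 52  # assumption: 52 weeks per year

print(
    f"{input_kb_per_week:.0f} KB/week = {input_mb_per_week:.1f} MB/week, "
    f"about {input_mb_per_year:.0f} MB/year"
)
```

Under these assumptions the input volume is 9000 KB (9 MB) per week, or roughly 468 MB per year.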
The data are quite heterogeneous within the components (EUSAAR, EARLINET, CLOUDNET), and the data of each component also differ from those of the others. Most of the following refers to the ACTRIS aerosol vertical component.
Data are made available for the analytics phase over HTTP and/or HTTPS [examples in the active collab].
Analytics
Data processing will need 1050 processing runs per month, divided into two steps (which can be done in parallel).
The programming languages that ACTRIS uses are Python, C and 3Pascal. The hardware platform consists of 3 Linux servers, each with a 4-core processor and 16 GB of RAM. The software requirements are: Linux, open source software, a 3 Pascal compiler, and various libraries (e.g. NetCDF libraries).
ACTRIS plans to release all the software it provides under an open source license, so that everyone can use it and contribute to the processes and algorithms. However, a coordinator will be needed to review the contributions made by users.
ACTRIS recognised that it would be a good idea to keep a stable process/algorithm in a sandbox and to use it as a reference for comparing others.
ACTRIS mostly uses an interactive processing mode, and developers could use a monitor console.
ACTRIS does not use workflows and does not reuse sub-processes across processes.
Output
ACTRIS has 5 different topologies (each one capturing different aerosol optical properties); each one produces datasets of around 100 KB. ACTRIS also produces images. The data rate is: 30 stations * 100 KB * 5 topologies per week.
To make the analytics outcomes available, ACTRIS expects to use the website and protocols such as HTTP and HTTPS.
Statistical
The CLOUDNET and EUSAAR components collect data automatically (continuous monitoring). The EARLINET component, however, follows a measurement schedule (no automatic collection), which allows data collection to be configured with different hypotheses in mind; these can be refined later.
To analyse the responses, ACTRIS uses all available data, which can be continuous for the EUSAAR and CLOUDNET components and discrete for the EARLINET component. However, ACTRIS is trying to obtain continuous data for EARLINET too. The data are bounded regionally (European regions only). For profiling data, ACTRIS is limited in the vertical to the upper atmosphere and stratosphere. Currently, the EARLINET database holds 505 files, and its data volume is expected to grow to 12 GB/year.
ACTRIS is involved in the GAIA-CLIM project [1], whose aim is to improve the ability to use ground-based and sub-orbital observations to characterise satellite observations for a number of atmospheric Essential Climate Variables (ECVs). Within that project, ACTRIS is working on measurement errors using a data mining paradigm.
For the next year, ACTRIS is planning to set up data quality checks for the 3 components (e.g. to check whether there are any anomalies or strange values in the data, or anomalies in the atmosphere).
ACTRIS wants to work with different approaches and understand the differences between them, so that it can use the most suitable approach depending on the situation.
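As an illustration of what such a check for anomalies or strange values might look like, the sketch below flags missing values and outliers in a series of measurements. It is not an ACTRIS method: the function name `flag_anomalies`, the sample values, and the choice of a median-based (MAD) modified z-score with a 3.5 threshold are all assumptions made for the example.

```python
import math
import statistics


def flag_anomalies(values, threshold=3.5):
    """Return indices of values that are missing (NaN) or far from the median.

    Uses the modified z-score based on the median absolute deviation (MAD).
    The 3.5 threshold is a common rule of thumb, not an ACTRIS setting.
    """
    finite = [v for v in values if not math.isnan(v)]
    if not finite:
        # Nothing usable: every entry is flagged.
        return list(range(len(values)))
    median = statistics.median(finite)
    mad = statistics.median(abs(v - median) for v in finite)
    flagged = []
    for i, v in enumerate(values):
        if math.isnan(v):
            flagged.append(i)
        elif mad > 0 and 0.6745 * abs(v - median) / mad > threshold:
            flagged.append(i)
    return flagged


# Hypothetical values: index 3 is a spike and index 5 is missing.
sample = [0.12, 0.11, 0.13, 9.80, 0.12, float("nan"), 0.10]
print(flag_anomalies(sample))
```

For the hypothetical sample, the spike at index 3 and the missing value at index 5 are flagged. A median-based score is used here because a single large spike would inflate a mean and standard deviation and could hide itself; a real check would also need component-specific physical limits.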
| Go-between | Rosa Filgueira |
|---|---|
| RI representative | Lucia Mona and Markus Fiebig |
| Period of requirements collection | July to November 2015 |
| Status | Finished |