Tasks of DARA

Tasks: Overview


Introduction

One of the major tasks of the European Statistical System (ESS) is to provide EU institutions, Member States and the public with reliable information about society, the economy, the environment and development in the European Union. To this end, statistical data from the Member States must be collected, processed and analysed. Harmonising the questionnaires, the preparation process and the data delivery of 27 Member States is an elaborate procedure, but it is what makes comparable information between Member States and useful aggregated values for the entire EU achievable. As a consequence, these data are a precious and unique source for the international research community, and their full potential is not exhausted by the publication of standardised aggregated tables. Multivariate and more sophisticated analyses offer much further potential, for example to model the European integration process, differences between Member States or best practice examples.
At present, the Community authority may grant access to the following 12 surveys for research purposes:
European Community Household Panel, Labour Force Survey, Community Innovation Survey, Continuing Vocational Training Survey, Structure of Earnings Survey, European Union Statistics on Income and Living Conditions, Adult Education Survey, Farm Structure Survey, European Health Interview Survey, Community Statistics on Information Society – module 2 Individuals, households and information society, Household Budget Survey and Statistical returns in respect of the Carriage of Goods by Road.
Access to these confidential data can be provided either through sets of anonymised microdata or on the premises of Eurostat, in the safe centre in Luxembourg. As user needs change, demand is shifting towards original microdata without direct identifiers. The anonymised microdata are a valuable source, and their advantage is that they can be used in the researchers' own institutions. For many researchers, however, these data sets are not detailed enough for their analyses, and they prefer to access the original microdata without direct identifiers. At present, the only way for European researchers to do so is to travel to Luxembourg, which is a heavy burden because access is tied to a single location.
One way to guarantee decentralised access to Community Statistics for researchers is to provide access in the National Statistical Institutes (NSIs) of their Member States. The ESSnet project “Decentralised Access to EU Microdata Sets” has shown that it is feasible to access confidential data held at Eurostat through a remote desktop connection from the safe centres of the Member States. A central result of that project was the recommendation to implement a pilot study.
This is what the project “ESSnet on Decentralised and Remote Access to Confidential Data in the ESS” deals with: the implementation of remote access from safe centres in Member States to the community statistics in Eurostat.
The content of the project supports the goals of the five-year statistical programme of the EU, because one objective of the European Parliament and the Council is to “…promote increased usage of the Internet, not only for dissemination to end users but also for other parts of the statistical production process…”.
Another goal is to “…develop and implement policies and tools for harmonised confidentiality management in the ESS. In particular, harmonised means of optimal access for authorised researchers to anonymised micro data collected in order to produce Community statistics will be developed and implemented. The risk of disclosure will be adequately assessed, and technical means will be developed to facilitate access to and sharing of statistical data…” (cf. DECISION No 1578/2007/EC OF THE EUROPEAN PARLIAMENT AND OF THE COUNCIL of 11 December 2007 on the Community Statistical Programme 2008 to 2012).
Some Member States already have several years of experience with remote access to national data sets. These countries are good models for a remote access system and can accompany the implementation process.
The ESSnet project is planned in two parts. The first part is an administrative, organisational and preparatory phase. The second is an implementation phase, in which a safe centre is to be connected via remote access to the remote access system in Eurostat (VIP). The project is organised this way because the remote access platform in Eurostat will not be ready for access at the beginning of this project.
The first part is an essential preparation phase and will also bridge the time gap until the remote access system is up and running at Eurostat. For the second phase, this ESSnet project depends entirely on the infrastructure that connects the safe centres to the central node in Eurostat. If that infrastructure is not available at Eurostat after the preparation phase, the project cannot proceed with the planned action. One alternative could be to tender a separate project to solve this issue.
The action shall be completed within 24 months.

Tasks:

WP1: Documentation and Workflow

A practical implementation of a remote access system from Member States to Eurostat requires a modification or revision of the legislation. Legally, access to community statistics may currently be provided only on the premises of Eurostat and via anonymised microdata files. Remote access from a safe centre of a Member State to Eurostat is not yet foreseen. The working group on statistical confidentiality (WGSC) has therefore convened a taskforce to prepare the revision of Regulation (EC) No 831/2002, after which remote access options from safe centres could be implemented. To access European community statistics in a safe centre of an NSI via remote access, well thought-out documentation is essential, both for the employees of the NSIs and for the researchers, explaining how to get access to these data. A workflow also has to be developed that sets out which steps need to be taken and who needs to be involved to provide access. Based on the recommendations in the final report of the project “Decentralised Access to EU Microdata Sets”, the following tasks for the application process shall be considered in this WP:
The researcher applies for access at his local RDC. The local RDC then provides him with the standardised templates for access requests. Based on the completed access requests, the local RDC checks the admissibility of the institute against Eurostat’s list. It also makes a recommendation on the project proposal. In making this recommendation, the local RDC will follow the rules and considerations that Eurostat imposes. In an ideal system, the RDC would make the decision itself, based upon agreed standards. However, under current legislation, MS need to be consulted for agreement to the proposal. How the consultation process will be organised is part of the development of the workflow and will certainly depend on the revision of Regulation (EC) No 831/2002. One could think of either Eurostat or the local RDC taking a coordinating role. After all MS have agreed to the research proposal, the local RDC takes care of the contract being signed by the researcher. If necessary, it then sends the contracts to Eurostat to be signed there. The local RDC explains the use of the facility to the researcher and instructs him on disclosure control issues.
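The application process described above can be sketched as an ordered sequence of steps. The following Python fragment is only an illustration under stated assumptions: the step names, the class AccessRequest and the helper functions are hypothetical and are not part of any Eurostat or NSI system. The actual workflow, in particular the consultation of the MS, still has to be defined in this WP.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class Step(Enum):
    """Hypothetical names for the steps listed in the text, in order."""
    APPLIED = auto()                 # researcher applies at the local RDC
    TEMPLATES_PROVIDED = auto()      # RDC hands out the standardised templates
    ADMISSIBILITY_CHECKED = auto()   # institute checked against Eurostat's list
    RECOMMENDATION_MADE = auto()     # RDC recommends on the project proposal
    MS_CONSULTED = auto()            # all Member States have agreed
    CONTRACT_SIGNED = auto()         # contract signed (by Eurostat too, if needed)
    RESEARCHER_INSTRUCTED = auto()   # facility use and disclosure control explained


# Enum members iterate in definition order, which is the workflow order.
ORDER = list(Step)


@dataclass
class AccessRequest:
    researcher: str
    institute: str
    completed: list = field(default_factory=list)

    def advance(self, step: Step) -> None:
        """Record a step, refusing any step that is taken out of order."""
        if len(self.completed) == len(ORDER):
            raise ValueError("workflow already finished")
        expected = ORDER[len(self.completed)]
        if step is not expected:
            raise ValueError(f"expected {expected.name}, got {step.name}")
        self.completed.append(step)

    @property
    def finished(self) -> bool:
        return len(self.completed) == len(ORDER)


def is_admissible(institute: str, admissible_institutes: set) -> bool:
    """Assumed check: the institute must appear on the list kept by Eurostat."""
    return institute in admissible_institutes


def all_member_states_agree(answers: dict) -> bool:
    """Assumed rule: every consulted Member State must have answered yes."""
    return bool(answers) and all(answers.values())
```

A request would be advanced one step at a time, for example `request.advance(Step.APPLIED)`, and a step such as signing the contract before the MS have been consulted is rejected. Whether Eurostat or the local RDC coordinates the consultation does not change this ordering.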

WP2: Concept of technical implementation and safety requirement for remote access

Before remote access is technically implemented, it has to be ensured that the safety requirements are fulfilled. A safe environment for the data was already defined within the accreditation system for safe centres. The concept of technical implementation usually needs to be supported by the IT division and the data security officer. A detailed description of all settings for the remote access system is needed. From the national perspective, it has to be ensured, on the one hand, that no physical data can be taken out of the NSI. On the other hand, it has to be guaranteed that nobody other than the authorised researcher gets access to the community statistics, not even the employees of the NSIs. To tackle this issue adequately, a system of risk management could be helpful. This shall also be addressed in this WP.
For this WP it is advisable to get in touch with, and synchronise the work with, the VIP project in Eurostat and the European Group Register (EGR), which is also going to connect to the remote access platform in Eurostat, to ensure that the interfaces between possibly different systems will work. All feedback from the Eurostat RDC project will be taken into account in this WP, as it describes the technical environment in which the solutions are to be deployed. Conversely, all intermediate specifications of this WP will be sent to the EGR and VIP projects. To support good collaboration, at least one mid-term meeting (t+5) will be organised.

WP3: Cost benefit analysis

A cost benefit analysis gives an overview of the costs of the implementation and thus makes it possible to estimate what an up and running network will cost. Based on the experience of existing RDCs, it should be possible to calculate the costs of the hardware that allows access to microdata either on a national server or via remote access. The implementation of new ways of accessing community data will, however, surely lead to increasing demand, which in turn requires additional staff. An estimate of the expected costs should also be useful for NSIs that are aiming to implement an RDC.

WP4: Implementation of remote access – case study

In this WP the actual implementation of a remote access connection from a safe centre in an NSI to the remote access platform in Eurostat shall be carried out. Following the guidelines for data security in WP2, the connection from a thin client PC shall be set up. The objective of this WP is to implement a pilot under real conditions at an NSI. This covers the definition of the perimeter of the pilot: selecting the data, the authorisations, the users, the client configuration…
The end-point, the thin client PC, is the most vulnerable element of the connection channel. Because it is not under the control of the IT administrators or the data producers (in the case of decentralised access), its security level must be taken seriously.
For evaluation purposes, 10 thin client PCs will be designed and configured with high-level security constraints. The configuration will include the PC, the monitor, a bio smart card reader, a smart card (one per user), an OS licence and, if needed, licences for scientific software. The final configuration will depend on the WP2 results and on the final Eurostat RDC infrastructure. At least 2 thin clients will be needed per NSI (one for the researcher, one for the data manager). All thin client PCs will be configured in one place for easy deployment and for security reasons.

WP5: Communication and Dissemination to ESS

This work package is very important, especially because other projects and interest groups are also working on providing and expanding access to European datasets via remote access. The main objective of this WP is to participate in the relevant meetings and workshops in order to exchange knowledge and give feedback to the groups concerned.
In addition to communicating and disseminating the results to the non-participating NSIs in the ESS, it is also necessary to promote the results to the research community. Researchers need to know that there is now a convenient way of accessing community statistics. One conceivable measure is to organise a European research award for socially and economically relevant results based on European Statistics.

WP6: Management

The project team is a cooperation of several MS. Each MS is responsible for its own work package, in which other partners will also participate, depending on capability and resources. The group will be coordinated by Destatis. This includes the coordination and organisation of contacts and contracts between the partners.