19 January 2024

EU Data Hub To Provide Access To Synthetic Supervisory Datasets As Early As 2024

Schoenherr Attorneys at Law


We are a full-service law firm with a footprint in Central and Eastern Europe providing local and international companies stellar advice. As the go-to legal advisor for complex commercial matters in the region, Schoenherr aims to use its proximity to industry leaders to develop practical solutions for future challenges. We keep a close eye on trends and developments, which enables us to provide high-quality legal advice that is straight to the point.

In September 2020, the European Commission adopted the Digital Finance Strategy to support innovation in the European financial sector and build a single market for digital financial services. Part of this effort is the EU Digital Finance Platform, a collaborative space that connects innovative financial firms with national supervisors and that features, among other things, the new Data Hub.[1]

Data Hub: In fall 2023, the Data Hub was added to the platform. This project, which will complement national innovation hubs and regulatory sandboxes, as well as private-sector initiatives, is certainly a novelty. For the first time, innovative firms will be able to access supervisory data for testing new applications or training artificial intelligence (AI) and machine learning (ML) models.

But given the EU's strict data privacy requirements, how can public sector data be shared with innovators? To ensure compliance with EU privacy requirements, the Data Hub will host synthetic data sets and thus rely on data synthetisation. But...

...what is data synthetisation? Synthetic data generation is a technique to create artificial ("new") data that closely resemble original data, but without exposing sensitive or confidential information. It serves as a substitute for actual data, allowing firms to experiment, test use cases, develop algorithms and perform analyses while keeping data safe and private. Synthetic data generation ensures full anonymisation while preserving the characteristics of the original data. Because of this, synthetic data and original data should deliver very similar results, which makes synthetic data highly relevant for testing.
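To make the idea concrete, here is a minimal sketch in Python. It is an illustration only and not the Data Hub's actual method, which the source does not describe. It assumes a hypothetical "original" data set with two numeric columns (the column meanings, the function names `fit_gaussian` and `sample_synthetic`, and all figures are invented for the example), fits a simple two-variable normal distribution to it and then draws entirely new records from that fitted model. Real-world synthesisers use far richer models and add dedicated privacy safeguards; a simple statistical fit like this one does not by itself guarantee anonymisation.

```python
import random
import statistics


def fit_gaussian(rows):
    """Estimate means, standard deviations and correlation of two numeric columns."""
    xs = [r[0] for r in rows]
    ys = [r[1] for r in rows]
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    std_x, std_y = statistics.stdev(xs), statistics.stdev(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in rows) / (len(rows) - 1)
    return mean_x, mean_y, std_x, std_y, cov / (std_x * std_y)


def sample_synthetic(params, n, seed=0):
    """Draw n new records from the fitted model; no original record is reused."""
    mean_x, mean_y, std_x, std_y, rho = params
    rng = random.Random(seed)
    residual = max(0.0, 1.0 - rho * rho) ** 0.5
    records = []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        x = mean_x + std_x * z1
        # Mixing z1 and z2 reproduces the correlation between the two columns.
        y = mean_y + std_y * (rho * z1 + residual * z2)
        records.append((x, y))
    return records


# Hypothetical stand-in for confidential records: (total assets, capital ratio).
_rng = random.Random(42)
original = []
for _ in range(1000):
    assets = _rng.gauss(100, 20)
    ratio = 12 + 0.05 * (assets - 100) + _rng.gauss(0, 1)
    original.append((assets, ratio))

params = fit_gaussian(original)
synthetic = sample_synthetic(params, 5000)
synthetic_params = fit_gaussian(synthetic)
```

Comparing `params` with `synthetic_params` shows the point made above: the synthetic records share the averages, spread and correlation of the original, yet none of them is a copy of an original record.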

For the Data Hub, this means that real data will never leave the authorities' premises and no external user will access actual data. National supervisors can thus participate in the project, while innovators gain access to meaningful information. We therefore expect synthetic data to gain traction in AI and ML, as it helps train algorithms that require vast amounts of training data, and real data in such volumes can be expensive to obtain or come with usage restrictions.

Outlook: The Directorate-General for Financial Stability, Financial Services and Capital Markets Union (DG FISMA), the European Commission directorate responsible for this project, is engaging in an intense dialogue with European supervisors to bring as many of them as possible into this initiative. Following a successful synthetic data pilot with the Bank of Spain, the first data sets are expected to become available as early as the beginning of 2024.[2] While the exact types of data that will be available are not yet public, you can expect them to be relevant, as the industry was consulted earlier this year on potential use cases and the types of datasets it would like to access for testing.

Footnotes

1. digital-finance-platform.ec.europa.eu.

2. digital-finance-platform.ec.europa.eu.

The content of this article is intended to provide a general guide to the subject matter. Specialist advice should be sought about your specific circumstances.
