The EU plans to become the most attractive, secure and dynamic data-agile economy in the world. The Commission’s new digital strategy includes an ambition for the EU to seize new opportunities in digitised industry and business-to-business artificial intelligence (AI) applications. However, the vital question of whether GDPR is an obstacle to the EU’s plans to become an AI hub has been scrupulously avoided by the Commission.
The European Commission announced its new EU data strategy with the publication of two papers in February 2020: a white paper on AI and a communication setting out a “European strategy for data”. The Commission admits that “the availability of data is essential for training artificial intelligence systems … without data, there is no AI.” Yet, because GDPR restricts the uses of personal data, GDPR is already hampering the development of AI in the EU.
The Commission wants businesses to have “easy access to an almost infinite amount of high-quality industrial data.” To facilitate access to such datasets, it proposes creating European data spaces for sectors including agriculture, industry, and health.
The Commission presents the GDPR as an important EU achievement that will help trustworthy AI to flourish. However, the Commission avoids asking whether the GDPR is actually an obstacle to AI innovation.
GDPR represents the gold standard for data protection worldwide. There is little political appetite in Europe to roll back GDPR. Yet GDPR clearly does create friction with machine learning. For example, several of its core principles, including purpose limitation and data minimisation, restrict the creation of large datasets.
More flexible GDPR concepts exist which permit innovation, such as legitimate interest, public interest and scientific research. Yet these are vaguely defined and can also be interpreted differently by the member states’ various national data protection authorities.
How can the Commission reduce the friction between GDPR and AI innovation?
The first step is recognising that friction exists. Currently, EU policy makers are in denial about this reality. When challenged, they generally just repeat the mantra that GDPR supports sustainable AI innovation.
While GDPR has in-built flexibility, that flexibility is surrounded by legal uncertainty, which makes it hard to rely on in practice. Only by candidly admitting that GDPR creates barriers can the Commission begin to find ways to reduce those barriers while preserving the GDPR’s essential protections. The Commission could, for example, provide innovation-friendly interpretations of GDPR’s vaguer provisions. The Commission has done this in the past, in other areas where regulation was hampering development.
The Treaty on the Functioning of the European Union actually requires the Union and member states to create the conditions to encourage the competitiveness of EU industry, including by promoting innovation and technological development. GDPR should therefore be interpreted in light of these overarching objectives.
GDPR has at least three zones of uncertainty which the Commission could usefully clarify:
- The definition of personal data and how “special categories” can be used
There is still uncertainty regarding what constitutes personal data under GDPR, for example whether anonymised datasets count as personal data. This uncertainty means that machine learning innovators prefer to use public databases in the US to train their algorithms.
The Commission could draw up a roadmap for establishing large data spaces for training that include personal data. Large datasets of personal data may help to avoid bias and discrimination.
Currently, most facial recognition algorithms are trained on non-European datasets and tested for racial discrimination at a US agency, the National Institute of Standards and Technology (NIST). This is because GDPR is seen as discouraging such testing in Europe. The Commission could examine ways for GDPR to facilitate the creation of representative databases and testing mechanisms.
- What does processing for scientific research and statistical purposes include?
GDPR contains special exemptions on data use for scientific research and statistical purposes, but their definitions are ambiguous. Recital 159 GDPR gives a non-exhaustive list of scientific research activities, but some believe commercial research should be excluded.
The Commission should clarify how GDPR’s exemptions for scientific research and statistics can be used to facilitate innovation in machine learning.
- What are “appropriate safeguards” for scientific research and statistical studies?
GDPR highlights the importance of data processing for scientific research and statistical purposes, subject to “appropriate safeguards for the rights and freedoms of the data subject”.
GDPR allows member states to derogate when processing personal data for such purposes. For example, the French authorities have specified particular safeguards for processing health data for medical research. However, as these safeguards apply only in France, cross-border medical research is hampered.
The Commission needs to work urgently to ease the friction between the GDPR and AI innovation. The official line that no friction exists is an unhelpful denial of reality. Unless this reality is faced, GDPR will continue to hamper the EU’s ambitious plans to become a global hub for AI.