Meta will be able to use user data to train AI, a German ruling says

The reasons behind the German ruling recognising Meta’s right to use data extracted from content publicly shared by its customers have been published. Individual rights may give way to the legitimate interests of the company – by Andrea Monti – Initially published in Italian by La Repubblica-Italian Tech

The grounds for the ruling handed down by the Higher Regional Court of Cologne on 23 May 2025, which recognised Meta’s right to use public content made available by its customers to train AI models, have finally been made public. The court upheld Meta’s argument that “there is no reasonable alternative that Meta could pursue to achieve its interests as effectively by other, less intrusive means”.

The origin of the dispute

The ruling was issued following legal action brought by a German consumer association that complained that Meta’s decision violated customers’ right to personal data protection.

Specifically, the association accused Meta of failing to demonstrate that the use of its customers’ data to train AI was necessary and appropriate under the General Data Protection Regulation (GDPR), and argued that the activity was prohibited because it also involved the processing of “special” categories of personal data (for example, data relating to health) without being able to invoke any of the exceptions provided for in the GDPR.

Meta defended itself by arguing that it had a “legitimate interest” in using public content circulating on its platforms that was compatible with the GDPR and that it had taken a series of measures to reduce the risks to individuals’ rights to an acceptable level.

In particular, the judgment states that Meta claimed to have limited the use of data to content made public by its customers; to have provided for the possibility of changing the status of content from public to private, thereby excluding it from use; to have informed customers and given them an effective opportunity to object to the processing; to have de-identified information relating to individuals; to have “tokenised” it (i.e., reduced it to the numerical values the model needs to perform its calculations), thereby decoupling it from the personal identity of individuals; and to have adopted security measures throughout the model’s development cycle.
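To make the “tokenisation” step the judgment refers to more concrete, here is a minimal sketch in Python. It uses a toy whole-word vocabulary purely for illustration: Meta’s actual tokeniser and training pipeline are not public, so every function and rule below is a hypothetical simplification.

```python
# Minimal sketch of tokenisation: text is reduced to integer IDs before
# training. Toy whole-word vocabulary for illustration only; real systems
# use subword schemes (e.g. BPE), and Meta's actual pipeline is not public.

def build_vocab(corpus: list[str]) -> dict[str, int]:
    """Assign an arbitrary integer ID to every distinct word."""
    vocab: dict[str, int] = {}
    for text in corpus:
        for word in text.lower().split():
            vocab.setdefault(word, len(vocab))
    return vocab

def tokenise(text: str, vocab: dict[str, int]) -> list[int]:
    """Map a post to the sequence of numeric IDs the model actually sees."""
    return [vocab[w] for w in text.lower().split() if w in vocab]

posts = ["enjoying the sunset at the beach", "the beach was crowded today"]
vocab = build_vocab(posts)
print(tokenise("the beach at sunset", vocab))  # [1, 4, 3, 2]
```

The point relevant to the ruling is visible in the output: what the model is trained on is a stream of numbers such as [1, 4, 3, 2], with no account name, user ID or other direct identifier attached to it.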

In ruling in favour of Meta, the German court set out a series of principles that significantly redefine the widespread interpretation (including in Italy) of personal data protection legislation, principles that also apply outside the realm of AI.

The GDPR also protects economic interests and not just the rights of individuals

“In addition to legal and ideological interests, economic interests are also considered legitimate interests,” writes the Court, referring to a ruling by the Court of Justice of the European Union, which recognised the commercial interest of a sports federation in communicating the data of its athletes.

Furthermore, the ruling continues: “The anonymisation of such datasets is not feasible, as it is often difficult to obtain the consent of the data subjects with reasonable effort. This interpretation also reflects the ‘dual purpose’ of the GDPR, which is not only to protect personal data but also to ensure the free movement of such data and thus its usability.”

Therefore, even though it is the personal data protection regulation itself that states this, and there should have been no need to say so, the ruling specifies that the interests of companies have the same dignity as the rights of individuals. In other words, there is no “principle” that prevents the use of personal data in the context of economic activity. The important thing, the Court reiterates, is that this use is actually necessary and indispensable to achieve a legitimate result, even one not expressly provided for by law.

To understand the scope of this principle, one need only think of issues relating to the storage of internet traffic data and email metadata, those relating to the use of analytics, or even those arising from the “pay or okay” model — or rather “pay in cash or pay in data”. In light of this ruling, it is not true that these activities are unlawful as such, but the relationship between the “sacrifice” actually imposed on the customer and the objectives of the provider must be verified on a case-by-case basis. If the risks to the fundamental rights and freedoms of the individual are sufficiently limited in practice, a company cannot be prevented from processing the relevant personal data.

The risks to be considered are only those directly related to the functioning of the model

Another fundamental principle for the development of AI in the European Union is that, when assessing the consequences of personal data processing, only those relating to the training of the AI itself should be considered.

The judges write on this point: “Other possible infringements of the law that could result from the subsequent functioning of AI (such as disinformation, manipulation and other harmful practices) are not currently sufficiently foreseeable and can be prosecuted separately. In any case, it is unlikely that such risks would materialise to such an extent as to render the legitimate use of AI impossible and, ultimately, to call into question the adequacy of the data processing.”

The judges clearly state the principle that, in order to assess whether personal data can be used to train AI, only the direct consequences of using the data in question should be considered, not the fact that someone might use the model in the future to commit unlawful acts. In that case, the court notes, other existing rules apply because, the reasoning goes, the AI model is the tool with which the law is violated, not the author of the violation.

Total anonymisation is not necessary

Another point of contention between the parties was de-identification, carried out by deleting the data relating to individuals while retaining their photos.

Meta considered it sufficient to delete data such as full names, email addresses, telephone numbers, national identification numbers, user IDs, credit/debit card numbers, bank account numbers and bank codes, vehicle registration numbers, IP addresses and postal addresses, and to transform the remaining content into unstructured, “tokenised” form. On this point, the judgment states: “Although this does not exclude the possibility that, despite de-identification, identification may still occur in individual cases, the court considers that these measures will reduce the overall risk.”
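As an illustration of what such a de-identification pass might look like, here is a short Python sketch. The categories and regular expressions are hypothetical simplifications chosen for the example; the judgment does not disclose the actual rules Meta applied.

```python
import re

# Hypothetical de-identification pass: replace direct identifiers with
# neutral placeholders before further processing. Patterns are simplified
# for illustration and are not the rules actually used by Meta.
PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "phone": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
    "ipv4":  re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "iban":  re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{10,30}\b"),
}

def de_identify(text: str) -> str:
    """Strip emails, phone numbers, IPs and IBANs from a post."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

post = "Write to jane.doe@example.com or call +49 221 1234567."
print(de_identify(post))  # Write to [email] or call [phone].
```

As the court itself concedes, a pass of this kind reduces rather than eliminates the risk of re-identification: rare names, places or events left in the free text can still single a person out, which is why the ruling weighs these measures as risk reduction and not as full anonymisation.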

Training an AI is not processing targeted at a specific individual

Here too, it is worth quoting the judgment verbatim: “the development of Large Language Models based on very large datasets does not usually involve the targeted processing of personal data or the identification of a specific person”, and further: “the prerequisites for non-targeted processing are sufficiently met by the purpose of AI training, which is intended to create general models for calculating probabilities and not for profiling individuals”, as well as by the numerous protective measures adopted by the defendants.

This is a key passage in the judgment because it reiterates another aspect that has hardly ever been considered in the (Italian) application of the GDPR: the regulation applies to processing that identifies, or makes identifiable, a specific person, not categories or groups. Therefore, given that the tokenisation of the content of posts shared on Meta’s social networks was preceded by sufficient de-identification of individuals, the processing of the data thus obtained does not violate the law.

Again, the practical consequences of this legal principle go beyond the scope of AI because, for example, they refute the argument that all profiling carried out using trackers, IP numbers or other tools that identify devices or software, rather than the people using them, is systematically in breach of the law.

A message for the European Commission and national data protection authorities

As has been said several times, this case has a more general significance that transcends the Meta issue, because it concerns the relationship between the ideological assumptions of regulation and the industrial impact of technological development.

It is quite clear that over the course of almost ten years, the GDPR has been unilaterally interpreted to the detriment of the legitimate interests of innovators, in the name of a fetishisation of “privacy” (a term that is not even mentioned in the European regulation).

National data protection authorities have therefore adopted soft-law measures and provisions that did not take into due consideration what the regulation already provided for when it was enacted: as long as one remains within the scope of the law, there are no absolute prohibitions on the processing of personal data, only a balancing of interests, and that balancing must be verified on a case-by-case basis.

The GDPR is certainly not perfect and would need to be rebuilt from the ground up, but this ruling shows that it can be interpreted reasonably, taking into account the rules protecting research and business.

To be clear, this is not about giving Big Tech, or business in general, a free hand and sacrificing individuals on the altar of profit; but neither can we do the opposite in the name of a never-clarified ambiguity about the role that information technology can and must play in transforming our society.

This is the point that the European Commission should consider when adopting the implementing acts of the AI Regulation and when identifying the amendments to the GDPR that are finally beginning to be discussed.
