OpenAI ordered to stop processing data in Italy over GDPR concerns

OpenAI, the maker of ChatGPT, has been ordered by Italy’s data protection authority, the Garante, to stop processing people’s data locally over concerns that it is breaching the European Union’s General Data Protection Regulation (GDPR). The regulator is investigating claims that OpenAI has unlawfully processed people’s data and lacks a system to prevent minors from accessing the technology. Because OpenAI has no legal entity established in the EU, any member state’s data protection authority is empowered to intervene if it sees risks to local users.

The GDPR applies whenever EU users’ personal data is processed, and OpenAI’s large language models clearly process this kind of information, producing biographies of named individuals in the region on demand. Although OpenAI declined to disclose details of the training data used for the latest iteration of the technology, GPT-4, it has acknowledged that earlier models were trained on data scraped from the internet, including forums such as Reddit. This potentially raises further GDPR concerns, since the regulation grants Europeans a suite of rights over their data, including the right to rectification of errors.

The Garante’s statement also highlights a data breach the service suffered earlier this month, when OpenAI admitted that a conversation-history feature had been leaking users’ chats and may have exposed some users’ payment information. Data breaches are another area the GDPR regulates, with a focus on ensuring that entities processing personal data adequately protect it. The pan-EU law also requires companies to notify relevant supervisory authorities of significant breaches within tight time periods.

Overarching all this is the big question of what legal basis OpenAI has relied on for processing Europeans’ data in the first place — in other words, the lawfulness of the processing. The GDPR allows for a number of possibilities, from consent to public interest, but the scale of processing involved in training these large language models complicates the question of legality. OpenAI does not appear to have informed the people whose data it has repurposed to train its commercial AIs. And if it has processed Europeans’ data unlawfully, DPAs across the bloc could order the data deleted, although whether that would force the company to retrain models built on unlawfully obtained data is an open question as an existing law grapples with cutting-edge technology.

The Italian DPA is also concerned about the risk of minors’ data being processed by OpenAI, since the company does not actively prevent people under the age of 13 from signing up to use the chatbot, for example by applying age-verification technology. Risks to children’s data are an area where the regulator has been very active: it recently ordered a similar ban on the virtual friendship AI chatbot Replika over child safety concerns, and in recent years it has pursued TikTok over underage usage, forcing the company to purge over half a million accounts it could not confirm did not belong to children.

OpenAI has 20 days to respond to the order, backed by the threat of penalties if it fails to comply. If the company cannot definitively confirm the age of the users it has signed up in Italy, it could be forced to delete their accounts and start again with a more robust sign-up process.