AI and Privacy: How Opt-Out Works in Generative Models and What Protections Exist for Users and Companies

Every interaction with a chatbot or a generative search feature involves the exchange of personal data and content that platforms may use to improve their models. Most generative artificial intelligence services currently adopt an opt-out model: data is used for training unless the user actively disables it. This applies to conversations, images, and published texts, with implications for user privacy, corporate security, and the digital information ecosystem.

Privacy risks in the era of generative models

IBM has classified the main risk categories associated with AI use: collection of large volumes of sensitive data, collection without fully informed consent, reuse for purposes other than those declared, bias linked to automated surveillance systems, data exfiltration through techniques such as prompt injection, and accidental leaks of confidential information. Reported real-world cases include medical photos that ended up in training datasets without explicit authorization from the individual concerned, and silent changes to sharing settings on professional platforms that exposed user data to AI model training.

The European GDPR establishes principles of privacy by design and by default, which limit personal data processing to what is strictly necessary, and it requires that consent, where it is the legal basis relied upon, be informed. In practice, however, major AI services often rely on other legal bases, such as the controller’s legitimate interest, which lets them use data for training by default and places the burden on the user to object afterwards. The result is a tension between the regulation and the way platforms actually operate.

Anthropic and Claude: training and chat retention

Starting September 28, 2025, conversations and coding sessions on Claude may be used to train AI models and stored for up to five years. The setting related to data use for training is enabled by default; users can disable it by going to Settings → Privacy → “Help improve Claude.” The change applies only to future data and does not retroactively affect already archived conversations. This mechanism presents three operational limits: the need for active user intervention, the absence of retroactive effects, and a multi‑year retention period.

Meta AI: objection to historical processing

Meta has announced its intention to use content published by adult users on Facebook and Instagram — posts, photos, comments, and captions — as well as interactions with Meta AI on WhatsApp, for the purpose of training its models. The Italian Data Protection Authority has reminded users that they may exercise their right to object under Art. 21 GDPR through separate forms for Facebook users, Instagram users, and non‑users. Data belonging to users under 18 is automatically excluded from training, except when content concerning them is published by adults.

ChatGPT: five categories of data to handle with caution

According to an analysis by the Wall Street Journal, reported by Milano Finanza, users should avoid entering at least five types of information into general‑purpose chatbots:

Personal identifying data: tax codes, identity documents, driver’s licenses, passports, birth dates, addresses, and phone numbers.
Medical results: exams, reports, and diagnoses, which do not benefit from the protections granted to health data processed in professional settings.
Financial accounts: bank account numbers, credit cards, and investment details.
Proprietary business information: trade secrets, customer data, non‑public source code, and internal strategies.
Access credentials: passwords, PINs, security questions, and one‑time codes.

Beyond these five categories, other sources note that it is wise to avoid sharing details that identify third parties without their consent, as such data also ends up in the platform’s stored histories.
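The categories above lend themselves to a simple pre-submission check. The following is a minimal sketch, not a complete safeguard: the patterns, the `redact` function, and the labels are illustrative assumptions, and they cover only three easily recognizable formats (email addresses, payment card numbers validated with the Luhn checksum, and the Italian tax code layout). Names, medical details, and trade secrets cannot be caught by patterns like these.

```python
import re

# Illustrative patterns only; real identifiers vary by country and format.
PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    # 13 to 19 digits, optionally separated by single spaces or hyphens.
    "CARD": re.compile(r"\b\d(?:[ -]?\d){12,18}\b"),
    # Layout of the Italian tax code (codice fiscale): 16 alphanumerics.
    "IT_TAX_CODE": re.compile(r"\b[A-Z]{6}\d{2}[A-Z]\d{2}[A-Z]\d{3}[A-Z]\b"),
}


def luhn_ok(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def redact(text: str) -> tuple[str, list[str]]:
    """Replace recognizable identifiers with placeholders before sending."""
    found: list[str] = []

    def card_sub(match: re.Match) -> str:
        digits = re.sub(r"\D", "", match.group())
        if luhn_ok(digits):  # skip digit runs that are not plausible cards
            found.append("CARD")
            return "[CARD]"
        return match.group()

    text = PATTERNS["CARD"].sub(card_sub, text)
    for label in ("EMAIL", "IT_TAX_CODE"):
        text, count = PATTERNS[label].subn(f"[{label}]", text)
        found.extend([label] * count)
    return text, found


prompt = "Mail mario.rossi@example.com, card 4111 1111 1111 1111, CF RSSMRA85T10A562S."
clean, hits = redact(prompt)
print(clean)  # Mail [EMAIL], card [CARD], CF [IT_TAX_CODE].
print(hits)   # ['CARD', 'EMAIL', 'IT_TAX_CODE']
```

A filter of this kind reduces accidental disclosure of well-structured identifiers; it does not replace the habit of leaving sensitive material out of the prompt in the first place.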

Conversations may be automatically analyzed to detect policy violations; in cases flagged for safety, they may be reviewed by internal staff or external providers for verification and system maintenance. European users can use OpenAI's dedicated privacy portal to request a full report of the processing carried out, exercising their rights of access and, within limits, deletion.

What to do today: practical controls for users and companies

For individual users

There are three levels of action:

Training settings: disable the use of conversations or public content for training on ChatGPT, Claude, and Meta's platforms, where the option is available.
History management: regularly delete chats and attachments, reducing the amount of data immediately accessible from the interface.
Dedicated plans: consider business or enterprise subscriptions that offer stricter policies on data retention and training use.

For companies and professionals

Companies should map the AI tools actually in use, including those not formally authorized (so-called shadow IT); verify the privacy and training settings of each service; draft internal policies prohibiting the entry of sensitive data into public chatbots; and prefer controlled environments or dedicated solutions for processing strategic data.
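The mapping and verification steps can be kept in a small, reviewable inventory. The sketch below is one possible shape for it, under stated assumptions: the `AITool` fields, the `review` function, the wording of the findings, and the tool names are all hypothetical and would need to be adapted to an organization's own policy.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AITool:
    name: str
    approved: bool                     # formally authorized by the organization?
    training_opt_out: Optional[bool]   # True/False once verified, None if unknown
    handles_sensitive_data: bool       # do employees enter sensitive data into it?


def review(tools: list[AITool]) -> dict[str, list[str]]:
    """Return, per tool, the open issues that the internal policy should address."""
    findings: dict[str, list[str]] = {}
    for tool in tools:
        issues = []
        if not tool.approved:
            issues.append("shadow IT: not formally authorized")
        if tool.training_opt_out is None:
            issues.append("training setting not verified")
        elif not tool.training_opt_out:
            issues.append("data may be used for training")
        if tool.handles_sensitive_data and tool.training_opt_out is not True:
            issues.append("sensitive data in an uncontrolled environment")
        if issues:
            findings[tool.name] = issues
    return findings


inventory = [
    AITool("chat-assistant-a", approved=True, training_opt_out=True, handles_sensitive_data=True),
    AITool("browser-plugin-b", approved=False, training_opt_out=None, handles_sensitive_data=True),
]
for name, issues in review(inventory).items():
    print(name, "->", "; ".join(issues))
```

Keeping "unknown" (`None`) distinct from "disabled" matters here: a setting nobody has checked is treated as a finding rather than silently assumed to be safe.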

The European AI Act introduces a risk‑based regulatory architecture, with transparency and documentation obligations for high‑impact systems; full implementation will unfold over the coming years. The GDPR grants data subjects rights of access, rectification, deletion, objection, and portability. Their effectiveness depends on active user engagement and regulatory oversight. In Italy, the 2023 decision by the Data Protection Authority on ChatGPT demonstrated the possibility of targeted regulatory interventions. The Digital Services Act (DSA) and antitrust rules (Art. 102 TFEU) also provide tools to assess the impact of generative search systems on competitive balance and information pluralism.

Google AI Overviews and traffic to editorial websites

With AI Overviews and AI Mode, Google inserts AI-generated summary answers directly into search results pages. According to estimates from Ahrefs and Pew Research, the presence of these summaries significantly reduces click-through rates to original sources. The Italian Federation of Newspaper Publishers (FIEG) has filed a complaint with AGCOM, which referred the matter to the European Commission under the DSA. Meanwhile, the European Publishers Council has filed an antitrust complaint. The Reuters Institute has also estimated that search engine referrals could drop by 43% over the next three years.

Publishers face an operational dilemma: current technical tools to block the use of content in AI Overviews (e.g., via robots.txt) often result in loss of visibility in traditional search results as well. In this scenario, users and publishers operate in a context where dominant platforms determine the conditions of access to and use of data and content.
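The dilemma can be made concrete with two robots.txt variants, evaluated here with Python's standard `urllib.robotparser`. This is a sketch under assumptions: `Google-Extended` and `GPTBot` are published crawler tokens, but the reading that `Google-Extended` governs model training rather than Search features such as AI Overviews reflects Google's crawler documentation as understood at the time of writing and should be verified before relying on it. The `can_crawl` helper and the example URL are illustrative.

```python
from urllib.robotparser import RobotFileParser

# Variant A: opt out of AI-training crawlers only.
TRAINING_ONLY = """\
User-agent: Google-Extended
Disallow: /

User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

# Variant B: block Google's main search crawler entirely.
BLOCK_GOOGLEBOT = """\
User-agent: Googlebot
Disallow: /

User-agent: *
Allow: /
"""


def can_crawl(robots_txt: str, agent: str, url: str = "https://example.com/article") -> bool:
    """Evaluate a robots.txt body for one user agent and one URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


for label, body in (("A", TRAINING_ONLY), ("B", BLOCK_GOOGLEBOT)):
    for agent in ("Googlebot", "Google-Extended", "GPTBot"):
        print(label, agent, can_crawl(body, agent))
```

In variant A, Googlebot can still crawl the page, so it stays in the search index from which Search features are served. In variant B, the page is withheld from Googlebot altogether, which removes it from traditional results too. Neither variant expresses "index me, but do not summarize me", which is the gap publishers are pointing to.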

Between innovation and safeguards: the speed of regulation

The gap between the pace of generative model releases and the application of regulations remains a key variable. Legal tools exist, but their effectiveness requires informed user engagement and constant oversight by authorities. For companies and professionals, risk mapping, employee training, and the choice of controlled environments represent the most concrete levers to maintain governance over their own and their customers’ data.

Sources: