The rapid rise of generative AI technologies over the past few years has revolutionized how individuals and organizations create, automate, and interact with digital content. Yet this surge in adoption comes at a moment of elevated concern about how personal and sensitive data is managed within these rapidly evolving platforms. Against this backdrop, Incogni’s 2025 evaluation of data privacy practices among the world’s leading AI platforms is both timely and essential, shining a spotlight on tangible differences in transparency, user control, and the risks of unchecked data collection.
Why Data Privacy is the Central Issue in Generative AI Today
Generative AI platforms now power everything from workplace productivity tools to consumer-facing digital assistants. While users benefit from unprecedented efficiencies—such as drafting content, summarizing documents, or automating complex tasks—each interaction generates a rich stream of data. Two primary privacy risks have emerged: the information used to train these AI systems, and the data users expose when engaging with them.

Despite regulatory advances like the European Union’s GDPR and California’s CCPA, many AI providers underdeliver on clarity and user empowerment. The result is a landscape where users, often unwittingly, surrender not just individual prompts, but potentially sensitive business communications, personal identifiers, and behavioral patterns, all of which can be stored, shared, or even used to train future iterations of AI models. This opacity is heightened by the fact that many privacy policies are lengthy, technically dense, and fail to offer meaningful choices—fueling anxieties about consent and control.
The Incogni Framework: Assessing AI Platform Data Practices
To surface reliable insights and draw actionable comparisons, Incogni’s researchers developed an 11-criteria framework covering aspects such as data collection and sharing, opt-out mechanisms, clarity of data use disclosures, mobile app privacy, and the readability of documentation. Nine leading generative AI platforms were scored, revealing a spectrum of privacy invasiveness and policy transparency. (A hypothetical sketch of how such criterion scores might roll up into an overall ranking follows the list below.)

Key Criteria Examined
- User Data Collection: What information is gathered during account setup and interaction?
- Prompt Usage: Are user prompts used for model improvements or future training?
- Third-Party Sharing: Who outside the platform may receive access to user data?
- Opt-Out and Control Mechanisms: Can users prevent their inputs from being used in training?
- Transparency and Readability: How easy is it for users to understand and navigate data policies?
- Mobile App Data Practices: How do iOS and Android versions compare in data collection and disclosure?
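Incogni has not published exact weights for its 11 criteria, so the snippet below is purely a hypothetical illustration of how per-criterion scores (normalized here so that 0 is most private and 1 is most invasive) could be averaged into a single comparable ranking. The criterion names and platform scores are invented for the example and do not reflect the report’s actual methodology.

```python
from statistics import mean

# Invented criterion names, loosely mirroring the categories Incogni describes.
CRITERIA = [
    "user_data_collection", "prompt_usage", "third_party_sharing",
    "opt_out_controls", "transparency_readability", "mobile_app_practices",
]

def overall_invasiveness(scores: dict[str, float]) -> float:
    """Unweighted average of per-criterion scores; lower means more privacy-friendly."""
    return mean(scores[c] for c in CRITERIA)

# Made-up scores for two hypothetical platforms.
platforms = {
    "Platform A": {c: 0.2 for c in CRITERIA},  # minimal collection, clear opt-outs
    "Platform B": {c: 0.8 for c in CRITERIA},  # broad collection, opaque policies
}
for name, scores in sorted(platforms.items(), key=lambda kv: overall_invasiveness(kv[1])):
    print(f"{name}: {overall_invasiveness(scores):.2f}")
```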
The Most and Least Privacy-Invasive AI Platforms
Incogni’s headline findings make clear that not all AI platforms approach data privacy with equal rigor or restraint.

Leaders in Privacy Protection
Le Chat (Mistral AI) was crowned the least invasive, minimizing data collection and performing strongly across nearly all privacy dimensions. OpenAI’s ChatGPT and xAI’s Grok came in next, each offering straightforward privacy policies, a degree of transparency, and options to exclude user data from model training.
- ChatGPT (OpenAI): Users can opt out of having inputs used to train future models, and its support materials and FAQ sections offer clear explanations regarding data flows.
- Grok (xAI): Provides similar opt-out opportunities, approachable documentation, and dedicated privacy resources.
The Most Invasive Platforms
At the opposite end, Meta AI, Google’s Gemini, and Microsoft Copilot emerged as the most aggressive in both data collection and the opacity of their disclosures.
- Meta AI: Shares personal data broadly across its ecosystem (including with advertisers and “affiliates”) and offers limited transparency regarding prompt use and third-party data access.
- Gemini (Google): Fails to provide clear opt-out mechanisms and collects expansive user data on both mobile and desktop platforms.
- Copilot (Microsoft): Matches Meta and Gemini in terms of data accumulation, and—despite some enterprise controls—its privacy documentation is long, difficult to parse, and sometimes inconsistent across app versions.
Are User Prompts Used for AI Training?
This is central to the privacy debate. According to Incogni’s assessment:
- Opt-Out Available: ChatGPT, Copilot, Le Chat, and Grok permit users to opt out.
- No Clear Option: Gemini, DeepSeek, Pi AI, and Meta AI lack straightforward ways to prevent user interactions from contributing to future models.
- No Opt-Out Needed: Claude never uses prompts for training.
Prompt Sharing and Third-Party Exposure
Most major platforms acknowledge sharing prompts and associated metadata with third parties, including service providers, law enforcement, and sometimes corporate affiliates. But crucial nuances, such as the scope of “affiliates” or the precise categories of data shared with advertising partners, remain murky.
- Microsoft and Meta: Explicitly state that user data may be shared with advertisers under certain terms, a practice not mirrored by the higher-ranking privacy platforms.
- Anthropic and Meta: Go a step further, disclosing sharing arrangements with external research collaborators.
What Data Actually Trains AI Models?
Incogni’s investigation confirms that all leading generative AI platforms rely on a mix of publicly accessible content for foundational training. Many supplement these with user interaction data or feedback to fine-tune performance.
- Transparency Standouts: OpenAI, Meta, and Anthropic provide somewhat detailed breakdowns (albeit within the limits of corporate confidentiality) regarding data sources and model refinement.
- Enduring Blind Spots: No evaluated platform allows users to retroactively remove personal data from models already trained—including data obtained prior to a user changing privacy preferences.
Transparency Scores: Can Users Actually Understand What’s Happening?
The issue isn’t just what data is collected, but whether users can easily discern the answers.
- Best Performers: OpenAI, Mistral (Le Chat), Anthropic, and xAI make it relatively straightforward for users to locate information on data collection, model training, and user controls. Their privacy centers and FAQ resources are both searchable and written in plain(er) language.
- Lowest Transparency: DeepSeek, Pi AI, and Gemini require users to synthesize information from disparate or unrelated documents, significantly raising the barrier to effective privacy management.
- Middle of the Pack: Microsoft and Meta scatter key information throughout complex, multi-product privacy statements.
Are Privacy Policies Actually Readable?
Incogni applied the Dale-Chall readability formula to evaluate documentation for each platform, and the verdict is sobering: all privacy policies require at least college-level reading proficiency. (A brief sketch of the formula follows the list below.)
- Meta, Microsoft, and Google: Feature long, multi-product policies that, while technically comprehensive, border on the inscrutable for non-specialists.
- Inflection and DeepSeek: Offer succinct but overly simplistic policies, lacking sufficient explanatory depth or disclosure granularity.
- OpenAI and xAI: Strike a relatively better balance with maintainable, article-based support that demystifies key data usage concerns—but only for users willing to seek out these resources.
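For context, the new Dale-Chall formula combines the percentage of “difficult” words (words not on a list of roughly 3,000 familiar words) with average sentence length; adjusted scores of about 9.0 and above correspond to college-level text. The Python snippet below is a minimal sketch of that published formula, not Incogni’s actual tooling: the familiar-word list is omitted, so the difficult-word count is supplied by the caller, and the example figures are invented purely for illustration.

```python
def dale_chall_score(total_words: int, total_sentences: int, difficult_words: int) -> float:
    """Adjusted (new) Dale-Chall readability score.

    Roughly: 9.0+ corresponds to college level, 10.0+ to college-graduate level.
    """
    pct_difficult = 100.0 * difficult_words / total_words   # share of unfamiliar words
    avg_sentence_len = total_words / total_sentences        # words per sentence
    score = 0.1579 * pct_difficult + 0.0496 * avg_sentence_len
    # The published formula adds a constant when more than 5% of words
    # fall outside the familiar-word list.
    if pct_difficult > 5.0:
        score += 3.6365
    return score


# Invented example: a 2,000-word policy with 70 sentences and 600 unfamiliar
# words scores about 9.8, i.e. college-level reading difficulty.
print(round(dale_chall_score(2000, 70, 600), 2))  # -> 9.79
```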
Mobile App Data Collection: A Further Privacy Minefield
Privacy risks multiply on mobile, where app permissions and platform rules further complicate data governance.
- Le Chat: Scored the lowest (best) on privacy risk, collecting minimal data and offering clear disclosures.
- Meta AI: Collected the most invasive array of data, including usernames, location, contact information, and extensive sharing with unnamed third parties.
- Gemini and Meta AI: Notably collect precise user location.
- Copilot (Microsoft): Showed a concerning inconsistency—the Android app claimed no data collection, while the iOS version admitted to far broader data access. Incogni’s analysis defaulted to the less private (iOS) version in scoring since users cannot rely on device-specific disclosures.
Deep Dive: What Data Is Shared and With Whom?
The risks of broad data sharing are hardly theoretical. Incogni found:
- Meta and DeepSeek: Share personal details across a sprawling set of internal and affiliate entities.
- Meta and Anthropic: Allow research partners to access some user data, with unclear limits.
- Vague Terms: Several providers use ill-defined language like “affiliates” or “partners” rather than specifying concrete recipients—a common but problematic tactic in the industry.

Beyond user prompts, the data collected and shared can originate from multiple sources:
- Account registration and usage logs
- Linked services and app integrations
- External partners (e.g., marketing, security, or even financial firms)
- Commercial datasets (in cases like Anthropic)
- App-store-derived device and location data for mobile products
The Glaring Gaps: Opt-Out and Data Redress
Few platforms offer robust, accessible controls for excluding data from ongoing training, and none allow individuals to scrub existing personal data from established models. This lack of retroactive privacy control is among the most significant shortfalls in the current landscape, with potentially long-term implications for both user trust and future regulatory compliance.

Industry Recommendations and Critical Analysis
Incogni’s research doesn’t just critique the state of AI privacy—it delivers pointed recommendations:
- Clearer Documentation: Modular privacy policies for each AI product, not sprawling cross-product documents.
- Enhanced Readability: Documentation should be clear, concise, and written in language accessible to ordinary users—not just legal experts or technologists.
- Granular Controls: Users must be empowered not only to opt out of future data use but, ideally, to remove their information retroactively—something no major platform yet offers.
- Continuous Updates: As product features shift, so too should privacy disclosures and user-facing resources.
Strengths Across Platforms
- Le Chat, ChatGPT, Grok: Demonstrate that it is possible to balance powerful generative capabilities with user-centric privacy, at least compared to industry standards. Their willingness to provide opt-outs and clearer support documentation should be emulated elsewhere.
- Claude (Anthropic): Sets a high bar by excluding user prompts from training by default.
Systemic Weaknesses and Risks
- Opaque Data Sharing: The routine use of broad, undefined third-party categories—like “affiliates”—erodes user trust and increases the real risk of data misuse.
- Lack of Data Removal Options: The inability to delete personal data from previously trained models poses unique privacy, compliance, and ethical challenges that no leading provider currently solves.
- Complex and Inaccessible Policies: High reading levels and fragmented policy structures block meaningful understanding, undermining informed consent.
- Mobile Data Minefields: Inconsistencies, particularly between iOS and Android disclosures, demonstrate an urgent need for unified, conservative data stewardship approaches.
The Broader Industry Context: Compliance, Regulation, and User Empowerment
Incogni’s report lands as regulatory and public scrutiny continue to escalate. Newer mandates—such as Europe’s Digital Markets Act and the EU AI Act, whose obligations are phasing in—further raise the stakes for transparency, portability, and explicit user consent. Platform providers must therefore brace for increasingly granular audits while consumers, especially in regulated industries, push for AI tools that offer real-time tracking and demonstrable privacy compliance.

Cloud and AI security leaders, like Skyhigh Security and WitnessAI, are supplementing native platform features with bespoke data protection overlays, forensic activity logging, and policy-based controls for enterprise deployments. While some industry commentators argue that robust platform-native controls may suffice for well-configured, risk-aware organizations, the rapid evolution of AI—and the sheer heterogeneity of use cases—suggests a multi-layered approach will likely become the new standard.
Conclusion: A Call for Radical Transparency and User Rights
Incogni’s assessment exposes a world in which powerful AI tools are already deeply embedded in daily life—but where basic questions of transparency, consent, and control remain far from resolved. Best-in-class platforms show that user-focused privacy is not only achievable but commercially viable; the laggards demonstrate the long tail of risk when these priorities are neglected.

For the millions engaging with generative AI—whether as consumers, knowledge workers, or enterprise leaders—the message is clear: vigilance and skepticism remain essential. Until every platform adopts modular, readable, and actionable privacy documentation, and until opt-out and data removal are universal rights, the privacy minefield will only grow more complex.
Practically, all users and organizations should:
- Routinely review the privacy and data sharing policies of any AI platform they use
- Take full advantage of opt-out mechanisms where available, and press for greater control where they are lacking
- Seek out platforms where transparency and user protection are prioritized, using independent audits and reports as a benchmark, not just marketing assurances
- Engage in proactive dialogue with platform providers to push for ongoing improvements in user rights and data privacy
Source: Dataconomy, “How AI platforms rank on data privacy in 2025”