A new report from Incogni evaluates the data privacy practices of today’s most widely used AI platforms. As generative AI and large language models (LLMs) become deeply embedded in everyday tools and services, the risk of unauthorized data collection and sharing has surged. Incogni’s researchers analyzed nine leading platforms against 11 criteria to determine which systems offer the most privacy-friendly experience. Their findings reveal significant gaps in transparency, data control, and user protection across the industry.
Why privacy in Gen AI is a growing concern

While Gen AI platforms offer clear productivity benefits, they often expose users to complex data privacy risks that are hard to detect. These risks stem from two sources: the data used to train the models and the personal information exposed during user interactions. Most platforms do not clearly communicate what data is collected, how it is used, or whether users can opt out.
With LLMs being deployed in products for content creation, search, code generation, and digital assistants, users frequently share sensitive information without realizing it may be retained or used to train future models. Incogni’s report addresses this gap by offering a standardized framework to score and rank AI platforms according to their privacy practices.
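Incogni does not publish the exact arithmetic behind its ranking, but a criteria-based score of this kind can be pictured as a weighted aggregation over per-criterion ratings. The Python sketch below is purely illustrative: the criterion names, weights, and numbers are hypothetical stand-ins, not figures from the report.

```python
# Illustrative sketch of a criteria-based privacy ranking. Each criterion is
# rated 0.0 (most privacy-friendly) to 1.0 (most invasive); all names and
# numbers here are hypothetical, not Incogni's actual data or methodology.
CRITERIA = ["prompts_used_for_training", "no_opt_out", "policy_opacity"]

platforms = {
    "Le Chat": {"prompts_used_for_training": 0.2, "no_opt_out": 0.0, "policy_opacity": 0.3},
    "ChatGPT": {"prompts_used_for_training": 0.4, "no_opt_out": 0.0, "policy_opacity": 0.3},
    "Meta AI": {"prompts_used_for_training": 0.9, "no_opt_out": 1.0, "policy_opacity": 0.8},
}

def invasiveness(ratings: dict) -> float:
    """Equal-weight average of the per-criterion ratings; lower is better."""
    return sum(ratings[c] for c in CRITERIA) / len(CRITERIA)

# Rank platforms from least to most invasive.
for name, ratings in sorted(platforms.items(), key=lambda kv: invasiveness(kv[1])):
    print(f"{name}: {invasiveness(ratings):.2f}")
```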
According to Incogni’s ranking, Le Chat (Mistral AI) is the least invasive AI platform in terms of data privacy. It collects relatively little data and performed well across most of the 11 measured criteria. ChatGPT (OpenAI) ranked second, followed by Grok (xAI). These platforms offer relatively clear privacy policies and give users a way to opt out of having their data used in model training.
At the bottom of the ranking are Meta AI, Gemini (Google), and Copilot (Microsoft). These platforms were found to be the most aggressive in data collection and the least transparent about their practices. DeepSeek also performed poorly, particularly for its lack of a model-training opt-out and its vague policy language.
Training data practices

The report delves into several key questions regarding how user data is used for model training.
Are prompts used to train the models?

Incogni found that some platforms explicitly allow users to opt out of having their prompts used for training: ChatGPT, Copilot, Le Chat, and Grok fall into this group. Others, such as Gemini, DeepSeek, Pi AI, and Meta AI, do not appear to provide a way to opt out. Claude (Anthropic) was the only platform that claims never to use user inputs for training.
Are prompts shared with external parties?

Most platforms share prompts with a defined set of third parties, including service providers, legal authorities, and affiliated companies. However, Microsoft and Meta allow sharing with advertisers or affiliates under broader terms. Anthropic and Meta also disclose sharing with research collaborators. These policies raise questions about the limits of data control once prompts leave the platform.
What kind of training data is used?

All platforms train their models on publicly accessible data. Many also use user feedback or prompts to improve performance. OpenAI, Meta, and Anthropic provided the most detailed explanations about training data sources, although even these were limited in scope. No platform offered a way for users to remove their personal data from existing training sets.
Transparency scores

Beyond the policies themselves, Incogni also evaluated how transparent platforms are about their data practices.
How clearly do platforms explain prompt usage?

OpenAI, Mistral, Anthropic, and xAI made it easy to determine how prompts are used for training. These platforms offered searchable support content or detailed FAQ sections. Meta and Microsoft, on the other hand, required users to search through unrelated documentation. DeepSeek, Pi AI, and Google’s Gemini offered the least clarity.
Can users find information about model training?

Platforms were grouped into three levels of transparency. OpenAI, Mistral, Anthropic, and xAI provided accessible documentation. Microsoft and Meta made this information somewhat difficult to find. Gemini, DeepSeek, and Inflection offered limited or fragmented disclosures, requiring users to parse multiple documents to get answers.
Are privacy policies readable?

Incogni used the Dale-Chall formula to assess readability. All policies required at least a college-level reading ability. Meta, Microsoft, and Google provided long and complex privacy documents that covered multiple products. Inflection and DeepSeek offered very short policies that lacked clarity and depth. OpenAI and xAI were noted for offering helpful support articles, though these must be maintained over time to remain accurate.
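For reference, the Dale-Chall score combines the share of “difficult” words (words missing from a list of roughly 3,000 familiar words) with average sentence length, and scores of 9.0 or higher correspond to college-level reading. A minimal Python sketch of the standard formula follows; the tiny FAMILIAR_WORDS set stands in for the full word list, so its output is only indicative.

```python
import re

# New Dale-Chall readability formula:
#   raw = 0.1579 * (% difficult words) + 0.0496 * (average sentence length)
#   if more than 5% of words are difficult, add 3.6365 (adjusted score).
# Scores of 9.0-9.9 map roughly to college-level reading ability.
# This tiny set is a placeholder for the full ~3,000-word familiar-word list.
FAMILIAR_WORDS = {"we", "may", "share", "your", "data", "with", "and", "the", "to", "of"}

def dale_chall(text: str) -> float:
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[a-z']+", text.lower())
    if not sentences or not words:
        return 0.0
    pct_difficult = 100 * sum(w not in FAMILIAR_WORDS for w in words) / len(words)
    score = 0.1579 * pct_difficult + 0.0496 * (len(words) / len(sentences))
    if pct_difficult > 5:
        score += 3.6365  # adjustment for text with many unfamiliar words
    return round(score, 2)

print(dale_chall("We may share your data with affiliates and service providers."))
```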
Data collection and sharing practices

The investigation also uncovered details about what specific data is collected and with whom it might be shared.
What data can be shared with third parties?

Meta and DeepSeek share personal information across corporate entities. Meta and Anthropic share information with research partners. In several cases, vague terms like “affiliates” were used, making it unclear who exactly receives user data. Microsoft’s policy also permits sharing with advertisers under specific conditions.
Where does user data come from?

Most platforms collect data during account setup or user interaction, but Incogni found evidence that some platforms also gather data from additional sources. Pi AI appears to use the fewest external sources, focusing mainly on direct input and public data, while Microsoft stated that it may also use data obtained from brokers.
Mobile app data collection and sharing

Incogni also examined how iOS and Android apps collect and share user data. Le Chat had the lowest privacy risk, followed by Pi AI and ChatGPT. Meta AI was the most aggressive, collecting data such as usernames, email addresses, and phone numbers, and sharing much of it with third parties.
Gemini and Meta AI collect exact user locations. Pi AI, Gemini, and DeepSeek collect phone numbers. Grok shares photos and app interaction data, while Claude shares app usage and email addresses.
Interestingly, Microsoft’s Copilot Android app claimed not to collect or share any user data. Because this was inconsistent with its iOS app disclosures, Incogni scored both apps based on the iOS version.
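That inconsistency handling amounts to a simple fallback rule. The sketch below is a hypothetical illustration of the rule as described, with made-up data types and a made-up effective_disclosure helper, not Incogni’s actual pipeline.

```python
# Hypothetical sketch of the fallback described above: when an app's Android
# label claims no collection but its iOS label lists data types, the iOS
# disclosure is treated as authoritative for both platforms.
def effective_disclosure(ios: set[str], android: set[str]) -> set[str]:
    if not android and ios:
        return ios  # empty Android label contradicts iOS, so fall back to iOS
    return ios | android

copilot_ios = {"email", "device_id", "usage_data"}  # illustrative data types
copilot_android: set[str] = set()                   # "collects nothing" claim
print(sorted(effective_disclosure(copilot_ios, copilot_android)))
```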
Privacy risks vary widely between Gen AI platforms. The best performers offered clear privacy policies, opt-out controls, and minimal data collection. The worst offenders lacked transparency and shared user data broadly without clear justification.
Incogni concludes that AI platforms must make privacy documentation easier to read, provide modular privacy policies for each product, and avoid relying on broad umbrella policies. Companies should also maintain up-to-date support resources that clearly answer data handling questions in plain language.