The revelation that leading AI models, including those developed outside China, reflect Chinese state narratives and censorship priorities has sparked renewed debate about the influence of training data and the ethical responsibilities of tech giants. A new report from the American Security Project (ASP), a think tank known for its bipartisan work on US national security and technology policy, asserts that several top artificial intelligence platforms parrot Chinese Communist Party (CCP) propaganda and suppress viewpoints on topics the Chinese government considers sensitive. With generative AI rapidly becoming intertwined with information access and public discourse, these findings raise critical questions about the global flow of ideas, the technical limitations of large language models (LLMs), and the practical measures necessary to protect informational integrity in AI systems.
Examining the Findings: Evidence of Systemic Bias
ASP’s research scrutinized five widely-used LLM chatbots: OpenAI’s ChatGPT, Microsoft’s Copilot, Google’s Gemini, DeepSeek’s DeepSeek-R1 (originating from China), and X’s Grok. While only one—DeepSeek—was Chinese in origin, the investigation revealed that all five sometimes echoed CCP perspectives, regardless of their country of development. The study methodology involved prompting each system, in both English and Simplified Chinese, on topics such as the Tiananmen Square massacre, ethnic unrest, and official state narratives—areas frequently censored inside China.

The responses were telling. On the subject of June 4, 1989—the date of the brutal Tiananmen Square crackdown—the majority of the models employed various degrees of euphemism or passive phrasing. For instance, only X’s Grok explicitly described the events as a “killing of unarmed civilians” by the military, while ChatGPT, when prompted in Chinese, used the term “massacre.” In contrast, DeepSeek and Copilot referred to “The June 4th Incident,” aligning closely with Beijing’s preferred terminology. Across the board, phrases like “suppression of protests” or “crackdown” replaced more direct descriptors, a subtle but significant linguistic accommodation of official Chinese narratives.
A notable nuance emerged: while US-based Grok was the most forthright in critiquing the Chinese state, Microsoft’s Copilot tended to reinforce official CCP talking points, sometimes elevating them to the status of “true information”—an outcome that the report’s authors argue carries serious implications for the model’s credibility and global influence.
Methodological Rigor or the Limits of Current Metrics?
Quantifying AI bias is complicated. ASP pinned its “popularity” rankings to user estimates drawn from multiple sources—800 million for ChatGPT, 350 million for Gemini, 96 million for DeepSeek, 33 million for Copilot, and 25 million for Grok—though the lack of transparent, auditable metrics for LLM usage warrants caution. The response testing utilized VPNs and private sessions from three US cities to reduce the risk of regional bias or geo-fencing influencing results.

A critical element in the study’s rigor is the use of minimal prompts, intentionally eschewing detailed user instructions, to gauge the models’ default response profiles. According to Courtney Manning, the report’s principal author, this approach mirrors real-world user behavior and helps isolate underlying systemic biases from the influences of prompt engineering.
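ASP has not published its test harness, and the study probed consumer chatbot interfaces over VPN sessions rather than developer APIs. Still, the minimal-prompt idea is easy to picture in code. The sketch below assumes the OpenAI Python SDK with an API key configured; the prompts and model name are placeholders, not the wording ASP actually used.

```python
# Illustrative sketch only: minimal-prompt, bilingual probing of one model.
# Assumes the OpenAI Python SDK (openai>=1.0) and OPENAI_API_KEY in the
# environment; prompts and model name are placeholders, not ASP's protocol.
from openai import OpenAI

client = OpenAI()

PROBES = {
    "en": "What happened on June 4, 1989?",   # placeholder English probe
    "zh-Hans": "1989年6月4日发生了什么？",      # placeholder Simplified Chinese probe
}

def probe(model: str) -> dict[str, str]:
    """Send each bare question with no system prompt or added context,
    so whatever comes back reflects the model's default framing."""
    results = {}
    for lang, question in PROBES.items():
        response = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": question}],
            temperature=0,  # reduce run-to-run variation when comparing answers
        )
        results[lang] = response.choices[0].message.content
    return results

if __name__ == "__main__":
    for lang, answer in probe("gpt-4o").items():  # placeholder model name
        print(f"--- {lang} ---\n{answer}\n")
```

Keeping the prompt to a single bare question is the point: any system prompt or added framing would measure the researcher's prompt engineering rather than the model's defaults.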
Deeper Roots: How AI Training Data Absorbs National Perspectives
One of the report’s salient arguments is that the vast corpora used to train AI models are riddled with regionally dominant narratives—among them, Chinese state propaganda. This is especially stark when the LLMs are exposed to Chinese-language prompts, as the training data frequently includes verbatim or near-verbatim “officialese” from CCP sources. Manning highlights the mirrored use of characters and phrasing between official Chinese communiques and the responses generated by certain English-language models, further evidence that the information landscape shaping AI is highly permeable to foreign narratives.

Such patterns are likely a byproduct of the opaque and often indiscriminate scraping strategies employed in the construction of foundational AI datasets. Experts note that the sheer scale of web-crawling and the relative uniformity of prominent voices in censored environments mean that authoritative Chinese material—state-run news, official pronouncements, legally-mandated search engine results—appears disproportionately in these massive training sets.
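The report describes this character-level mirroring qualitatively. One rough way a researcher might quantify it, sketched below under the assumption that suitable reference texts are available, is to measure how many character n-grams of a model's answer also appear verbatim in an official communique; the sample strings here are placeholders, not ASP data.

```python
# Rough character n-gram overlap check (an illustration, not ASP's method).
# Character n-grams are used because Chinese text is not whitespace-tokenized.
def char_ngrams(text: str, n: int = 4) -> set[str]:
    """All contiguous character substrings of length n."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}

def overlap_ratio(model_output: str, official_text: str, n: int = 4) -> float:
    """Fraction of the model output's character n-grams that also occur
    verbatim in the reference official text."""
    out_grams = char_ngrams(model_output, n)
    if not out_grams:
        return 0.0
    return len(out_grams & char_ngrams(official_text, n)) / len(out_grams)

# Placeholder strings; in practice these would be a chatbot response and a
# published state communique on the same topic.
print(overlap_ratio("官方将其称为一起事件，秩序已经恢复", "这是一起事件，秩序已经恢复，社会保持稳定"))
```

A high ratio does not prove provenance on its own, but repeated verbatim phrasing across many prompts is the kind of signal the report points to.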
The Dilemma of “Truth” in AI Systems
The ASP report underscores a philosophical and pragmatic limitation: LLMs do not, and arguably cannot, understand “truth” in the way humans do. Instead, as Manning reiterates, these models assemble responses based on the statistical probability of word sequences in their training data—a brute-force mimicry of language rather than a reasoned synthesis of fact.

This has profound ramifications. When a model is as likely to cite propaganda or misinformation as it is to reference a carefully researched academic source, the distinction between disinformation and consensus reality blurs. The problem intensifies in areas where international consensus is absent or actively contested—be it political, scientific, or cultural issues—and where local censorship skews the available information. Since LLMs cannot independently verify which narratives are globally agreed upon as factual, they become vectors for whatever data they “see” most often.
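The point about frequency standing in for truth can be made concrete with a toy example. The snippet below is not how modern LLMs work internally (they use neural next-token prediction, not bigram counts), but it shows the same underlying mechanism: the most common continuation in the corpus wins, regardless of its accuracy.

```python
# Toy illustration: a purely frequency-based model has no notion of truth,
# only of which continuation it has seen most often. Real LLMs are neural
# next-token predictors, not bigram counters, but the principle carries over.
from collections import Counter, defaultdict

corpus = [
    # Imagine a scraped corpus in which one framing dominates by sheer volume.
    "the june fourth incident",
    "the june fourth incident",
    "the june fourth incident",
    "the june fourth massacre",
]

bigram_counts = defaultdict(Counter)
for sentence in corpus:
    tokens = sentence.split()
    for prev, nxt in zip(tokens, tokens[1:]):
        bigram_counts[prev][nxt] += 1

# Given the prefix "... june fourth", the most frequent continuation wins,
# and here that is the euphemistic framing.
print(bigram_counts["fourth"].most_common(1))  # [('incident', 3)]
```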
As a result, the report warns, AI-driven censorship or unwitting parroting of authoritarian viewpoints is only a part of the risk landscape. Just as dangerous is the elevation of state-sponsored disinformation to the same epistemological status as verifiable fact.
Are Industry Responses Adequate?
The AI industry has long struggled with questions of bias, fairness, and neutrality. Leading laboratories have implemented post-training “alignment” steps, where human feedback guides the model away from overtly problematic responses. However, as the ASP report and independent academic experts have observed, alignment is an imperfect solution—it is reactive, not preventive, and can never fully excise the traces of problematic data already present in the underlying model.

Microsoft’s Copilot, in particular, is called out in the ASP study as a frequent amplifier of CCP-aligned talking points, more so than its competitors. While Microsoft and other vendors often tout their internal review processes and collaborative work with external ethics boards, recent events suggest that the pace and scope of intervention may be insufficient to address fast-evolving forms of algorithmic bias.
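The “reactive, not preventive” criticism is easiest to see with a deliberately simplified example. Real alignment relies on human feedback and reward models rather than phrase blocklists, but the hypothetical output filter sketched below illustrates the general weakness: fixes applied after training leave the absorbed framing intact, so paraphrases of it pass straight through.

```python
# Deliberately simplified: a hypothetical phrase blocklist applied to outputs.
# Real alignment uses human feedback and reward models, not blocklists; this
# only illustrates why output-side fixes are reactive. The data the model
# learned from is untouched, so paraphrased framings are not caught.
BLOCKED_PHRASES = {"june 4th incident"}  # hypothetical entry

def passes_filter(text: str) -> bool:
    """Return True if no blocklisted phrase appears in the output."""
    lowered = text.lower()
    return not any(phrase in lowered for phrase in BLOCKED_PHRASES)

print(passes_filter("Officials call it the June 4th Incident."))         # False: exact phrase caught
print(passes_filter("Officials describe it as an incident of unrest."))  # True: same framing slips through
```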
One systemic issue is the lack of transparency around both training data composition and the specifics of content moderation algorithms. With the notable exception of entirely open-source models—wherein researchers can at least inspect the constituent datasets—most commercial LLMs operate as black boxes. This opacity is exacerbated by a lack of regulatory standards for documenting or mitigating the influence of state-sponsored disinformation in training material.
Implications for Global Public Opinion and Information Security
The prevalence of Chinese state messaging in Western-deployed, non-Chinese AI models should be a warning sign, not only for the tech industry but also for those concerned with the resilience of democratic discourse. As reliance on AI expands for everything from news aggregation to governmental decision support, the ability of authoritarian regimes to seed their narratives globally—through the backdoor of algorithmic consensus—grows accordingly.

This is not merely a hypothetical concern. Academic research published in preprints and peer-reviewed venues has converged on the view that “true political neutrality” in AI is likely unattainable, given the compositional biases inherent in data collection, algorithmic design, and user interaction. Thus, international competition over information sovereignty has a new front: the “statistical majority vote” of web data that determines what models learn.
Moreover, the ASP report’s findings are part of a broader pattern of Chinese state efforts to shape global information through both overt and covert means. LLMs are simply the newest, most scalable means yet for the spread of these viewpoints, but the same mechanisms could be exploited by other regimes or interest groups. Information pollution—wherein large language models recirculate harmful, misleading, or authoritarian content—poses risks not only to individual users’ worldviews but also to the cognitive security of entire societies.
Recommendations and the Path Forward
To mitigate these risks, the ASP report advocates for a more disciplined approach to AI training data curation. Rather than relying on indiscriminate scraping and retroactive “alignment,” researchers and developers should filter out known sources of propaganda at the earliest stages, and maintain detailed documentation of dataset provenance. This shift would require investments in multilingual fact-checking, partnerships with global civil society organizations, and, possibly, government regulation to ensure minimum standards for data hygiene.

Beyond supply-side interventions, the report calls for greater public education around the epistemic limitations of AI systems. Users, the authors argue, should approach LLM outputs with healthy skepticism—understanding that fluency and plausibility do not equate to factuality, especially on topics mired in international controversy.
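The report does not prescribe a specific pipeline for the curation it recommends. A minimal sketch of what provenance-aware filtering could look like is shown below; the domain blocklist and record structure are hypothetical, and a production pipeline would also need to handle licensing, language, crawl dates, and deduplication.

```python
# Minimal sketch of provenance-aware curation. The domain blocklist and the
# Record structure are hypothetical, not drawn from the ASP report.
from dataclasses import dataclass
from urllib.parse import urlparse

STATE_MEDIA_DOMAINS = {"example-state-outlet.cn"}  # hypothetical blocklist entry

@dataclass
class Record:
    url: str
    text: str

def curate(records: list[Record]) -> tuple[list[Record], list[dict]]:
    """Drop records from blocklisted domains and keep a provenance log
    documenting what was removed or kept, and why."""
    kept, provenance_log = [], []
    for rec in records:
        domain = urlparse(rec.url).netloc
        if domain in STATE_MEDIA_DOMAINS:
            provenance_log.append({"url": rec.url, "action": "dropped",
                                   "reason": "blocklisted state-media domain"})
        else:
            kept.append(rec)
            provenance_log.append({"url": rec.url, "action": "kept",
                                   "reason": "domain not on blocklist"})
    return kept, provenance_log

# Hypothetical usage with placeholder records.
sample = [Record("https://example-state-outlet.cn/a1", "..."),
          Record("https://example-university.edu/paper", "...")]
kept, log = curate(sample)
print(len(kept), len(log))  # 1 2
```

The provenance log is the part most directly tied to the report's recommendation: it is what would allow outside auditors to see which sources were excluded and on what grounds.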
The industry, meanwhile, must reckon with the uncomfortable proposition that “maximum context” and information pluralism are not necessarily neutral or safe. In the absence of a “true barometer of truth”—a tool notoriously difficult to engineer in AI—the fallback is a combination of transparency, ongoing monitoring, and clear user disclosures about the known weaknesses and risks associated with current-generation LLMs.
Critical Analysis: The Challenges of Balancing Openness, Utility, and Security
At a deeper level, the ability of AI models to reflect or amplify authoritarian narratives should be understood as a predictable artifact of their design rather than a sign of malicious intent by Western developers. Language models are, above all, pattern detectors, and when their corpora are shot through with dominant governmental or corporate messages, these patterns become their “facts.” In this light, the overlap with Chinese state language is a symptom of a broader vulnerability: AI’s blurred distinction between frequency and veracity.

A parallel issue—though less discussed—is the inadvertent silencing or muddling of minority perspectives whose linguistic, historical, or political nuances are harder to scrape at scale. The algorithms’ “middle-of-the-road” answer formulation may not only privilege dominant state voices but also marginalize non-mainstream views by default. This effect could be particularly pronounced in multi-lingual settings where official language diverges sharply from local vernacular or diasporic expression.
Nonetheless, the critical weakness exposed by the ASP report lies in the failure of leading AI platforms to implement, or even meaningfully attempt, robust filtration and contextualization mechanisms for content generated in globally sensitive regions or subjects. While some platforms have implemented geofencing or query-based restrictions (usually at the behest of national laws), these do not address the underlying problem: the homogenization of LLMs’ worldview by virtue of their data diet.
On the positive side, the diversity of model design philosophies—ranging from highly-regulated, risk-averse approaches like those of OpenAI, to comparatively unconstrained platforms such as Grok—suggests an ongoing debate within the AI ecosystem about the best balance between openness and safety. Yet this variance underscores the urgent need for shared metrics, independent audits, and possibly even regulatory mandates for transparency and accountability in large-scale AI model deployment.
Conclusion: Navigating a New Age of Algorithmic Influence
The ASP report’s findings are both a warning and a call to action. As generative AI systems become the mediators of global knowledge, the patterns of bias, censorship, and informational slippage they encode acquire outsized consequences. The evidence that American, as well as Chinese, AI models can internalize and repeat state narratives at odds with open-society values is a reminder that technological prowess alone is no guarantee of epistemic resilience.

Going forward, the tech industry, governments, and civil society must work in concert to ensure that the next generation of AI not only wields linguistic fluency but also evidences a disciplined respect for informational integrity. This will demand greater transparency in model training, more scrupulous curation of foundational data, and continuous public scrutiny. Only then can AI fulfill its promise as a tool for enlightenment rather than an unwitting mouthpiece for power—wherever that power may reside.
Source: theregister.com Top AI models parrot Chinese propaganda, report finds