Sydney Study Finds Copilot Skews News to Global Outlets, Local Journalism at Risk

A recent University of Sydney analysis finds that Microsoft Copilot’s AI-generated news summaries frequently sideline Australian journalism, privileging US and European outlets and leaving local reporters and regional voices largely invisible in the automated news feed.

Background

Generative AI assistants have moved from novelty to everyday tool: built into operating systems, search engines and productivity suites, they now serve as a common gateway to news for many users. That shift has prompted multiple audits and studies by news organisations and academic teams to test whether these systems preserve editorial provenance, local relevance and accuracy when summarising current events. Several coordinated audits — including journalist-led reviews across public broadcasters — have documented widespread sourcing problems, factual errors and cases where AI outputs compress or erase critical context.
The University of Sydney’s paper, Invisible journalists and dominant algorithms, led by Dr Timothy Koskie of the Centre for AI, Trust and Governance, adds a geographically specific lens: it asks whether Copilot’s news summaries surface Australian journalism and local bylines in the ways that Australians themselves expect and rely on. The short answer from Koskie’s sample: not often.

Overview of the study and headline findings

Dr Koskie’s analysis examined hundreds of Copilot news responses generated from globally oriented prompts and found a structural skew:
  • Only about one-fifth of Copilot’s news-summary responses included links to Australian media sources.
  • When Australian outlets did appear, they were typically national heavyweights (for example, Nine or the ABC), not independent or regional outlets.
  • Local journalists were not credited in the sampled summaries; human reporters and bylines were effectively erased from the AI narrative.
  • Copilot’s responses disproportionately cited US and European outlets such as CNN and the BBC even for queries prompted by an Australian user context.
Koskie frames these findings as more than a technical quirk: they mirror, and risk accelerating, existing structural weaknesses in Australia’s media ecology—concentrated ownership, declining independent outlets, and widening news deserts in regional communities.

Why this matters: editorial provenance, traffic and the economics of journalism

AI assistants that summarise news introduce three immediate, interlocking risks for publishers and for democratic information ecosystems:
  • Referral traffic loss. When users accept an AI summary and don’t follow through to the original article, publishers lose pageviews and ad or subscription opportunities. For outlets already under financial pressure, those lost referrals can be existential.
  • Attribution erosion. Summaries that do not show bylines or that pull quotes and facts without transparent sourcing reduce the visibility of journalists’ labor and undermine the signal that distinguishes verified reporting from noise. Audits have found repeated sourcing failures in assistants’ outputs.
  • Local trust gap. Research consistently shows people trust local news for community-level information; if assistants surface non-local outlets instead, trust and civic engagement tied to local reporting may fall. Koskie’s work underlines that AI may make the already-invisible even more invisible.
These mechanics matter in the Australian context where the revenue base for local journalism is particularly fragile: if a widely deployed assistant routes users away from local coverage and toward large international outlets, the result is a transfer of attention and revenue that compounds existing concentration pressures.

How Copilot and similar assistants reach these outcomes: a technical and product analysis

To understand why Copilot sidelines local journalism, it helps to break the product pipeline into three interacting layers that auditors repeatedly identify:
  • Retrieval / grounding layer. This layer fetches candidate sources and documents from an index or web. If retrieval favors high-SEO or globally indexed sources over smaller regional sites, the generator receives a biased evidence set. Audits show retrieval often surfaces SEO-optimized pages with the appearance of authority even when they lack local reporting depth.
  • Generative model (LLM) layer. The model composes fluent summaries based on retrieved evidence and internal parameters. Without strict constraints, the model will prioritize concision and fluency over granular provenance, which can compress nuance and omit bylines or local place names.
  • Provenance / presentation layer. This layer controls how sources are displayed (inline citations, links, excerpting rules). When provenance is superficial or absent, users cannot assess whether the assistant relied on a local investigative report or an international wire story. Many audits flagged weak or ceremonial citation practices that fail to substantiate claims in a verifiable way.
Product incentives exacerbate the technical tendencies. Assistants are optimized for helpfulness and conversational flow; they are typically penalised less for omitting citations or local nuance than for failing to appear informative. That combination favors summary fluency and global sources with large crawl footprints over smaller, more authoritative local outlets.
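The retrieval-layer bias described above can be illustrated with a minimal sketch. The weights and scoring heuristic here are hypothetical, not Copilot's actual ranking logic: the point is simply that ranking on global authority signals alone buries local outlets, while an explicit locale boost surfaces them.

```python
from dataclasses import dataclass

@dataclass
class Source:
    outlet: str
    country: str       # ISO country code of the outlet
    authority: float   # global SEO / index-footprint signal, 0..1

def rank(candidates, user_country=None, locale_weight=0.0):
    """Rank candidate sources for the generator's evidence set.
    locale_weight > 0 boosts outlets matching the user's country
    (a hypothetical geographic heuristic)."""
    def score(s):
        boost = locale_weight if s.country == user_country else 0.0
        return s.authority + boost
    return sorted(candidates, key=score, reverse=True)

candidates = [
    Source("CNN", "US", 0.95),
    Source("BBC", "GB", 0.93),
    Source("ABC News (AU)", "AU", 0.80),
    Source("Regional Daily", "AU", 0.40),
]

# Authority-only ranking: global outlets dominate the evidence set.
top_global = rank(candidates)[0].outlet  # "CNN"

# With a locale boost for an AU user, Australian outlets surface first.
top_local = rank(candidates, user_country="AU", locale_weight=0.2)[0].outlet
```

Because the generative layer only sees what retrieval hands it, a bias at this stage propagates into every downstream summary regardless of how well the model itself behaves.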

Evidence from audits: reliability problems and the scope of errors

Independent reviews led by public broadcasters and newsrooms (including multi‑broadcaster audits) have repeatedly documented substantial failings when assistants summarise news:
  • A coordinated EBU audit involving dozens of public broadcasters reviewed thousands of assistant replies and found a large share contained at least one significant issue — from missing or incorrect sourcing to temporal errors and invented details.
  • A BBC experiment that fed AI systems with 100 BBC stories and had journalists evaluate the outputs reported that over half of the AI-generated responses contained material problems; a notable fraction of responses that quoted BBC content introduced factual errors or altered quotations. Those findings emphasize that even when assistants reference major outlets, the transformation into a “summary” can damage fidelity.
  • Journalist-led field experiments focused on local news show retrieval failures and fabricated or broken links when assistants are asked for verifiable sources. One month-long experiment logged hundreds of URLs returned by assistants; a high proportion were broken, incomplete, misattributed, or outright inventions. That kind of brittle retrieval undermines the assistant’s role as a gateway to trustworthy information.
Taken together, these audits demonstrate that the problem is not limited to any single vendor or model — it is systemic to present-day retrieval+generation pipelines.

The geopolitical and cultural tilt: why US/European outlets dominate AI summaries

Several technical and market forces conspire to privilege large Anglophone and Western outlets in assistant outputs:
  • Indexing footprint. Large international outlets have vast, well-indexed archives and strong SEO signals; retrieval systems are more likely to find and prefer those sources over smaller local sites.
  • Training and alignment bias. Models and ranking systems are often trained on corpora dominated by major outlets and global content, creating a familiarity bias when summarising “authority.”
  • Commercial partnerships and licensing. Tech firms increasingly negotiate licensing arrangements with major publishers to “ground” assistant outputs in licensed archives; those deals naturally favour large, global publishers with negotiating power, potentially crowding out smaller local partners. Industry reporting shows Microsoft has piloted publisher deals and a Publisher Content Marketplace model to license content and provide compensation — but the economics and coverage of those deals are uneven and often opaque.
The result is a structural tilt: assistants often present the world through the lens of large international outlets, reproducing an information geography that sidelines smaller, regional journalism.

Democracy at risk: pluralism, editorial plurality and news deserts

Koskie warns — and audits corroborate — that the combined effect of automation, poor provenance and commercial consolidation can deepen news deserts and shrink the range of independent voices. This is not a theoretical worry: shrinking local coverage reduces public oversight of local government, lowers civic participation and can centralise agenda-setting in national or international outlets.
The mechanism is straightforward: automated summaries reduce clicks and direct referrals, starving small outlets of the engagement and revenue they depend on. Over time, that dynamic fosters consolidation and fewer investigative beats — a self-reinforcing loop that AI systems as presently deployed can accelerate rather than mitigate.

Strengths and counterarguments

AI summarisation is not purely harmful; it offers clear, measurable benefits:
  • Speed and triage. Assistants can quickly surface a condensed briefing that helps busy users orient to breaking developments.
  • Accessibility. Summaries can lower barriers for readers who find long-form reports time-consuming or difficult to parse.
  • Potential for grounding. When paired with robust retrieval and licensing, assistants can provide verified summaries with clear attribution — and some platform efforts are explicitly experimenting with publisher licensing to improve provenance.
Nevertheless, these potential strengths are conditional. Without enforced provenance, transparent linking, and incentives that route users back to original reporting, the net effect risks favoring convenience at the expense of journalistic pluralism and the economic health of local newsrooms.

Policy levers and practical remedies

Koskie and other researchers propose policy and product interventions to reduce harms and rebalance incentives. The suggestions cluster into four pragmatic approaches:
  1. Extend bargaining and licensing frameworks to include AI experiences. Existing news bargaining codes or media bargaining mechanisms should be adapted to explicitly cover AI-driven summarisation and to ensure transparent remuneration and attribution for content surfaced in assistants. Pilot publisher programmes are a start, but policymakers can set baseline rules to prevent unilateral exclusion of smaller outlets.
  2. Mandate provenance and link-first defaults. Assistants should default to showing clear, clickable source attributions and preserve bylines and publication dates in any news summary. Product designs should prefer a “link-first” model that routes users to the original article rather than burying it behind a generated paragraph. Audits show that improved provenance measurably increases trust and accountability.
  3. Require geographical sensitivity in retrieval. Retrieval systems should incorporate explicit geolocation heuristics and weighting so that queries from a given country surface relevant local outlets unless the user expresses a preference for global coverage. This could be implemented as a configurable policy knob in product UIs and developer APIs.
  4. Independent auditing and transparency obligations. Regular, independent audits of assistant outputs should be required to monitor sourcing quality, bias, and factual fidelity — with results published in accessible summaries. Multilateral audits already reveal systemic issues; institutionalising this oversight would keep vendors accountable.
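The “link-first” default in point 2 can be sketched concretely. The rendering format below is hypothetical (not an actual Copilot interface): the idea is that clickable attributions with bylines and publication dates precede the generated paragraph rather than being buried beneath it.

```python
def render_link_first(summary, sources):
    """Render a news answer link-first: source attributions, each with
    outlet, byline and publication date, come before the generated
    summary text. A hypothetical presentation format for illustration."""
    lines = [
        f"- {s['outlet']}: \"{s['headline']}\" by {s['byline']} "
        f"({s['date']}) -> {s['url']}"
        for s in sources
    ]
    return "\n".join(lines + ["", summary])

answer = render_link_first(
    "The council approved the levee upgrade after a two-year campaign.",
    [{
        "outlet": "Regional Daily",
        "headline": "Council approves levee upgrade",
        "byline": "Jane Citizen",
        "date": "2025-06-01",
        "url": "https://example.com/levee",
    }],
)
```

A design like this keeps the journalist's name and the original link in the user's eyeline, which is precisely the provenance signal audits found missing.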
These levers are complementary: licensing without provenance fails to restore the citations that create publisher value; provenance without economic recompense fails to address the referral loss that threatens newsroom viability.

What vendors and publishers are doing — and why current measures fall short

Vendors are experimenting with publisher partnerships and content licensing to create a marketplace for grounding content used by assistants. Microsoft, for instance, has piloted publisher deals and describes a Publisher Content Marketplace intended to pay partners and ground answers in licensed archives. That approach can help but comes with trade-offs: most deals so far involve major publishers, terms are often undisclosed, and smaller outlets may lack bargaining power to participate equitably.
Publishers, for their part, are pushing for stronger contractual safeguards, clearer attribution rules and fees that recognise both retrieval uses and potential training rights. But without regulatory guardrails, these negotiations can entrench incumbent advantages and leave independent outlets excluded.

Practical guidance for newsrooms, IT teams and users

For newsroom managers and local publishers:
  • Prioritise clear metadata and machine-readable signals (structured metadata, sitemaps, content licensing flags) so retrieval systems can detect and prioritise local reporting.
  • Negotiate onboarding to publisher marketplaces where possible, but press for transparency on scope and proof-of-use reporting.
  • Invest selectively in syndication and content formats that preserve bylines and passage-level attribution.
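The machine-readable signals recommended above commonly take the form of schema.org structured data. The sketch below builds a JSON-LD `NewsArticle` object with the fields retrieval systems can use to detect bylines, dates and a story's locality; the outlet, author and story details are invented for illustration.

```python
import json

def news_article_jsonld(headline, author, published, publisher, locality):
    """Build schema.org NewsArticle structured data exposing the
    byline, publication date and the story's geographic focus."""
    return {
        "@context": "https://schema.org",
        "@type": "NewsArticle",
        "headline": headline,
        "author": {"@type": "Person", "name": author},
        "datePublished": published,
        "publisher": {"@type": "NewsMediaOrganization", "name": publisher},
        "contentLocation": {"@type": "Place", "name": locality},
    }

# Hypothetical local story used only to show the shape of the markup.
markup = news_article_jsonld(
    "Council approves flood levee upgrade",
    "Jane Citizen", "2025-06-01", "Regional Daily", "Wagga Wagga, NSW",
)

# Embedded in a page head so crawlers and retrieval indexes can read it.
script_tag = f'<script type="application/ld+json">{json.dumps(markup)}</script>'
```

Emitting `contentLocation` alongside the byline gives a geographically sensitive retrieval layer exactly the signal it needs to prefer local reporting for local queries.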
For enterprise IT and platform teams:
  • When deploying assistants internally, require provenance-first configurations and restrict any production use that substitutes AI summaries for certified reporting in decision workflows.
  • Maintain a “verify-first” policy for critical domains (health, legal, compliance) whereby AI summaries must be accompanied by direct links to source materials.
For individual users:
  • Treat AI summaries as orientation tools, not substitutes for original reporting; follow through to primary sources, especially for consequential topics.
  • Prefer assistant settings that show source links and publication dates.

Caveats and where the evidence remains thin

Several important caveats deserve emphasis:
  • Assistant behavior changes rapidly with model and index updates. Findings from a sample of responses are informative but not immutable; product updates or licensing shifts can alter retrieval patterns. Any snapshot should be treated as indicative rather than definitive.
  • Attribution of causality between AI summarisation and newsroom closures is complex: AI is a compounding factor atop advertising declines, platform market power and changing consumer habits. The risk is real, but disentangling relative contributions requires longitudinal economic study.
  • Some vendor programmes aim to mitigate harms; their efficacy must be judged by published metrics and independent audits, not vendor claims alone. Current pilot agreements lack consistent public disclosure, limiting external assessment.
Where claims about intent, product behaviour or commercial deals are not independently documented, they should be flagged as based on vendor or interviewee accounts until corroborated by contract texts or audit data.

Conclusion: design, policy and the future of local reporting

The University of Sydney’s findings are a clarion call: automated news summarisation, left as a default behaviour inside widely deployed assistants, will reconfigure how audiences encounter journalism. At its best, AI can help triage vast amounts of information and direct readers to high-quality reporting; at its worst, it can render invisible the journalists who produce that reporting and accelerate a shift of attention to global incumbents.
The path forward demands a mix of technical redesign, publisher bargaining and public policy to preserve provenance, restore economic incentives for reportage, and embed geographical sensitivity into retrieval systems. Without those corrections, AI assistants risk compounding the very crises in media pluralism they ought to help alleviate. The next phase of this technology’s public tests should be judged not just by headline utility, but by whether it sustains the underlying ecosystems of independent, local journalism that democracies rely upon.

Source: The Guardian Australian journalism ‘sidelined’ in AI-generated news summaries on Copilot, research shows