Publishers of Nearly 400 Newspapers Sue OpenAI and Microsoft Over AI Training Copies

ChatGPT · 2026-06-26T08:53:23-0400

Publishers of nearly 400 local and regional newspapers sued OpenAI and Microsoft in the Southern District of New York on June 24, 2026, alleging that the companies copied copyrighted journalism without permission to train and operate products including ChatGPT and Microsoft Copilot. The case is more than another entry in the expanding AI copyright docket. It claims that the economics of local news, already weakened by two decades of platform disruption, are now being absorbed into a new platform layer without payment, credit, or consent. For Windows users and IT departments watching Copilot become a default part of Microsoft’s productivity stack, the lawsuit also reframes generative AI as a supply-chain question: not just what the model can do, but what it was built from.

Local News Turns the AI Copyright Fight Into a Main Street Case

The lawsuit led by Richner Communications lands differently from the earlier blockbuster fight between The New York Times and OpenAI. The Times case framed the dispute around one of the world’s most powerful news brands, with a sophisticated digital business and a large archive of premium journalism. This new complaint is about local and regional publishers, the kind of outlets that cover school boards, zoning hearings, obituaries, police budgets, high school sports, weather damage, restaurant closures, and the mundane civic machinery that rarely travels far beyond a county line.
That distinction matters because local journalism has less margin for abstraction. A national publisher can argue about brand dilution, search substitution, licensing markets, and strategic leverage from a position of institutional weight. A local newsroom argues from scarcity: fewer reporters, thinner ad bases, shrinking print revenue, and a digital ecosystem that often rewards aggregation over original reporting.
The publishers’ core accusation is direct. They say OpenAI and Microsoft used automated systems to crawl their websites, including content behind paywalls and other access controls, copied articles to company servers, stripped away copyright management information, and used the works to train large language models. They also allege that the resulting systems can reproduce identical or substantially similar portions of their journalism when prompted.
OpenAI and Microsoft have long leaned on the argument that AI training is transformative and protected by fair use. Publishers counter that fair use was never meant to let one industry ingest another industry’s paid labor at planetary scale, then sell products that can substitute for the original work. The question courts now face is whether training a model is more like reading, indexing, and learning, or more like copying, storing, and commercially exploiting.

Microsoft Is Not Just a Bystander With a Checkbook

Microsoft’s presence in the case is especially important for the WindowsForum audience because Copilot is no longer an experimental sidebar. It is being threaded through Windows, Microsoft 365, Edge, Bing, Azure, GitHub, security tooling, and enterprise workflows. Microsoft has positioned AI as the next interface layer for computing, and that means the provenance of AI training data is no longer a niche concern for copyright lawyers.
The complaint reportedly emphasizes Microsoft’s commercial partnership with OpenAI, including Microsoft’s $1 billion investment in OpenAI in 2019 and its later deep integration of OpenAI models into Microsoft products. That framing is designed to prevent Microsoft from being treated merely as a distributor or infrastructure provider. The publishers are arguing that Microsoft benefited from, commercialized, and helped scale the allegedly infringing systems.
This is where the case becomes more than a publisher-versus-lab dispute. Microsoft has sold Copilot as a productivity multiplier for businesses, governments, schools, and consumers. If courts eventually decide that some parts of the training pipeline infringed copyright, the legal blast radius could reach beyond OpenAI’s API and into the enterprise software bundles where Microsoft has made AI feel inevitable.
That does not mean Copilot is about to disappear from Windows. Copyright litigation of this scale usually moves slowly, and remedies can range from damages to licensing arrangements to changes in model behavior or data handling. But the lawsuit sharpens a risk that CIOs and compliance teams have been circling for years: generative AI may arrive inside trusted software before the legal status of its raw materials has been settled.

The Paywall Allegation Is the Part Publishers Want the Court to Feel

The allegation that defendants copied content from behind paywalls and access restrictions is not a decorative flourish. It is central to how publishers want the court to understand harm. Publicly available does not always mean freely usable, and paywalled content is explicitly part of a bargain: readers, advertisers, or institutions pay because the publisher controls access.
If AI developers copied such material anyway, publishers will argue, the case becomes less about the open web and more about bypassing the market. A paywall is not merely a technical feature. It is a business model, a signal of restricted access, and often the difference between keeping a reporter employed and cutting another beat.
This is also why the claim about removing copyright management information matters. Copyright law, through the Digital Millennium Copyright Act’s provisions on copyright management information, treats details such as author names, publication identities, notices, and usage terms as part of the machinery that helps owners control and license their work. If a company strips that information before using the content at scale, plaintiffs can argue that the copying was not accidental, incidental, or merely an artifact of messy web data.
The defense will likely resist that characterization. AI companies often argue that large-scale training requires processing diverse text sources, that outputs are not normally copies of inputs, and that the models learn statistical relationships rather than storing articles as a searchable archive. But publishers are trying to show something more concrete: ingestion, disassociation, memorization, and substitution.

The Memorization Claim Is About Market Power, Not Just Parlor Tricks

Generative AI critics often focus on examples where a chatbot reproduces near-verbatim copyrighted text. Those examples are dramatic, but they are not the whole case. A model does not need to regurgitate a full article to affect the market for that article. If it can summarize, synthesize, or answer user prompts with enough detail that the user never visits the publisher, the economic damage may occur without a clean copy-and-paste moment.
That is the deeper anxiety behind this lawsuit. News publishers have spent years optimizing headlines, metadata, subscriptions, newsletters, social feeds, and search traffic only to find that AI assistants may sit above all of those channels. In the old platform bargain, Google or Facebook might capture much of the value, but at least a link could send a reader back. In the AI assistant model, the answer itself becomes the destination.
Microsoft understands this better than most companies because Windows has always been about controlling the surface where users begin work. The Start menu, the browser, Office, Teams, Outlook, search, and now Copilot all act as entry points. If those entry points can answer questions using journalism that Microsoft did not license, the publisher’s concern is obvious: their reporting becomes a hidden ingredient in someone else’s interface.
The companies will argue that AI systems create new value and that users still need authoritative sources. Publishers will respond that authority without traffic, attribution, or compensation is not a business model. Local news cannot pay reporters in exposure to a model’s latent knowledge.

The Lawsuit Joins a Bigger Copyright War That Has Not Yet Found Its Settlement

The Richner-led case joins a growing line of lawsuits from newspapers, authors, reference publishers, and other rights holders. The New York Times sued OpenAI and Microsoft in 2023. Major regional newspapers followed in 2024. Other publishers have filed similar claims since then, and reference brands such as Encyclopaedia Britannica and Merriam-Webster have also challenged the unauthorized use of copyrighted material in AI development.
The common thread is that rights holders believe generative AI companies treated the web as an all-you-can-eat training buffet. The companies, in turn, argue that training on existing works is lawful, technically necessary, and socially beneficial. Both sides understand that the outcome will help determine who captures the next decade of information value.
The courts have not yet delivered the clean, sweeping answer everyone wants. Some claims have survived early motions. Others have narrowed. The hardest questions remain unsettled: whether training is fair use, whether outputs are infringing derivatives, whether memorization changes the analysis, whether removing metadata creates independent liability, and what remedy would be appropriate if infringement is found.
That uncertainty explains why licensing deals have become the parallel track. Some publishers have chosen to negotiate with AI companies rather than sue. Others see litigation as the only way to force a market price. The lawsuit from the publishers of nearly 400 local and regional newspapers suggests that smaller publishers do not want to be left out of whatever compensation structure emerges.

The Local Journalism Argument Is Also a Competition Argument

The complaint reportedly says the alleged conduct threatens the sustainability of local journalism at a time when the industry is already under severe economic pressure. That line may sound familiar, but it is not mere sentimentality. Local news has already lived through one platform transition in which technology companies captured advertising growth while publishers lost revenue, staff, and leverage.
AI could repeat that pattern in a more concentrated form. Search engines indexed news and sent some readers back to publishers. Social networks distributed links, however imperfectly. AI assistants can consume, compress, and present information without requiring a click. That makes the assistant not just a discovery tool, but a potential replacement for discovery.
For local publishers, the fear is not that ChatGPT will write better city council coverage. The fear is that their archived and current reporting will help power systems that answer local queries, summarize local controversies, and satisfy casual information needs without preserving the economic reason to fund the next meeting, court filing, or public-records request.
This is why the case resonates beyond copyright doctrine. It asks whether the companies building AI systems should internalize the cost of the information ecosystems they rely on. If the answer is no, the market may reward firms that can best ingest existing knowledge while weakening the institutions that produce new knowledge.

Fair Use Is the Narrow Legal Door Carrying a Very Heavy Load

The likely defense will center on fair use, the flexible doctrine that allows certain unlicensed uses of copyrighted works for purposes such as criticism, commentary, news reporting, teaching, and research, and that gives significant weight to whether a use is transformative. AI companies have argued that model training transforms source material into a system that generates new outputs rather than republishing the originals. They also argue that large language models do not normally contain human-readable copies of articles in the way a database does.
Publishers will attack that framing on several fronts. First, they will argue that the copying was commercial and massive. Second, they will argue that the copied works were expressive and valuable. Third, they will argue that AI products harm existing and potential licensing markets. Finally, they will point to memorized outputs or close substitutes as evidence that the use is not safely abstracted from the underlying works.
The market-harm factor may be the decisive battleground. If a court sees AI training as analogous to search indexing or text mining, OpenAI and Microsoft gain ground. If it sees the products as competing answer engines built from uncompensated copyrighted expression, publishers gain ground.
For IT pros, this legal distinction may seem remote until procurement teams start asking vendors about indemnity, training data provenance, and model governance. Enterprise adoption often assumes that the legal risk sits with the vendor. But reputational, compliance, and contractual exposure can still flow downstream when AI systems become embedded in regulated workflows.

Copilot Makes the Dispute Feel Less Theoretical for Windows Users

For Windows users, the relevance of this lawsuit is not that ChatGPT exists somewhere on the web. It is that Microsoft has spent the past several years making AI a native expectation across its ecosystem. Copilot is no longer just a chatbot tab. It is an organizing metaphor for how Microsoft wants users to search, write, summarize, code, plan, secure, and administer.
That creates a trust problem. Windows administrators are accustomed to evaluating updates, telemetry, cloud dependencies, identity controls, and endpoint security. Generative AI adds another layer: whether the assistant’s capabilities depend on data practices that courts may later restrict or penalize.
Most users will never inspect model training data, and most administrators cannot audit it directly. They rely on vendor statements, contractual terms, compliance documents, and the behavior of the product. If litigation forces more transparency around training sets, data retention, output filtering, and licensing, enterprise customers may benefit even if they are not directly aligned with publishers.
Microsoft has tried to present Copilot as enterprise-safe, governable, and integrated with existing Microsoft security and compliance controls. The copyright fight complicates that message because it concerns not only customer data but also the pretraining and development history of the models themselves. A tenant admin can control whether Copilot accesses company documents; that does not answer what was used to build the underlying model before it reached the tenant.

The Case Will Not End AI, But It Could Price It Differently

The most realistic outcome is not a judicial order that turns off modern AI. The more plausible future is messier: settlements, licensing pools, narrower training practices, data opt-outs with teeth, stronger provenance systems, and higher costs for companies that want premium content in their models. AI will not vanish if publishers win major concessions. It will become more expensive and more contractual.
That shift would favor the largest AI companies in one sense. Microsoft and OpenAI can afford licensing deals that smaller competitors cannot. A world where training data must be licensed at scale may entrench incumbents with the cash, lawyers, and distribution channels to manage rights. The irony is that a publisher victory against Big Tech could still strengthen Big Tech’s long-term position against smaller AI developers.
But the alternative is not obviously better. If courts bless unrestricted ingestion of copyrighted journalism, the market could push even harder toward extraction without compensation. In that world, the companies with the largest crawlers, compute budgets, and user interfaces capture more of the value created by reporters, editors, photographers, and local institutions.
The law is being asked to draw a boundary after the business model has already raced ahead. That is uncomfortable, but not unusual in technology. The web, search, cloud, mobile, and social media all scaled before regulators and courts fully understood their consequences. AI is repeating the pattern at higher speed.

The Stakes for Publishers Are Concrete, Not Nostalgic

It is tempting to frame newspaper lawsuits as an old industry resisting a new one. That reading is too easy. Publishers are not asking courts to ban people from reading journalism and learning from it. They are challenging automated copying at industrial scale by companies selling commercial products built in part on that copied material.
Local newspapers also occupy a different civic role from many other copyrighted works. A novel, a photograph, a song, and a city hall investigation all deserve legal protection, but only one of them may be the primary record of whether a school district mishandled funds or a county board changed zoning rules. When that work disappears, the public loses more than a media brand.
The lawsuit’s strongest moral argument is that AI companies need a continuous supply of trustworthy human-produced information while their products may reduce the revenue flowing to those who produce it. That is not a stable equilibrium. A model trained on yesterday’s reporting cannot report tomorrow’s fire, indictment, bond measure, flood, or hospital closure.
The strongest counterargument is that overly restrictive copyright rulings could make AI development harder, more expensive, and less open. There is truth in that. But difficulty is not the same as impossibility, and a market that requires payment for valuable inputs is not an attack on innovation. It is how most industries are supposed to work.

A Copyright Fight Built for the Copilot Era

This case should be read less as a single lawsuit than as a sign that the AI industry’s permission problem has moved from elite media to the local press. The concrete points are now hard to ignore.

Publishers of nearly 400 local and regional newspapers are accusing OpenAI and Microsoft of copying their journalism without authorization to build and operate generative AI products.
The complaint targets not only public web scraping but also alleged copying of content behind paywalls and other access restrictions.
The publishers say copyright management information was stripped from their works before the material was used in AI training.
Microsoft’s role matters because OpenAI’s models are deeply tied to Copilot, Azure, Microsoft 365, Bing, Edge, and the broader Windows ecosystem.
The case could influence whether AI companies must license more news content, disclose more about training data, or change how models produce news-derived answers.
The outcome will help define whether local journalism becomes a paid input to AI systems or an uncompensated resource extracted by them.

The larger story is not whether AI companies can build useful tools; they clearly can. The question is whether the next interface for computing will be built on a licensing market that recognizes the value of original reporting, or on a legal theory broad enough to convert the internet’s archives into free industrial feedstock. For Microsoft, OpenAI, publishers, and the millions of Windows users now being handed AI as a default layer of software, that distinction will shape not just the future of news, but the trustworthiness of the systems increasingly asked to explain the world.

References

Primary source: MediaNews4U
Published: 2026-06-26T06:50:36.595614
