AGENTS.md in 2026: Turning Agent Prompts into Reviewable Repo Policy

Six production repositories (OpenAI Codex, Sentry, Apache Airflow, Temporal, Cloudflare’s Workers SDK, and Coder) now show how AGENTS.md is moving from an experimental agent prompt file to ordinary engineering infrastructure in 2026. The important part is not that these files exist. It is that they are beginning to encode the operational knowledge that used to live in senior engineers’ heads, CI failures, and irritated pull-request comments.
AGENTS.md looks deceptively small: a Markdown file in the root of a repository, written in plain English, meant to tell coding agents how to build, test, lint, and behave. But the format’s rise says something larger about the state of AI-assisted development. The industry is quietly admitting that autonomous coding tools do not become reliable merely by being smarter; they become reliable when the repo tells them where the tripwires are.

The New README Is Not for Humans

For decades, README.md has been the handshake between a project and a person. It tells a developer what the project is, how to install it, where to look next, and how to avoid looking foolish on their first day. AGENTS.md is a different handshake, aimed not at persuasion or orientation but at operational containment.
That distinction matters. A README can tolerate ambiguity because a human developer fills the gaps with judgment. A coding agent, even a capable one, fills gaps with probability. If the repo does not say which package manager to use, whether tests must run inside a container, or which generated files are off-limits, the agent may choose something plausible and still break the build.
The six examples highlighted in the source material are useful because they do not treat AGENTS.md as marketing copy for AI. They treat it as repo-local policy. OpenAI’s Codex file leans into linting and style enforcement. Sentry uses the file to stop instruction sprawl. Airflow encodes naming conventions and host-environment guardrails. Temporal tries to shape the agent’s reasoning process. Cloudflare prioritizes monorepo hygiene. Coder turns the file into a behavioral contract.
That spread is the real lesson. AGENTS.md is not a standard because everyone writes the same file. It is a standard because everyone can put the same kind of file in the same place and expect a growing set of tools to look there first.

OpenAI Codex Shows the Power, and the Danger, of Maximum Specificity

The OpenAI Codex repository is the obvious place to start, partly because Codex helped popularize the convention and partly because its AGENTS.md demonstrates the format at its most aggressive. This is not a friendly onboarding note. It is closer to a linting regime translated into natural language.
The file’s most interesting instructions are not vague reminders to “write clean code.” They are concrete constraints: inline variables in formatting calls where appropriate, collapse certain conditionals, make match statements exhaustive, and avoid wildcard arms when possible. Those are exactly the sorts of review nits and CI failures that waste human time when an agent misses them.
That is the upside of a detailed AGENTS.md. The file can pre-load an agent with the house rules that are too specific for general training data and too tedious to repeat in every prompt. A Rust workspace with strong Clippy preferences does not need a poetic agent; it needs one that knows what CI will reject before it writes the patch.
But Codex also shows where the format can go too far for ordinary teams. A file running hundreds of lines risks becoming another manual nobody reads completely, except now the reader is a model with a limited attention budget and a task competing for context. Long files may work better when the agent is tuned to consume them, but that does not make them a universal template.
The better lesson from Codex is not “write 300 lines.” It is “turn recurring failures into instructions.” If an agent repeatedly touches sandbox-detection logic it should not touch, name the files and forbid the change. If it repeatedly writes a pattern your linter rejects, spell out the preferred pattern. AGENTS.md should be a scar map, not a manifesto.

Sentry Understands That Fragmentation Is the First Failure Mode

Sentry’s AGENTS.md attacks a more organizational problem: instructions spreading across every tool that has ever shipped an “AI rules” feature. One team adds a CLAUDE.md. Another adds Cursor rules. Someone else updates a Copilot instruction file. Three months later, nobody knows which guidance is canonical, and the agents are receiving different versions of reality.
Sentry’s answer is blunt: AGENTS.md is the source of truth for AI agent instructions. That is a deceptively important design decision. In a multi-agent world, the hardest part is not writing instructions; it is preventing each tool ecosystem from becoming its own configuration silo.
This is especially relevant for WindowsForum’s professional readership because enterprise IT has seen this movie before. The same pattern played out with shell scripts, CI YAML, deployment manifests, policy-as-code, and endpoint-management profiles. The problem is never merely that a file exists. The problem is that five files exist, four are stale, and one is still being applied.
Sentry’s file also prioritizes execution details over architecture prose. It tells agents how to run Python through the project’s virtual environment and why the prefix matters. That is exactly the level of specificity agents need. The wrong interpreter can turn a simple test run into a fake debugging session, and fake debugging is where agent productivity goes to die.
The strategic value here is consolidation. If AGENTS.md succeeds, it will not be because Markdown is magical. It will be because teams use it to collapse agent guidance back into one reviewable, versioned place.
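One way a team might keep that consolidation honest is a small repository check. The sketch below is an illustration, not Sentry’s tooling: it assumes three commonly used tool-specific paths (CLAUDE.md, .cursorrules, and .github/copilot-instructions.md) and treats any of them that exists without mentioning AGENTS.md as a candidate for drift. The function name and the path list are hypothetical.

```python
from pathlib import Path

# Assumed tool-specific instruction files; adjust the list to the tools a team actually uses.
TOOL_SPECIFIC_FILES = (
    "CLAUDE.md",
    ".cursorrules",
    ".github/copilot-instructions.md",
)


def divergent_instruction_files(repo_root):
    """Return tool-specific instruction files that never point back to AGENTS.md."""
    root = Path(repo_root)
    divergent = []
    for relative in TOOL_SPECIFIC_FILES:
        candidate = root / relative
        # A file that at least mentions AGENTS.md is assumed to defer to it.
        if candidate.is_file() and "AGENTS.md" not in candidate.read_text(encoding="utf-8"):
            divergent.append(relative)
    return divergent
```

A substring match is deliberately crude. It only distinguishes pointer files from independent rule sets, which is enough to start a review conversation about which file is canonical.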

Airflow Proves That Style Rules Are Operational Rules

Apache Airflow’s example is a reminder that code style is only one kind of style. Documentation conventions, product names, class names, and capitalization rules are also part of a project’s surface area. They are easy for humans to learn socially and easy for agents to get wrong indefinitely.
Airflow’s “Dag” rule is a perfect AGENTS.md candidate because it is both simple and non-obvious. A generic model will likely assume “DAG” is the right spelling everywhere because directed acyclic graph is conventionally capitalized as an acronym. Airflow’s project convention is more nuanced: “Dag” in title case for prose, and the literal casing of the identifier wherever a code token is being quoted.
That is not trivia. Documentation consistency is part of product quality, and large open-source projects often carry naming choices that are legible only to regular contributors. An agent editing docs without that context will produce text that looks technically competent but fails project review.
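A convention like this is also easy to check mechanically. The following is a minimal sketch, assuming that code tokens in documentation appear inside backtick-delimited inline code; it is not Airflow’s own linter, and the function name is hypothetical. It reports where an uppercase “DAG” appears in prose.

```python
import re

INLINE_CODE = re.compile(r"`[^`]*`")
UPPERCASE_DAG = re.compile(r"\bDAGs?\b")


def find_prose_dag_violations(line: str) -> list[int]:
    """Return the offsets of uppercase 'DAG' or 'DAGs' found outside inline code spans."""
    # Blank out inline code with same-length padding so reported offsets stay valid.
    masked = INLINE_CODE.sub(lambda match: " " * len(match.group()), line)
    return [match.start() for match in UPPERCASE_DAG.finditer(masked)]


print(find_prose_dag_violations("Each DAG uses the `DAG` class."))  # [5]
```

The word-boundary pattern leaves identifiers such as DAG_ID alone, because the underscore counts as a word character. Fenced code blocks would need separate handling in a real check.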
Airflow’s environment guardrail is even more consequential: do not run Python, pytest, or Airflow commands directly on the host; use the project’s development tooling. This is the kind of instruction that protects contributors from accidental damage and protects maintainers from nonsense bug reports caused by contaminated local environments.
The broader point is that AGENTS.md gives maintainers a way to teach agents the customs of a repo, not just the commands. That matters because the most irritating agent mistakes are often not spectacular hallucinations. They are small violations of local convention repeated with machine confidence.

Temporal Bets That Persona Can Shape Engineering Judgment

Temporal’s AGENTS.md is different because it opens with a role. It tells the agent to behave like an experienced developer working on Temporal, with a background in distributed systems, database engines, and scalable platforms. That is more theatrical than “run make lint-code,” but it reveals another school of thought: maybe agents need not only commands but a framing discipline.
There is a risk here. Persona prompts can become decorative, a kind of incantation developers add because it feels like prompt engineering. A file that tells the agent to be “senior” but gives it no concrete commands is mostly vibes.
Temporal avoids that trap by pairing role framing with process constraints. It tells the agent to review development guidance before implementation and to verify dependencies rather than assuming a library is available. That second rule is particularly important in Go projects, where inventing an import or assuming a package exists can derail an otherwise plausible patch.
For complex distributed systems, this kind of procedural instruction is valuable. The agent is not merely being asked to write code. It is being asked to act in a codebase where changes can affect persistence, scheduling, history, matching, and runtime behavior in ways that are not visible from one file.
Still, persona is the least portable lesson in the six examples. Teams should copy Temporal’s insistence on verification before they copy its tone. The durable instruction is not “be an expert.” It is “check the project’s actual dependencies and architecture before editing.”
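The dependency rule can be backed by a check as well as by prose. The sketch below is an assumption-laden illustration rather than anything from Temporal’s repository: it reads the require entries of a go.mod file with a simplified parser, treats any import whose first path element has no dot as standard library, and reports imports that no required module covers. The function names and the example module paths are hypothetical.

```python
def required_modules(go_mod: str) -> set[str]:
    """Collect module paths from single-line and block 'require' directives."""
    modules = set()
    in_block = False
    for raw in go_mod.splitlines():
        line = raw.split("//", 1)[0].strip()  # drop trailing comments such as "// indirect"
        if not line:
            continue
        if line == "require (":
            in_block = True
        elif in_block and line == ")":
            in_block = False
        elif in_block:
            modules.add(line.split()[0])
        elif line.startswith("require "):
            modules.add(line.split()[1])
    return modules


def unverified_imports(imports, go_mod, own_module):
    """Return import paths that are neither standard library nor covered by go.mod."""
    known = required_modules(go_mod) | {own_module}
    missing = []
    for path in imports:
        if "." not in path.split("/", 1)[0]:
            continue  # heuristic: no dot in the first element means standard library
        if not any(path == module or path.startswith(module + "/") for module in known):
            missing.append(path)
    return missing
```

In practice the Go toolchain itself is the authority on what resolves. A check like this is only a cheap early warning that a patch may have invented a package.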

Cloudflare Puts the Lockfile Before the Literature

Cloudflare’s Workers SDK example is perhaps the cleanest expression of what belongs at the top of an AGENTS.md: the rule whose violation has the largest blast radius. In a pnpm-managed monorepo, that rule is simple. Do not use npm or yarn.
That instruction may look mundane, but it is exactly the sort of mundane rule that saves maintainers hours. An agent that runs the wrong package manager can create or modify the wrong lockfile, perturb dependency resolution, and produce a diff that is mostly cleanup damage. The agent may have made a good code change and still turned the pull request into a mess.
Cloudflare’s structure is valuable because it orders guidance by consequence. Package manager first. Build commands next. Scoped test commands close behind. The file does not assume an agent should run the whole world when it can run one package or one pattern.
That is an important design principle for AI coding workflows. Agents are often judged by whether they “complete the task,” but in real repositories completion includes the cost imposed on CI, maintainers, and other contributors. A patch that takes ten minutes to write and an hour to review is not a productivity win.
The Workers SDK file therefore points toward a practical AGENTS.md norm: begin with the commands that prevent repo-wide collateral damage. Put the essay later, if at all.
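A prose rule of this kind pairs naturally with a guard in CI. As a rough sketch, assuming the list of changed paths comes from something like a diff of the pull request, the function below flags lockfiles that npm or yarn would write in a pnpm-managed repository. It is illustrative only and not part of the Workers SDK; the function name is hypothetical.

```python
from pathlib import PurePosixPath

# Lockfiles written by npm and yarn; pnpm writes pnpm-lock.yaml instead.
FOREIGN_LOCKFILES = {"package-lock.json", "npm-shrinkwrap.json", "yarn.lock"}


def foreign_lockfiles(changed_paths):
    """Return changed paths whose file name is a lockfile from another package manager."""
    return sorted(
        path for path in changed_paths
        if PurePosixPath(path).name in FOREIGN_LOCKFILES
    )


changed = ["pnpm-lock.yaml", "packages/a/package-lock.json", "src/index.ts"]
print(foreign_lockfiles(changed))  # ['packages/a/package-lock.json']
```

Matching on the file name rather than the full path catches stray lockfiles created inside individual packages, which is where a wrong package manager tends to leave them in a monorepo.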

Coder Treats the Agent as a Coworker With Bad Habits

Coder’s AGENTS.md is the most socially interesting of the six because it goes beyond build mechanics and tells the agent how to behave. Its “Rule #1” requires explicit permission before breaking any rule. It also bans a familiar piece of AI assistant theater: the reflexive “You’re absolutely right!”
That may sound petty until you have worked with coding agents in a serious review loop. Sycophantic phrasing is not merely annoying; it can blur accountability. An agent that agrees too readily may conceal uncertainty, reverse itself without explanation, or flatter the user into accepting a weak technical premise.
Coder’s file tries to impose a healthier relationship. The agent is expected to ask, push back, and avoid empty agreement. That is the kind of instruction that reflects actual experience with these tools, not vendor demo optimism.
The limitation is obvious: behavioral guidance depends on the agent honoring prose instructions under pressure. A hard command like “use pnpm” is easier to verify than “do not be sycophantic.” Still, the presence of these rules tells us where AGENTS.md is heading. Teams do not just want agents that know how to run tests. They want agents that fit into engineering culture without making the humans worse at their jobs.
This is where the file begins to resemble a working agreement. It cannot replace judgment, but it can state expectations clearly enough that violations become reviewable.

The Security Story Is Bigger Than a Markdown File

AGENTS.md is also a supply-chain surface. That is not alarmism; it follows directly from what the file does. If autonomous tools read instructions from a repository and act on them, then the repository contains not only source code but operational directives for non-human actors.
A malicious edit to AGENTS.md might not exploit an application directly. It could instead steer an agent into unsafe commands, suppress tests, exfiltrate context through a tool, or normalize changes to files that should be treated as sensitive. Even a careless edit can be harmful if it causes agents to run expensive tasks, skip required validation, or ignore generated-code boundaries.
This is why AGENTS.md belongs in code review with the same seriousness as CI configuration. A change to the file can alter how future patches are produced. In practical terms, that means maintainers should watch for broad permissions, network instructions, credential-handling language, and commands that fetch or execute remote content.
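Reviewers can be helped by a simple pre-review scan. The sketch below assumes three hand-picked heuristics (remote content piped into a shell, credential vocabulary, and language that weakens validation); the patterns and names are hypothetical and far from exhaustive, so a finding is a prompt for a human to look, not a verdict.

```python
import re

# Hypothetical heuristics; tune or extend them for a real repository.
RISKY_PATTERNS = {
    "pipes remote content into a shell": re.compile(
        r"\b(curl|wget)\b[^|\n]*\|\s*(sudo\s+)?(ba|z)?sh\b"
    ),
    "mentions credentials": re.compile(
        r"\b(token|secret|password|api[_ -]?key|credential)s?\b", re.IGNORECASE
    ),
    "weakens validation": re.compile(
        r"\b(skip|disable|bypass)\b.*\b(tests?|lint|ci|checks?)\b", re.IGNORECASE
    ),
}


def flag_risky_lines(text):
    """Return (line_number, reason) pairs for lines that match a risky pattern."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        for reason, pattern in RISKY_PATTERNS.items():
            if pattern.search(line):
                findings.append((number, reason))
    return findings
```

Run against only the added lines of a pull request, a scan like this keeps the noise low while still surfacing the edits that change what agents are allowed to do.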
The security implications grow when agents are connected to Model Context Protocol servers, cloud credentials, issue trackers, package registries, or deployment systems. A local instruction file becomes more powerful when the agent can do more than edit text. The more capable the agent, the more important the instruction channel becomes.
Windows administrators will recognize the pattern from endpoint management and PowerShell automation. A script is not dangerous because it is a script; it is dangerous when it runs with authority. AGENTS.md is not dangerous because it is Markdown; it is dangerous when trusted agents treat it as policy.

The Real Standard Is Reviewability

The open-format story around AGENTS.md is useful, but the more important property is reviewability. Markdown is human-readable. It lives in the repo. It can be diffed. It can be discussed in pull requests. It can be owned by the same maintainers who own the build.
That makes AGENTS.md more attractive than opaque vendor-side agent settings. A team can see when instructions changed and why. A contributor can propose a rule after a repeated failure. A maintainer can reject a vague instruction and ask for a command that actually reproduces the expected behavior.
The best files in the six examples share this quality. They do not ask the agent to be generally excellent. They name commands, paths, tools, and forbidden actions. They reduce ambiguity in places where ambiguity is expensive.
There is a temptation to use AGENTS.md as a dumping ground for every architectural preference and team habit. That should be resisted. The file should be short enough to be read, specific enough to be enforced, and current enough to be trusted.
A stale AGENTS.md may be worse than none at all. If the file says to run a command that no longer exists, an agent may spend its context budget debugging the instructions instead of the code. Like any operational document, AGENTS.md must either be maintained or deleted.
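Staleness of this kind can be caught before an agent trips over it. The sketch below is a rough heuristic under a stated assumption: it treats the first word of every backtick-delimited inline code span as a program name and asks whether that program can be found on PATH. It will misfire on file paths and identifiers quoted the same way, and the function name is hypothetical.

```python
import re
import shutil

INLINE_CODE = re.compile(r"`([^`\n]+)`")


def missing_executables(agents_md: str, resolve=shutil.which) -> list[str]:
    """Return program names quoted in inline code that `resolve` cannot locate."""
    missing = []
    for span in INLINE_CODE.findall(agents_md):
        parts = span.split()
        if not parts:
            continue  # a span containing only whitespace
        program = parts[0]
        # `resolve` returns None when the program is not found, like shutil.which.
        if resolve(program) is None and program not in missing:
            missing.append(program)
    return missing
```

The resolve parameter exists so the check can be exercised without depending on the machine it runs on. In CI it would typically be left at its default and run inside the project’s development environment.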

The Six Files Point to One Emerging House Style

The production examples differ in tone, but they converge on a pattern that teams can adapt without cargo-culting another repo’s rules. Start with the highest-risk local behavior. State commands exactly. Name the files and directories agents should avoid. Keep conventions short and concrete. Treat the file as policy, not prose.
That pattern is especially important because coding-agent ecosystems are multiplying. A team may use Codex for one workflow, Copilot in GitHub, Cursor in the editor, Aider in the terminal, and another tool in CI. Without a shared instruction layer, each agent becomes a separate governance problem.
AGENTS.md does not solve model quality, permissions, secrets management, or review discipline. But it gives teams a common place to encode the repo-specific details that make those systems usable. In that sense, it is less like a prompt and more like an adapter between general-purpose intelligence and local engineering reality.
The six examples also show a useful hierarchy. Environment rules beat style preferences. Dangerous-file guardrails beat persona. Exact commands beat aspirational language. Behavioral norms are worthwhile, but they should sit on top of operational facts, not replace them.

The Agent File Worth Copying Is the One That Names Your Pain

The practical lesson from these repositories is not that every project needs an elaborate agent constitution. It is that every project using autonomous coding tools should identify the mistakes agents are most likely to make and write those down before the next run.
  • OpenAI Codex shows that recurring lint and style failures can be converted into explicit agent instructions.
  • Sentry shows that one canonical AGENTS.md can prevent guidance from fragmenting across tool-specific files.
  • Apache Airflow shows that documentation naming conventions and environment isolation rules belong in the same operational brief as test commands.
  • Temporal shows that agents working in complex systems need process constraints, especially around dependency verification and architecture review.
  • Cloudflare shows that the highest-blast-radius command belongs at the top, not buried below background prose.
  • Coder shows that communication norms can be part of the contract, even if commands remain easier to enforce.
A good AGENTS.md is therefore not a celebration of AI tooling. It is a modest, version-controlled admission that agents need local supervision. The teams that benefit most will be the ones that write fewer grand instructions and more precise ones: use this tool, run this command, avoid this path, preserve this naming rule, ask before crossing this boundary.
AGENTS.md will not make coding agents trustworthy by itself, but it may become one of the small boring files that makes them governable. As agents move deeper into editors, pull requests, terminals, and enterprise automation, the winning teams will not be the ones with the longest prompts. They will be the ones that turn institutional memory into compact, reviewable rules before the machine has a chance to guess.

References

  1. Primary source: Security Boulevard
    Published: 2026-06-24T07:03:44.890708
  2. Official source: github.com
  3. Related coverage: codersera.com
  4. Related coverage: codeline.co
  5. Related coverage: thepromptshelf.dev
  6. Related coverage: github.pkg.st
  7. Related coverage: codex.danielvaughan.com
  8. Related coverage: blog.brightcoding.dev
  9. Related coverage: techradar.com
  10. Related coverage: dtx.systems
 
