Microsoft RAMPART and Clarity: CI Safety for Agentic AI

ChatGPT · May 20, 2026

Microsoft on May 20, 2026, announced RAMPART and Clarity, two open source AI safety tools aimed at helping developers test agent behavior in CI pipelines and examine product assumptions before implementation, as enterprise agents gain access to email, CRM records, code execution, and business workflows. The important part is not that Microsoft has published another pair of GitHub projects. It is that Redmond is trying to drag AI safety out of the review meeting and into the same machinery that already governs modern software delivery. If agentic AI is going to become infrastructure, then safety can no longer be treated like a launch checklist.

Microsoft Moves Safety Left Because Agents Have Moved Right Into Production

For years, the default safety model for generative AI looked like a content moderation problem. A model produced text, an evaluator judged whether the text was toxic, misleading, or disallowed, and a product team tuned prompts, filters, and refusal behavior until the demo looked stable enough to ship. That model was always incomplete, but it was at least aligned with the shape of the systems being deployed.
Agentic AI breaks that bargain. A chatbot that says the wrong thing can cause reputational damage, leak sensitive information, or mislead a user; an agent that calls tools can alter files, send messages, retrieve restricted records, trigger purchases, update tickets, or execute code. The safety boundary is no longer the text box. It is the entire chain of retrieved context, tool permissions, side effects, identity delegation, audit logging, and recovery.
That is why RAMPART and Clarity matter. They are not positioned as silver bullets, and Microsoft is not pretending that open source frameworks will make AI agents safe by default. The pitch is more pragmatic: if developers are going to build agents at software speed, they need safety artifacts that behave like software artifacts.
RAMPART is the more immediately technical of the two. It gives engineering teams a way to encode adversarial and benign agent scenarios as repeatable tests, run them with pytest, and gate changes in continuous integration. Clarity attacks the earlier failure mode: the moment when a team decides what the agent should be allowed to do before anyone has written down the assumptions, tradeoffs, or failure cases.
Together, they sketch a familiar shift in security history. The industry once treated application security as a specialist audit conducted near the end of a project. Over time, the winning pattern became code review, dependency scanning, threat modeling, fuzzing, unit tests, integration tests, CI gates, and production telemetry. Microsoft is betting that AI safety will follow the same path — not because the analogy is perfect, but because no other operating model scales.

RAMPART Turns Red-Team Findings Into Regression Tests

RAMPART’s basic premise is simple enough to sound obvious: when a red team finds a way to make an AI agent behave dangerously, that finding should not live forever in a PDF. It should become a test. The next time a developer changes the agent’s prompt, adds a tool, modifies retrieval behavior, or swaps an underlying model, the old failure should be exercised again.
That sounds like ordinary software engineering, because it is. Traditional bugs become regression tests. Security vulnerabilities become test cases, static analysis rules, or exploit reproductions. Reliability incidents become chaos experiments or service-level objectives. AI incidents, by contrast, too often remain tribal knowledge: the thing that happened in staging, the weird prompt injection from a document, the model behavior that disappeared after a mitigation and then quietly returned two releases later.
RAMPART is meant to close that loop. Developers write pytest tests describing agent scenarios drawn from their threat model. A thin adapter connects the framework to the agent under test. The test orchestrates an interaction, observes the outcome, and evaluates whether the agent stayed within the intended boundary. A passing or failing signal can then be treated like any other integration test result.
The choice of pytest is not incidental. Microsoft is not asking teams to adopt an exotic safety lab workflow or hand their release process to a separate class of AI auditors. It is trying to meet Python-heavy AI engineering teams where they already are. The framework’s value depends less on conceptual novelty than on whether teams can make safety testing boring enough to run on every pull request.
The deeper move is cultural. In Microsoft’s framing, engineers own the tests, engineers run the tests, and engineers fix the failures. Red teams still matter, but their findings become durable engineering assets rather than one-time discoveries. That is the distinction between safety as an event and safety as an operating system.

Prompt Injection Is No Longer a Parlor Trick

RAMPART’s most mature coverage currently focuses on cross-prompt injection, a class of attack that has gone from novelty to operational concern as retrieval-augmented generation and agent tooling have spread. The core idea is unpleasantly simple: an agent consumes untrusted content, such as an email, document, support ticket, web page, or CRM note, and that content contains instructions designed to manipulate the agent’s behavior.
This is not the same thing as a user typing “ignore previous instructions” into a chat box. Cross-prompt injection is more insidious because the attacker may never interact directly with the agent. They plant malicious instructions in data the agent is expected to read, and the agent later treats that content as context while deciding what to do.
For enterprise IT, that distinction is everything. An employee can be trained not to paste secrets into a chatbot, but they cannot individually inspect every document an agent retrieves from a shared repository. A helpdesk agent may need to read incoming tickets. A sales agent may need to parse customer emails. A coding agent may need to inspect dependencies and documentation. The attack surface is not the prompt window; it is the information environment the agent inhabits.
RAMPART’s focus on observable behavior is therefore more important than its focus on malicious text. Agent safety ultimately comes down to actions. Did the agent invoke a tool it should not have used? Did it send data to the wrong place? Did it alter a record after reading untrusted instructions? Did it respect the boundary between user intent, system policy, and retrieved content?
That is also why conventional unit tests do not map cleanly onto this world. A unit test can assert that a function returns a deterministic value. An agent test has to contend with model variability, tool selection, multi-turn context, retrieval behavior, and stochastic outputs. It must evaluate not just what the model said, but what the system did.

Probabilistic Software Needs Probabilistic Gates

One of RAMPART’s more telling design choices is support for statistical trials. Instead of pretending that one run of an LLM-backed agent proves anything definitive, a test can run multiple times and apply a threshold: this behavior must be safe in a specified percentage of runs.
That will make some software purists uncomfortable. A test that passes 80 percent of the time sounds less like a release gate and more like an admission of uncertainty. But that discomfort is the point. LLM systems are already probabilistic in production; refusing to reflect that in testing does not make them deterministic, it only makes the test suite dishonest.
The hard question is what thresholds should mean. A policy that allows an agent to behave safely in 80 percent of runs may be tolerable for a low-risk drafting assistant and absurd for an agent that can modify firewall rules, approve expenses, or access regulated data. The existence of statistical testing does not relieve teams from making risk decisions. It forces them to make those decisions explicitly.
This is where RAMPART could either mature into a useful engineering discipline or become another checkbox. If teams use statistical trials to measure risk, compare mitigations, and block regressions, the framework will help. If they tune thresholds until builds pass and call the result safety, it will simply automate wishful thinking.
The better reading is that Microsoft is acknowledging the shape of the problem. LLM behavior cannot be fully captured by one golden output. Agent behavior cannot be validated by a single happy-path transcript. Safety testing needs repeated attempts, adversarial variation, outcome inspection, and a tolerance model that matches the harm being managed.

Clarity Attacks the Failure Before the Vulnerability

If RAMPART is about catching dangerous behavior after an agent exists, Clarity is about challenging the assumptions that cause dangerous systems to be built in the first place. That makes it the less flashy tool and perhaps the more ambitious one.
Microsoft describes Clarity as a structured sounding board for software teams. It guides conversations around problem definition, solution exploration, failure analysis, and decision tracking. The output is not a compiled artifact or a runtime guardrail. It is a set of human-readable markdown files stored in a .clarity-protocol/ directory inside the repository.
That may sound bureaucratic until you consider how many AI product failures begin as vague intent. A team says it wants an agent to “handle customer requests,” “help developers ship faster,” or “automate back-office workflows.” Those phrases conceal the real design questions: which systems can the agent touch, whose authority does it act under, what happens when retrieved content conflicts with policy, when should it ask for confirmation, and what failures are acceptable?
Clarity’s thesis is that modern AI development has made execution cheap enough that intent has become the scarce resource. Coding agents, scaffolding tools, and model APIs can move a team from idea to prototype with startling speed. But a faster path to the wrong architecture is not progress. It is technical debt with a demo.
The example Microsoft gives — a team adding real-time collaboration to a document editor — is deliberately mundane. Before debating implementation details, Clarity would push the team to distinguish between true real-time co-editing, presence indicators, conflict resolution, and the simpler requirement that nobody lose work. Those choices imply different architectures, different failure modes, and different operational burdens.
For agentic AI, that early pressure is even more important. A product team may decide that an agent needs access to email, calendar, CRM, and document repositories because the demo feels magical when everything is connected. Clarity’s job is to slow that moment down and ask what could go wrong before the permission set becomes an engineering assumption.

Markdown in the Repo Is a Governance Choice

The most interesting thing about Clarity may be where it puts its output. By writing design artifacts into a repository as plain markdown, Microsoft is trying to make assumptions reviewable, diffable, and stale in the same way code is.
That matters because many organizations already have governance processes for architecture documents, risk reviews, and product requirements. The problem is that those documents often live somewhere else: a wiki page, a slide deck, a project management ticket, a compliance folder, or the memory of the product manager who left last quarter. They are adjacent to the software but not part of it.
A .clarity-protocol/ directory is a small design decision with a large implication. It says the problem statement, solution rationale, failure analysis, and decision record belong beside the code that implements them. If the problem changes, the design artifacts should change. If a pull request alters the agent’s permissions, the assumptions behind those permissions should be visible to reviewers.
Clarity also tracks staleness across documents as a dependency graph. That is exactly the kind of feature that sounds minor until a system has been in production for a year. Requirements drift. Threat models decay. Failure analyses become inaccurate. A design decision that was reasonable when an agent could only summarize tickets may become reckless after the same agent gains write access to customer records.
In conventional software, stale documentation is annoying. In agentic systems, stale assumptions can become latent security bugs. The agent may be doing exactly what the code permits, while the organization still believes it is constrained by a design that no longer exists.
The review packet feature is similarly practical. Stakeholders often do not need every line of design deliberation, but they do need a coherent narrative before approving a launch or architecture change. Clarity’s promise is to generate that narrative from artifacts that were produced during development rather than reconstructed after the fact.

The “AI Thinkers” Pattern Is Useful, but It Is Not Magic

Clarity’s failure analysis includes multiple AI “thinkers” examining a proposed system from different angles, including security, human factors, adversarial scenarios, and operational concerns. This is a natural use of LLMs: not to make the decision, but to expand the set of things humans consider before making it.
Used well, that can be valuable. A security reviewer may spot privilege escalation but miss usability failure. A product manager may understand workflow risk but underestimate adversarial manipulation. An operations engineer may worry about observability and rollback. Structured multi-perspective analysis can help teams avoid designing from a single institutional bias.
But the pattern also deserves skepticism. AI-generated critiques can sound comprehensive while missing the specific failure that matters. They can over-index on generic risks, invent implausible scenarios, or produce a comforting sense that “failure analysis was performed” when the real work of prioritization never happened.
The distinction is whether Clarity becomes a conversation engine or a rubber stamp. Microsoft’s framing leans toward the former: the team works through the results, groups related failures, traces causal chains, and builds management plans. That human loop is essential. An AI tool can widen the aperture, but it cannot carry accountability.
For WindowsForum’s IT-pro audience, this is the governance lesson hiding inside the product announcement. Tools like Clarity are only as strong as the process they inhabit. If engineering leadership rewards teams for closing the file and moving on, the artifact becomes theater. If review culture treats the artifact as part of the system, it can become institutional memory.

Microsoft Is Building the Rails Around Its Own Agent Bet

The timing of RAMPART and Clarity is not accidental. Microsoft has spent the last several years pushing Copilot-branded experiences across Windows, Microsoft 365, developer tools, security products, and cloud platforms. It is also investing in frameworks and infrastructure for multi-agent systems, AI evaluation, governance, and red teaming. The company has every incentive to make agent development feel safe enough for enterprise buyers.
That does not make the tools cynical. It makes them strategic. Microsoft wants developers building agentic systems on its platforms, using its models, its cloud, its identity layer, its security products, and its developer workflows. The more autonomous those systems become, the more customers will ask how they can be tested, governed, audited, and rolled back.
Open sourcing RAMPART and Clarity serves several purposes at once. It gives Microsoft a public answer to growing agent-safety concerns. It lets external teams inspect, extend, and adapt the tools. It also nudges the market toward a model of AI safety that looks a lot like modern DevSecOps, where Microsoft already has enormous surface area through GitHub, Azure, Visual Studio Code, Microsoft Defender, Entra, and its broader enterprise stack.
There is a defensive dimension too. If agentic AI suffers a wave of high-profile enterprise failures — data leaks, accidental transactions, privilege abuse, poisoned retrieval attacks — the entire category becomes harder to sell. Safety tooling is not just moral positioning. It is market infrastructure.
Microsoft’s earlier PyRIT work sits in the background here. PyRIT was built as an open automation framework for red teaming generative AI systems, useful for security researchers and specialists probing deployed or near-deployed systems. RAMPART builds on that lineage but changes the target user. The audience is no longer only the AI red team. It is the product engineer writing tests while the system is still being built.
That shift matters. Security teams can find problems, but they rarely have enough bandwidth to continuously validate every agent change across every product team. If safety is going to scale, the testable pieces have to move into engineering workflows.

Enterprise IT Should Read This as a Process Announcement

The immediate temptation is to ask whether RAMPART and Clarity are “good tools.” That question matters, but it is too narrow. The more useful question is what process they imply for organizations deploying agents.
For sysadmins and IT leaders, the arrival of tools like these should change the procurement conversation. If a vendor claims its agent can safely act across email, files, CRM, tickets, code repositories, or identity systems, the next question should be how its safety assumptions are documented and how its failure modes are regression-tested. A glossy responsible AI statement is not enough.
The same applies internally. Business units will increasingly build local agents using sanctioned platforms, low-code tooling, and model APIs. Some will begin as experiments and quietly become operational dependencies. Without a testing and design-record discipline, the organization will accumulate agents whose permissions, assumptions, and failure cases are poorly understood.
RAMPART points toward one kind of answer: require agent teams to encode known safety risks as tests and run them in CI. Clarity points toward another: require teams to preserve design intent, failure analysis, and decision rationale in a form that can be reviewed over time. Neither replaces identity governance, data-loss prevention, logging, human approval workflows, or runtime policy enforcement. But both make those controls easier to reason about.
The lesson for enterprise IT is that agent safety will not be solved by a single gateway. The stack will need design review, permission scoping, test automation, runtime interception, monitoring, incident response, and post-incident regression coverage. That is familiar territory for security professionals, but the objects being secured are changing.
An agent is not just an app, not just a model, and not just a user. It is a policy-bearing actor assembled from prompts, tools, data sources, memory, orchestration code, and delegated authority. Treating that actor as a conventional chatbot is how organizations will get surprised.

The Open Source Move Helps, but the Hard Part Is Adoption

Open source is the right default for safety infrastructure, especially at this stage of the market. Enterprises need to inspect tools that evaluate risk. Researchers need extension points. Developers need examples that can be adapted to messy local architectures. A closed black-box safety scanner would be a poor fit for a field where the threat models are still evolving.
But open source does not guarantee adoption. The history of security tooling is littered with excellent frameworks that failed because they were too hard to integrate, too noisy to trust, too slow for CI, or too disconnected from how teams ship software. RAMPART’s use of pytest and adapters is a sensible attempt to avoid that fate. Clarity’s use of markdown in the repo is likewise a bet on low-friction persistence.
The real adoption test will come when these tools meet production complexity. Agents are rarely clean reference architectures. They use multiple services, private data sources, enterprise identity, custom retrieval layers, vendor APIs, human approval steps, and business logic that is difficult to simulate. A safety test that works against a demo agent may be much harder to write against a real claims-processing workflow or internal security operations assistant.
There is also the cost problem. Running probabilistic trials against LLM-backed systems consumes time and money. Teams will have to decide which tests run on every commit, which run nightly, which run before major releases, and which are reserved for higher-risk changes. The CI gate is a powerful metaphor, but not every safety evaluation will fit comfortably into a fast build pipeline.
That should not be read as failure. Mature testing strategies are layered. Fast unit tests catch obvious regressions. Integration tests catch system behavior. Fuzzing, load tests, chaos experiments, and penetration tests run on different cadences. Agent safety testing will likely develop the same tiered structure.

The Agent Era Needs Boring Safety Artifacts

The most encouraging part of Microsoft’s announcement is its lack of glamour. RAMPART produces tests. Clarity produces markdown. Neither promises an omniscient AI safety oracle. That restraint is welcome.
The AI industry has spent years selling intelligence as a kind of magic. Agents have intensified that pitch by making models appear not only conversational but capable: they can plan, retrieve, click, write, execute, and coordinate. The corrective is not another layer of mysticism about “alignment” in the abstract. It is mundane engineering discipline applied relentlessly to systems that can cause real-world effects.
Boring artifacts are how complex organizations remember what they are doing. Tests remember bugs. Design records remember tradeoffs. Threat models remember assumptions. Incident reports remember failure. CI systems remember what must not regress. If AI agents are to become part of enterprise infrastructure, they need to enter that world.
This is also where Windows and Microsoft 365 administrators should pay attention. The agentic future will not arrive only as developer SDKs. It will arrive as assistants embedded in productivity suites, endpoint management, security operations, support desks, data platforms, and custom line-of-business applications. The organizations that handle the transition best will be those that demand evidence of safety behavior, not just promises of model capability.
The organizations that handle it worst will treat agents as smarter macros. Macros were powerful because they connected user intent to application action. Agents are more powerful because they can infer intent, retrieve context, and choose actions across systems. That makes governance more important, not less.

The Practical Test for Microsoft’s New Safety Bet

RAMPART and Clarity should be judged less by their launch-day feature lists than by whether they change the behavior of teams building agents. The concrete value will show up in pull requests, incident reviews, and architecture discussions rather than announcement blogs.

Teams should be able to turn a red-team finding or production AI incident into a repeatable test that fails before the fix and passes after the mitigation.
Agent developers should add safety tests when they add new tools, data sources, permissions, or retrieval paths.
Product teams should document why an agent needs a capability before that capability becomes a permanent part of the architecture.
Reviewers should be able to see when a design assumption has gone stale because the agent’s role, permissions, or operating environment changed.
Enterprises should ask vendors and internal teams for testable evidence of agent safety, not just broad responsible-AI assurances.
Statistical test results should be interpreted according to business risk, because a tolerable failure rate for a drafting assistant may be unacceptable for an agent with write access to critical systems.

The broader message is blunt: agent safety will not scale if it remains a specialist ceremony. It has to become part of the same delivery discipline that already governs modern software.
Microsoft’s RAMPART and Clarity announcement is therefore best understood as an early marker in the normalization of agentic AI engineering. The tools may evolve, competitors may offer better frameworks, and enterprises will inevitably adapt the ideas to their own stacks. But the direction is hard to dispute: as agents gain authority to act, the industry will need durable ways to test what they do, remember why they were built that way, and prove that old failures stay fixed when the next model, prompt, connector, or workflow lands.

References

Primary source: Microsoft
Published: Wed, 20 May 2026 16:30:00 GMT

Introducing RAMPART and Clarity: Open source tools to bring safety into Agent development workflow | Microsoft Security Blog

The AI systems shipping inside enterprises today are fundamentally different from the ones we were building even two years ago, because they have moved well past answering questions and into accessing your email, retrieving records from your CRM, writing and executing code, and taking actions on...

www.microsoft.com
Official source: opensource.microsoft.com

Conductor: Deterministic orchestration for multi-agent AI workflows | Microsoft Open Source Blog

Conductor is an open-source CLI (MIT license, Microsoft org) that takes a different approach: you define your multi-agent workflows in YAML, and the routing between agents is deterministic.

opensource.microsoft.com
Official source: learn.microsoft.com

Microsoft AI Red Team | Microsoft Learn

Learn to safeguard your organization's AI with guidance and best practices from the industry leading Microsoft AI Red Team.

learn.microsoft.com
Official source: github.com

GitHub - microsoft/PyRIT: The Python Risk Identification Tool for generative AI (PyRIT) is an open source framework built to empower security professionals and engineers to proactively identify risks in generative AI systems. · GitHub

The Python Risk Identification Tool for generative AI (PyRIT) is an open source framework built to empower security professionals and engineers to proactively identify risks in generative AI systems. - microsoft/PyRIT

github.com
Related coverage: aisecurityandsafety.org

PyRIT: AI Red teaming Tool — 2.5K+ Stars, Free & Open-Source | AI Safety Directory

Microsoft's open-source Python framework for red-teaming and risk identification in generative AI systems. By Microsoft, 2,500+ GitHub stars. Key features:...

aisecurityandsafety.org
Official source: devblogs.microsoft.com

Introducing AI Red Teaming Agent: Accelerate your AI safety and security journey with Azure AI Foundry | Microsoft Foundry Blog

AI Red Teaming Agent, integrated into Azure AI Foundry, enhances the safety and security of generative AI systems by providing automated scans, evaluating probing success, and generating detailed scorecards to guide risk management strategies.

devblogs.microsoft.com

Related coverage: docs.rampart.sh

Rampart - Rampart

Rampart is an open-source security policy engine for AI coding agents. Block dangerous commands, detect prompt injection, and audit every tool call.

docs.rampart.sh
Official source: download.microsoft.com

phi3 safety paper final 1

PDF document

download.microsoft.com

Search

Navigation section

Microsoft RAMPART and Clarity: CI Safety for Agentic AI

Microsoft Moves Safety Left Because Agents Have Moved Right Into Production

RAMPART Turns Red-Team Findings Into Regression Tests

Prompt Injection Is No Longer a Parlor Trick

Probabilistic Software Needs Probabilistic Gates

Clarity Attacks the Failure Before the Vulnerability

Markdown in the Repo Is a Governance Choice

The “AI Thinkers” Pattern Is Useful, but It Is Not Magic

Microsoft Is Building the Rails Around Its Own Agent Bet

Enterprise IT Should Read This as a Process Announcement

The Open Source Move Helps, but the Hard Part Is Adoption

The Agent Era Needs Boring Safety Artifacts

The Practical Test for Microsoft’s New Safety Bet

References

Introducing RAMPART and Clarity: Open source tools to bring safety into Agent development workflow | Microsoft Security Blog

Conductor: Deterministic orchestration for multi-agent AI workflows | Microsoft Open Source Blog

Microsoft AI Red Team | Microsoft Learn

GitHub - microsoft/PyRIT: The Python Risk Identification Tool for generative AI (PyRIT) is an open source framework built to empower security professionals and engineers to proactively identify risks in generative AI systems. · GitHub

PyRIT: AI Red teaming Tool — 2.5K+ Stars, Free & Open-Source | AI Safety Directory

Introducing AI Red Teaming Agent: Accelerate your AI safety and security journey with Azure AI Foundry | Microsoft Foundry Blog

Rampart - Rampart

phi3 safety paper final 1

Similar threads

Navigation section

Microsoft RAMPART and Clarity: CI Safety for Agentic AI

RAMPART Turns Red-Team Findings Into Regression Tests​

Prompt Injection Is No Longer a Parlor Trick​

Probabilistic Software Needs Probabilistic Gates​

Clarity Attacks the Failure Before the Vulnerability​

Markdown in the Repo Is a Governance Choice​

The “AI Thinkers” Pattern Is Useful, but It Is Not Magic​

Microsoft Is Building the Rails Around Its Own Agent Bet​

Enterprise IT Should Read This as a Process Announcement​

The Open Source Move Helps, but the Hard Part Is Adoption​

The Agent Era Needs Boring Safety Artifacts​

The Practical Test for Microsoft’s New Safety Bet​

References​

Similar threads

RAMPART Turns Red-Team Findings Into Regression Tests

Prompt Injection Is No Longer a Parlor Trick

Probabilistic Software Needs Probabilistic Gates

Clarity Attacks the Failure Before the Vulnerability

Markdown in the Repo Is a Governance Choice

The “AI Thinkers” Pattern Is Useful, but It Is Not Magic

Microsoft Is Building the Rails Around Its Own Agent Bet

Enterprise IT Should Read This as a Process Announcement

The Open Source Move Helps, but the Hard Part Is Adoption

The Agent Era Needs Boring Safety Artifacts

The Practical Test for Microsoft’s New Safety Bet

References