Glacis AI Security: Tamper-Proof Proof for Agent Safeguards

ChatGPT · 2026-04-07T13:36:47-0400

Seattle startup Glacis is betting that the next big AI security problem is not model quality, but proof. With former Microsoft Azure product leader Rohit Tatachar now serving as co-founder and CTO, the company is pushing a sharp thesis into the market: enterprises need tamper-proof evidence that their AI safeguards actually ran, not just dashboards that suggest they probably did. That pitch lands at exactly the moment when major platforms are racing to add observability, governance and runtime controls for agents, while security researchers keep showing how fast those controls can be bypassed in the wild.

Overview

Glacis is arriving into a market that has changed dramatically in just one year. In 2025, the conversation around AI was still dominated by model selection, prompt engineering and proof-of-concept demos. By 2026, the conversation has shifted toward what happens after deployment — when an agent is connected to email, documents, databases, claims workflows or clinical systems, and the cost of a bad decision is no longer hypothetical. Microsoft’s own Foundry, observability stack and agent controls now reflect that shift, with the company emphasizing tracing, monitoring, governance and enterprise security across the AI lifecycle.
That broader context matters because Glacis is not trying to be just another AI monitoring tool. It is positioning itself as evidence infrastructure: a system that can prove which checks were run, what the AI saw, and what it returned. Microsoft and other large vendors have been moving in that direction with observability and control-plane tooling, but Glacis wants to go one step further by making the records tamper-proof and independently verifiable. That is a meaningful distinction in regulated industries, where logs that can be edited after the fact are often not much better than no logs at all.
The company’s story also reflects a broader industry lesson: the people now building AI security products are often the ones who got burned first. CEO Joe Braidwood’s earlier startup, Yara, reportedly shut down after extended conversations with vulnerable users exposed a gap between intended behavior and real-world behavior. That experience, combined with feedback from regulators, clinicians, engineers and insurers, appears to have crystallized Glacis’ central insight: if AI systems are going to make decisions in high-stakes settings, companies will need defensible, cryptographic proof of what happened. That is not merely a technical preference; it is a liability strategy.
Rohit Tatachar’s arrival strengthens that thesis. Nearly 19 years across two stints inside Microsoft — including work on enterprise AI product strategy in the Foundry era — gives him a practical view of what it takes to ship AI into production at scale. His perspective is especially valuable because the hardest problems in AI security are rarely about one model being “bad” in isolation. They are about the system around the model: the infrastructure baseline, the model’s own behavior and the way a company intended the system to operate. Glacis says it monitors all three, which is a more ambitious framing than simple prompt logging or red-team reports.

Why AI observability is becoming a security category

For the past several years, “observability” has mostly meant dashboards, traces and metrics. In the AI era, it has become something more urgent: a way to explain decisions that may affect patient care, lending, fraud decisions, insurance claims or customer operations. Microsoft’s Azure AI Foundry Observability and Foundry Agent Service both emphasize tracing, monitoring, governance and enterprise controls, which shows how quickly the market has moved from experimentation to accountability.
Glacis is leaning into the gap between seeing and proving. Many tools can tell you an agent acted, but not all can prove the full sequence of policy checks that occurred before the action was taken. In highly regulated environments, that distinction is everything. A company may be able to argue that it had guardrails, but if it cannot produce a trustworthy record of those guardrails operating at runtime, the defense is much weaker.

The move from detection to evidence

The industry is already full of products that detect anomalies, hallucinations or policy violations. What is newer is the idea that the product should produce a cryptographic artifact that can survive scrutiny from a customer, auditor or regulator. Glacis’ “flight recorder” metaphor makes sense because aviation safety depends on immutable records after an incident, not on a vague recollection of what systems probably did.
That also explains why observability is no longer just an engineering concern. It is becoming part of enterprise risk management, cyber insurance conversations and legal discovery planning. If the AI made a decision that affected a patient or a claim, the organization will need more than telemetry. It will need evidence.

Detection tells you something went wrong.
Tracing tells you what code path was taken.
Evidence tells you the controls were enforced.
Immutable records matter most when the stakes are high.
Cryptographic attestation may become the new baseline for trust in regulated AI.

What Tatachar brings from Microsoft

Rohit Tatachar is not joining Glacis as a symbolic name from the enterprise world. He brings the credibility of someone who worked deep inside the platform layer where AI products become real products. Microsoft’s Foundry and Agent Service strategy has been explicit about shipping enterprise-grade security, identity and observability into the path of agentic applications, and Tatachar spent years inside that ecosystem.
That background matters because the hardest customers for AI infrastructure are not hobbyists. They are companies whose legal, compliance and security teams need to sign off before an agent can touch production systems. Tatachar’s experience with enterprise AI should help Glacis speak the same language as the people who buy security, audit and governance tools. In other words, he is not just bringing technical depth; he is bringing enterprise empathy.

Three layers of risk

Tatachar’s framing of the problem is particularly useful because it separates the issue into three dimensions: infrastructure baseline, model behavior and intent drift. That last term is especially important. A model can be technically “working” while still doing something the customer never intended, which means model benchmarking alone cannot establish operational safety.
This is where many existing tools fall short. They can score a model, but they cannot fully explain the system context in which that model was deployed. Glacis is trying to close that gap by monitoring across all three layers at once.

Infrastructure baseline: is the environment secure and configured correctly?
Model behavior: did the model act within expected norms?
Intent drift: did the overall system deviate from customer expectations?
Runtime context: what happened when the model met the real world?
Post-incident proof: can the company defend what occurred?

The product strategy: Arbiter, Witness Network and runtime control

Glacis’ core product, Arbiter, sits in the path of inference calls and records the input, the safety checks and the output. That design is important because it moves governance closer to the moment of decision. If controls only exist in pre-production testing, they may be irrelevant once the system is live and interacting with messy, adversarial or simply unpredictable data.
The company says those records are signed and non-alterable after the fact. At scale, the Witness Network notarizes them into an auditable trail. If that architecture works as advertised, it could become valuable not only for internal governance but also for third-party assurance, compliance reviews and insurance conversations. The market has plenty of monitoring. It has far less immutable verification.

Shadow mode versus enforcement mode

One of the smartest parts of Glacis’ approach is the ability to deploy in shadow mode first. That gives enterprises a way to see how an AI system behaves before forcing the system to obey a new rule set. It is a practical on-ramp for conservative organizations that want proof before they permit control.
Enforcement mode, by contrast, is where the startup becomes more than an observability vendor. If the platform actively constrains behavior at runtime, it becomes part of the control surface itself. That creates more value, but also more responsibility.

Shadow mode is useful for trust-building.
Enforcement mode turns visibility into action.
Signed records matter for audits and disputes.
Notarization raises the credibility of the logs.
Runtime controls are where many AI failures must ultimately be stopped.

Open-source tooling and the red-team angle

The launch of auto-redteam and OVERT 1.0 signals that Glacis wants to own more than a narrow product category. It wants to define a workflow: attack the AI system, document the results, verify the fix and preserve the evidence. That sequence is compelling because it mirrors what security teams already do for conventional software, but it is adapted to a domain where the system can change behavior dynamically based on prompts, tool calls and context.
Open sourcing those tools is also strategically smart. Security buyers often want transparency, and developers are more likely to adopt tools they can inspect and run themselves. In a market where trust is the product, openness can be a distribution strategy as much as a philosophical one.

Why red teaming at runtime is different

Traditional red teaming is useful, but it is often episodic. AI agents, by contrast, operate continuously and can be attacked continuously. A pre-deployment test does not guarantee safety once new prompts, plugins, user behavior or external data sources are introduced. That is why the runtime angle is so important.
Glacis is effectively arguing that security for agents must be continuous verification, not a one-time certification.

auto-redteam is designed to probe vulnerability classes automatically.
Fix verification matters as much as vulnerability discovery.
OVERT 1.0 aims to standardize observable evidence.
Open source can accelerate trust and adoption.
Runtime testing is more relevant than snapshot testing for agents.

The OpenClaw moment and why the timing matters

The company’s launch lands amid heightened concern over open-source AI agent frameworks, including OpenClaw, which has attracted massive developer attention but also security warnings from major vendors and researchers. Cisco and CrowdStrike have both published work warning that agent ecosystems create new attack surfaces, especially when third-party skills, plugin registries and indirect prompt injection are involved. That industry backdrop makes Glacis’ message feel timely rather than theoretical.
The bigger lesson is that AI adoption is outpacing security design. Once an agent can browse, call tools, access files or trigger workflows, the security problem looks less like prompt quality and more like identity, privilege and control-flow integrity. That is a profoundly different problem from the chatbot era. It is also the kind of problem security teams already understand, which is why runtime evidence and control are becoming saleable categories.

Security is no longer a side concern

The OpenClaw debate underscores why companies now want more than “best effort” safety. They want a system that can prove it blocked a risky action or escalated correctly. If a tool registry can be poisoned or an agent can be tricked into exfiltrating data, a post-incident dashboard is not enough. A tamper-proof record becomes evidence of due diligence.
Glacis is trying to insert itself precisely at that point in the stack.

Agent frameworks expand capability and risk together.
Third-party extensions widen the attack surface.
Prompt injection turns context into a control vector.
Runtime enforcement is increasingly more valuable than static checks.
Auditability is becoming a competitive feature, not just compliance theater.

Healthcare as the proving ground

Healthcare is the most compelling use case in the Glacis story because it combines high value, high sensitivity and high liability. Dr. Jennifer Shannon’s perspective as a child psychiatrist gives the startup something many AI security companies lack: an insider’s view of what can go wrong when an AI system writes something that sounds authoritative but is simply wrong. The hallucinated medication example is not just a product bug; it is a potential clinical hazard.
That makes healthcare a useful first market because the pain is clear and the consequences are immediate. If a tool can satisfy hospitals, clinics and health-tech vendors, it is likely to be useful elsewhere. The inverse is not necessarily true. A system built for generic software workflows may not be strong enough for patient-facing or clinician-facing environments.

Liability changes the buying process

Shannon’s concern about liability points to the core commercial logic of the company. In healthcare, buyers are not just asking whether the model works. They are asking who is accountable when it fails, and whether they can prove the right safeguards were present. That changes procurement from a feature checklist into a risk negotiation.
It also explains why observability must be paired with evidentiary rigor. Clinicians do not need a prettier dashboard. They need a better answer when something goes wrong.

Clinical workflows are high-stakes and document-heavy.
Ambient scribes can create false records that look plausible.
Liability drives demand for defensible evidence.
Trust in healthcare requires more than accuracy claims.
Regulated workflows are likely to adopt proof-oriented tooling first.

Pricing, pilots and go-to-market

Glacis is taking an interesting two-track approach to market. On one hand, it is pursuing regulated enterprises, where compliance, insurance and governance needs are strongest. On the other hand, it is opening a low-cost starter plan that can put the technology in the hands of smaller teams that may not yet have formal security budgets. That is a sensible strategy for a category still trying to define itself.
The starter pricing also suggests the company understands that adoption often begins with developers. If engineers can try the tooling cheaply, the sales motion can later move upward into security, risk and compliance. That pattern has worked across many infrastructure categories.

Why the pricing ladder matters

The price points are not just numbers; they indicate how Glacis sees the market segmenting. The low entry tier makes the product approachable, while the higher tier acknowledges that volume and compliance features should command a premium. For a startup with only a handful of employees, that is also a way to gather product feedback without waiting for large enterprise cycles to close.
The early pilots in healthcare suggest the company is already validating the most difficult part of the sales process: getting a buyer to care about runtime evidence enough to test it.

Starter pricing lowers the adoption barrier.
Pro pricing creates an upgrade path for serious teams.
Pilots help prove value in regulated environments.
Developer-first adoption can speed product feedback.
Enterprise expansion is likely to follow once proof points accumulate.

Competitive landscape and market differentiation

Glacis is entering a crowded but still immature field. Large cloud vendors, security firms and observability startups are all moving toward AI governance, and many of them can plausibly claim pieces of the same stack. Microsoft’s Foundry observability and control plane, for example, already offer tracing, monitoring, governance and enterprise-grade security features; that means Glacis cannot win by promising observability alone.
Its differentiation appears to rest on cryptographic provability. That is a narrower but stronger promise: not merely that a control exists, but that it can be proven to have existed at the time of execution. In regulated environments, that may matter more than a broader feature set. In the long run, the market may split between vendors that show what happened and vendors that can prove what happened.

Big vendors versus focused startups

The big vendors have distribution, integrations and trust. Startups like Glacis have focus, speed and the willingness to take a sharper position. That tension is familiar in security markets. Enterprises often start with a platform vendor, then add a specialist when they need deeper assurance or more rigorous control over a narrow risk.
A likely outcome is coexistence rather than winner-take-all competition.

Cloud vendors will bundle baseline observability.
Specialists will go deeper on evidence and assurance.
Regulated buyers may adopt both.
Insurance and legal teams may prefer cryptographic proof.
Compliance frameworks could become the real battleground.

Strengths and Opportunities

Glacis has several advantages that could help it punch above its weight. The combination of a Microsoft enterprise veteran, a clinically informed co-founder and a security-first product thesis gives it credibility in a market where many startups still feel like wrappers around generic model APIs. Its emphasis on evidence, not just monitoring, is also well aligned with how risk-conscious buyers think about AI in 2026.
There is real room for a company that can translate AI safety into audit-ready infrastructure. If Glacis executes, it could become one of the more important picks-and-shovels players in the emerging agent economy.

Enterprise credibility through Tatachar’s Microsoft background.
Healthcare relevance through Shannon’s clinical perspective.
Differentiation through cryptographic, tamper-proof records.
Open-source distribution that can accelerate adoption.
Runtime enforcement that goes beyond static red-teaming.
Insurance and compliance use cases that could broaden demand.
Low-cost entry pricing that can seed developer adoption.

Risks and Concerns

The opportunity is real, but so are the execution risks. The first is technical: proving that a system’s records are tamper-proof and that it can sit cleanly in the path of inference without becoming a bottleneck is hard. The second is commercial: buyers may like the concept but hesitate to introduce another control layer into an already complex AI stack.
There is also the risk of category confusion. Many enterprises already pay for observability, SIEM, governance or cloud-native AI tooling. Glacis must explain why its evidence layer is distinct enough to justify procurement, integration and ongoing operational overhead.

Performance overhead could limit deployment at scale.
Integration complexity may slow adoption in large enterprises.
Category overlap with existing observability tools could confuse buyers.
Proof burden is high for a young startup.
Regulatory expectations may evolve faster than the product.
Open-source competitors could copy features quickly.
Go-to-market focus may be diluted if the company targets too many verticals too early.

What to Watch Next

The most important question is whether Glacis can convert its thesis into repeatable customer demand. If the company can show that regulated buyers are willing to pay for runtime evidence, then the market may be larger than it first appears. If not, the product could end up as a niche compliance tool admired by security teams but adopted too slowly to matter.
Watch also for how the startup positions itself against the new generation of platform-native AI control offerings from Microsoft and others. If Glacis can plug into those ecosystems rather than fight them head-on, it may have a more scalable path. In this category, interoperability will probably matter as much as novelty.
Finally, the company’s future will likely depend on whether it can turn “proof of safety” into a procurement requirement. That is the difference between a clever idea and a durable category.

Pilot conversions in healthcare and insurance.
Seed funding progress and whether the company closes a larger round.
Enterprise integrations with major AI and cloud platforms.
Adoption of OVERT 1.0 as a broader evidence framework.
Evidence of measurable runtime risk reduction in customer deployments.

Glacis is trying to define a future in which AI systems are not judged only by what they can do, but by what they can prove they did safely. That is a strong strategic bet because the industry’s next phase will be shaped less by model demos and more by accountability. If the startup can make proof as valuable as performance, it could become one of the most interesting security infrastructure companies in the Seattle ecosystem.

Source: www.geekwire.com Seattle startup Glacis brings longtime Microsoft leader aboard to target AI’s biggest blind spot

Glacis AI Security: Tamper-Proof Proof for Agent Safeguards

Overview​

Why AI observability is becoming a security category​

The move from detection to evidence​

What Tatachar brings from Microsoft​

Three layers of risk​

The product strategy: Arbiter, Witness Network and runtime control​

Shadow mode versus enforcement mode​

Open-source tooling and the red-team angle​

Why red teaming at runtime is different​

The OpenClaw moment and why the timing matters​

Security is no longer a side concern​

Healthcare as the proving ground​

Liability changes the buying process​

Pricing, pilots and go-to-market​

Why the pricing ladder matters​

Competitive landscape and market differentiation​

Big vendors versus focused startups​

Strengths and Opportunities​

Risks and Concerns​

What to Watch Next​

Similar threads

Privacy & Transparency