Microsoft and Cisco Unveil Open Multi-Agent AI Stack in Azure

Microsoft and Cisco have quietly rewritten a key piece of the agentic AI puzzle by pushing multi-agent observability, identity, and interoperability into Azure’s production fabric — a move that turns theoretical agent teams into auditable, governable cloud services while also exposing new operational and security trade-offs for enterprises. The result is a standards-forward, developer-facing stack centered on the Microsoft Agent Framework, Azure AI Foundry, OpenTelemetry extensions for multi-agent tracing, and broad protocol support for agent-to-tool and agent-to-agent communication such as Model Context Protocol (MCP) and Agent2Agent (A2A). This collaboration — including Cisco’s Outshift incubation group and the Agntcy project donated to the Linux Foundation — creates the plumbing enterprises need to run fleets of autonomous agents, but it also increases the scale, surface area, and responsibility of platform teams managing those agents.

Background

The industry has moved quickly from single-turn chat assistants to multi-step, stateful agent systems that can call APIs, access corporate data, persist state, and take actions on behalf of users. Microsoft’s response pairs an open-source developer runtime, the Microsoft Agent Framework (an SDK plus runtime), with a managed cloud plane, Azure AI Foundry (the agent factory and managed runtime), and contributes observability semantics to OpenTelemetry so that agent behavior can be traced end-to-end. Cisco’s Outshift group contributed multi-agent telemetry thinking and open tooling (including the Agntcy work) that addresses discovery, identity, messaging, and observability for agent topologies. These moves were announced publicly alongside the broader agentification rollouts and in technical posts describing the Agent Framework and Foundry observability.

The announcements are not just marketing: Microsoft has product docs and SDKs showing how tracing and telemetry integrate with existing agent frameworks (Semantic Kernel, LangChain, LangGraph, OpenAI Agents SDK), and Cisco has already moved elements of its Agntcy work to the Linux Foundation to foster wider adoption. Those public signals indicate an intent to standardize the way agents are observed, identified, and discovered across vendors and clouds.

What Microsoft and Cisco Actually Built (Overview)

  • Microsoft released the Microsoft Agent Framework as an open-source SDK and runtime with official SDKs for Python and .NET, local-first developer ergonomics, and a clear upgrade path to Azure AI Foundry for production hosting. The framework emphasizes structured connectors (OpenAPI and MCP), durable threads/state, and orchestration patterns for multi-agent workflows.
  • Azure AI Foundry provides the cloud runtime and observability surface — including a Foundry Agent Service offering stateful, long-running multi-agent workflows, visual authoring, and a unified telemetry sink. Foundry embeds OpenTelemetry-based instrumentation for agent interactions and traces.
  • Microsoft contributed multi-agent semantic conventions for OpenTelemetry in collaboration with Cisco Outshift to create standardized spans and attributes that represent agent reasoning steps, tool calls, and inter-agent messaging. This makes cross-framework trace correlation possible.
  • Cisco’s Outshift incubation team added enterprise-focused multi-agent components such as discovery and secure messaging, and it donated its open-source Agntcy (also styled AGNTCY) project to the Linux Foundation. The project addresses the four pillars Outshift sees as essential: discovery, identity, messaging, and observability.
  • The Model Context Protocol (MCP), introduced by Anthropic and adopted by many vendors, is operationalized in Foundry for tool registration, authentication, and role-based access. Microsoft uses MCP connectors to make enterprise systems (HR, CRM, procurement) available to agents securely and auditably.
Together these pieces aim to let agents find each other, authenticate, delegate tasks, call tools in a structured fashion, and be traced end-to-end for debugging and compliance.

The Technical Anatomy: How the Stack Fits Together

Microsoft Agent Framework: SDK and Runtime

The Microsoft Agent Framework is presented as a developer-first library that unifies prior Microsoft research and engineering patterns (notably AutoGen orchestration patterns and Semantic Kernel integration) into a usable runtime. Key capabilities:
  • Agent primitives: role definitions, instruction sets, model bindings, and tool lists.
  • Tool connectors: OpenAPI-first connectors and MCP-enabled connectors for structured tool calls.
  • Workflows & threads: graph-style orchestration with durable checkpoints, retries, and human-in-the-loop gates for long-running business logic.
  • Local-first development: VS Code tooling, emulators, and a migration path into Azure AI Foundry.
This design emphasizes structured automation over brittle prompt-glue, reducing the amount of bespoke integration code needed when moving from prototype to production.
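
The primitives and workflow features listed above can be pictured with a small, framework-neutral sketch. It deliberately does not use the Microsoft Agent Framework API: `Agent`, `Workflow`, `approve`, and the model label are hypothetical names chosen for illustration, and the in-memory `checkpoint` dictionary stands in for what would be durable storage in a real runtime.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical, framework-neutral types; NOT the Microsoft Agent Framework API.
@dataclass
class Agent:
    name: str
    instructions: str
    model: str                                   # model binding (illustrative label)
    tools: dict[str, Callable[[str], str]] = field(default_factory=dict)

@dataclass
class Workflow:
    steps: list[tuple[Agent, str]]               # (agent, tool name) pairs, run in order
    checkpoint: dict = field(default_factory=lambda: {"next_step": 0, "outputs": []})

    def run(self, payload: str, approve: Callable[[str, str], bool],
            max_retries: int = 2) -> dict:
        """Resume from the checkpoint, retry failed steps, pause at the approval gate."""
        while self.checkpoint["next_step"] < len(self.steps):
            agent, tool_name = self.steps[self.checkpoint["next_step"]]
            if not approve(agent.name, tool_name):        # human-in-the-loop gate
                return self.checkpoint                     # paused; state is preserved
            outputs = self.checkpoint["outputs"]
            data = outputs[-1] if outputs else payload     # feed previous output forward
            for attempt in range(max_retries + 1):
                try:
                    result = agent.tools[tool_name](data)
                    break
                except RuntimeError:
                    if attempt == max_retries:
                        raise
            outputs.append(result)
            self.checkpoint["next_step"] += 1
        return self.checkpoint

researcher = Agent("researcher", "Find facts.", "example-model",
                   {"lookup": lambda q: f"facts about {q}"})
writer = Agent("writer", "Summarize.", "example-model",
               {"summarize": lambda text: text.upper()})
flow = Workflow([(researcher, "lookup"), (writer, "summarize")])

# First run: the lookup is approved, the writer step is held for review.
paused_at = flow.run("invoices", approve=lambda agent, tool: agent == "researcher")["next_step"]
# Second run: a reviewer approves everything, so the flow resumes at the held step.
final = flow.run("invoices", approve=lambda agent, tool: True)
```

The point of the sketch is the shape, not the names: the second `run` call does not repeat the lookup, because the checkpoint already records that step as done.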

Azure AI Foundry: The Agent Factory and Observability Plane

Azure AI Foundry complements the Agent Framework with cloud-grade controls:
  • Managed Agent Service: run and coordinate stateful multi-agent workflows at scale.
  • Tracing & observability: OpenTelemetry instrumentation covering LLM calls, tool invocations, agent steps, and cross-agent traces, exportable to Application Insights or any OTLP backend. Microsoft documents how to enable tracing and how the platform maps agent activity into spans for analysis.
  • Governance & identity: Entra-backed Agent IDs, RBAC, and credential management for tool invocations and agent permissions.
  • Developer integration: first-class tracing support for LangChain, LangGraph, Semantic Kernel, and OpenAI Agents SDK so cross-framework agent fleets can be unified under the same telemetry conventions.
These capabilities let platform teams reconstruct the full decision path of an agent’s work: user query → agent plan → tool call → external service result → final action.
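
To make that decision path concrete, here is a minimal sketch of how nested spans let a team walk back from any step to the originating query. To stay runnable without extra packages it uses a tiny stand-in tracer rather than the OpenTelemetry SDK; `span`, `path_to`, and `fake_crm_lookup` are hypothetical helpers, and the span names simply mirror the arrows in the sentence above.

```python
import contextlib
import itertools

# Minimal stand-in for a tracer so the example runs without the OpenTelemetry SDK.
_ids = itertools.count(1)
_stack: list[int] = []
spans: list[dict] = []

@contextlib.contextmanager
def span(name: str, **attributes):
    """Record a span whose parent is whichever span is currently open."""
    record = {"id": next(_ids), "parent": _stack[-1] if _stack else None,
              "name": name, "attributes": attributes}
    spans.append(record)
    _stack.append(record["id"])
    try:
        yield record
    finally:
        _stack.pop()

def fake_crm_lookup(customer: str) -> dict:
    # Hypothetical external service; a real agent would call an API here.
    return {"customer": customer, "open_tickets": 2}

with span("user_query", text="How many open tickets does Contoso have?"):
    with span("agent_plan", steps=["crm_lookup", "answer"]):
        with span("tool_call", tool="crm_lookup") as call:
            result = fake_crm_lookup("Contoso")
            call["attributes"]["result"] = result       # external service result
        with span("final_action", action="reply"):
            answer = f"Contoso has {result['open_tickets']} open tickets."

def path_to(span_id):
    """Follow parent links upward to rebuild the path that led to a span."""
    by_id = {s["id"]: s for s in spans}
    names = []
    while span_id is not None:
        names.append(by_id[span_id]["name"])
        span_id = by_id[span_id]["parent"]
    return names[::-1]

decision_path = path_to(call["id"])
```

Here `decision_path` lists the query, the plan, and the tool call in order, which is the kind of reconstruction a trace viewer performs over real exported spans.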

OpenTelemetry Extensions: Agent-Aware Semantic Conventions

OpenTelemetry was designed for distributed tracing of microservices. Microsoft and Cisco’s contributions extend OTel’s semantic conventions to include:
  • Agent reasoning spans that capture model inputs/outputs, intermediate chain-of-thought steps, and agent plan decisions.
  • Tool invocation spans that log API calls, parameters, responses, and associated costs.
  • Inter-agent messaging spans for Agent2Agent handoffs, delegation, and result streaming.
These conventions let teams correlate traces across agents, connectors, and backend systems — a prerequisite for debugging and compliance in regulated enterprises. Microsoft’s documentation explicitly shows how to instrument agents with OTel and how to export to standard backends.
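
A short sketch shows why shared attribute names matter for correlation. The attribute keys below (`gen_ai.operation.name`, `gen_ai.agent.name`, `gen_ai.tool.name`, and the token-usage keys) follow my reading of the OpenTelemetry generative AI semantic conventions, which are still evolving, so they should be checked against the current spec. The span dictionaries, the `summarize` helper, and all values are hypothetical, and plain dictionaries stand in for spans exported by a real SDK.

```python
from collections import defaultdict

def agent_span(trace_id, agent, model, input_tokens, output_tokens):
    # Plain-dict stand-in for an exported agent invocation span.
    return {"trace_id": trace_id, "name": f"invoke_agent {agent}",
            "attributes": {"gen_ai.operation.name": "invoke_agent",
                           "gen_ai.agent.name": agent,
                           "gen_ai.request.model": model,
                           "gen_ai.usage.input_tokens": input_tokens,
                           "gen_ai.usage.output_tokens": output_tokens}}

def tool_span(trace_id, tool):
    # Plain-dict stand-in for an exported tool execution span.
    return {"trace_id": trace_id, "name": f"execute_tool {tool}",
            "attributes": {"gen_ai.operation.name": "execute_tool",
                           "gen_ai.tool.name": tool}}

# Spans that could have come from two different agent frameworks.
exported = [
    agent_span("t1", "planner", "example-model", 120, 40),
    tool_span("t1", "crm_lookup"),
    agent_span("t1", "writer", "example-model", 200, 90),
    agent_span("t2", "planner", "example-model", 50, 10),
]

def summarize(spans):
    """Group spans by trace and aggregate using only the shared attribute keys."""
    summary = defaultdict(lambda: {"agents": [], "tools": [], "tokens": 0})
    for s in spans:
        attrs, entry = s["attributes"], summary[s["trace_id"]]
        if attrs["gen_ai.operation.name"] == "invoke_agent":
            entry["agents"].append(attrs["gen_ai.agent.name"])
            entry["tokens"] += (attrs["gen_ai.usage.input_tokens"]
                                + attrs["gen_ai.usage.output_tokens"])
        elif attrs["gen_ai.operation.name"] == "execute_tool":
            entry["tools"].append(attrs["gen_ai.tool.name"])
    return dict(summary)

report = summarize(exported)
```

Because `summarize` relies only on the agreed keys, it does not need to know which framework produced each span.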

Agent Discovery and Identity: Cisco’s Agntcy

Outshift’s Agntcy project (donated to the Linux Foundation) addresses the discovery and identity problem at scale:
  • Discovery: an agent directory and DNS-like discovery fabric so agents and services can find each other without expensive custom glue code.
  • Identity verification: tools to verify an agent’s identity and provenance before handing off sensitive context.
  • Messaging: secure, low-latency interactive messaging optimized for agentic conversations.
  • Observability: patterns and APIs designed to emit consistent telemetry and auditing metadata across agent interactions.
These components aim to reduce the integration friction and operational overhead of coordinating thousands (or tens of thousands) of autonomous agents.
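
The discovery and identity-verification ideas can be illustrated with a toy directory. This is not the Agntcy API: `Directory`, `sign_card`, the card fields, and the `.example` endpoints are all hypothetical, and a shared-secret HMAC stands in for whatever signature or credential scheme a real identity service would use.

```python
import hashlib
import hmac
import json

# Hypothetical directory; NOT the Agntcy API. A shared-secret HMAC stands in
# for a real signature scheme, and the key is hard-coded only for the demo.
ISSUER_KEY = b"demo-issuer-key"

def sign_card(card: dict) -> str:
    """Sign a canonical JSON encoding of the agent card."""
    payload = json.dumps(card, sort_keys=True).encode()
    return hmac.new(ISSUER_KEY, payload, hashlib.sha256).hexdigest()

class Directory:
    def __init__(self):
        self._entries: list[tuple[dict, str]] = []

    def register(self, card: dict, signature: str) -> None:
        self._entries.append((card, signature))

    def discover(self, skill: str) -> list[dict]:
        """Return only cards that advertise the skill AND pass verification."""
        verified = []
        for card, signature in self._entries:
            if skill in card["skills"] and hmac.compare_digest(signature, sign_card(card)):
                verified.append(card)
        return verified

directory = Directory()

invoice_agent = {"name": "invoice-agent", "skills": ["invoicing"],
                 "endpoint": "https://agents.example/invoice"}
directory.register(invoice_agent, sign_card(invoice_agent))

# This card's endpoint was changed after signing, so verification fails.
tampered = {"name": "invoice-agent-2", "skills": ["invoicing"],
            "endpoint": "https://unknown.example/invoice"}
signed_original = sign_card({**tampered, "endpoint": "https://agents.example/invoice2"})
directory.register(tampered, signed_original)

found = [card["name"] for card in directory.discover("invoicing")]
```

Only the untampered card is returned, which is the property that matters before sensitive context is handed to another agent.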

Standards and Protocols: MCP and A2A

Model Context Protocol (MCP)

MCP — introduced by Anthropic — is quickly becoming the de facto standard for tool exposure and structured tool calls. It provides a JSON-RPC-based client/server architecture, transports (stdio, HTTP + SSE), and standardized schemas for tool inputs/outputs. Microsoft has built MCP support into the Agent Framework and Azure AI Foundry to allow enterprise systems to present themselves as MCP endpoints. This means agents can call enterprise tools deterministically with authentication, auditable logs, and structured I/O rather than brittle prompt engineering. The MCP spec and documentation are public and show examples of integrating storage systems, code execution, and SaaS connectors. Microsoft positions MCP as the “interface layer” — the connector contract that lets agents call tools — while Foundry’s OpenTelemetry work focuses on the “behavioral layer” inside and between agents. In practice, MCP standardizes how tools are described and invoked; Foundry’s tracing captures how agents reason, hand off, and act.
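
As a rough illustration of what a structured tool call looks like on the wire, the sketch below handles two MCP-style JSON-RPC 2.0 methods, `tools/list` and `tools/call`, in process. The message shapes follow my reading of the public MCP specification but are simplified: there is no initialization handshake, transport, or authentication. The `get_employee` tool, its handler, and the `handle` and `rpc_error` helpers are hypothetical.

```python
import json

# Hypothetical tool registry; inputSchema is a simplified JSON Schema.
TOOLS = {
    "get_employee": {
        "description": "Look up an employee by id (hypothetical HR tool).",
        "inputSchema": {
            "type": "object",
            "properties": {"employee_id": {"type": "string"}},
            "required": ["employee_id"],
        },
        "handler": lambda args: f"Employee {args['employee_id']}: active",
    }
}

def rpc_error(request_id, code, message):
    return json.dumps({"jsonrpc": "2.0", "id": request_id,
                       "error": {"code": code, "message": message}})

def handle(raw: str) -> str:
    """Dispatch one JSON-RPC request and return the JSON-RPC response."""
    request = json.loads(raw)
    request_id, params = request.get("id"), request.get("params", {})
    if request["method"] == "tools/list":
        result = {"tools": [{"name": name, "description": tool["description"],
                             "inputSchema": tool["inputSchema"]}
                            for name, tool in TOOLS.items()]}
    elif request["method"] == "tools/call":
        tool = TOOLS.get(params.get("name"))
        if tool is None:
            return rpc_error(request_id, -32602, f"Unknown tool: {params.get('name')}")
        args = params.get("arguments", {})
        missing = [key for key in tool["inputSchema"]["required"] if key not in args]
        # Tool-level failures are reported inside the result, flagged by isError.
        text = f"Missing arguments: {missing}" if missing else tool["handler"](args)
        result = {"content": [{"type": "text", "text": text}], "isError": bool(missing)}
    else:
        return rpc_error(request_id, -32601, "Method not found")
    return json.dumps({"jsonrpc": "2.0", "id": request_id, "result": result})

call = {"jsonrpc": "2.0", "id": 1, "method": "tools/call",
        "params": {"name": "get_employee", "arguments": {"employee_id": "E42"}}}
response = json.loads(handle(json.dumps(call)))
```

The agent never composes free-form text for the tool: it sends a named tool plus arguments that can be checked against a schema, and every request and response is a loggable JSON object.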

Agent2Agent (A2A)

A2A is an open protocol for runtime-level agent collaboration — discovery, delegation, streaming results, and lifecycle semantics. When combined with MCP for tool calls, A2A enables multi-vendor agent choreography: agents can delegate tasks to specialists, collect structured results, and be traced end-to-end under the OpenTelemetry conventions.
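
The delegation pattern can be sketched in process as follows. This is loosely modeled on the idea of a task lifecycle with streamed updates. It is not the A2A wire format, and the state names, event fields, `translator_agent`, and `delegate` are simplified assumptions made for illustration.

```python
from typing import Iterator

# Simplified, in-process illustration; names are assumptions, not the A2A wire format.
def translator_agent(text: str) -> Iterator[dict]:
    """A specialist that streams a status update, then a structured result."""
    yield {"state": "working", "progress": "translating"}
    yield {"state": "completed", "artifact": {"language": "fr", "text": f"[fr] {text}"}}

SPECIALISTS = {"translate": translator_agent}

def delegate(skill: str, text: str, trace_id: str) -> dict:
    """Delegate a task and record every lifecycle event under one trace id."""
    events = [{"trace_id": trace_id, "state": "submitted", "skill": skill}]
    specialist = SPECIALISTS.get(skill)
    if specialist is None:
        events.append({"trace_id": trace_id, "state": "failed", "reason": "no such skill"})
        return {"artifact": None, "events": events}
    artifact = None
    for update in specialist(text):                 # consume streamed updates
        events.append({"trace_id": trace_id, **update})
        artifact = update.get("artifact", artifact)
    return {"artifact": artifact, "events": events}

outcome = delegate("translate", "hello", trace_id="trace-001")
states = [event["state"] for event in outcome["events"]]
```

Tagging each event with the same `trace_id` is what would let a tracing backend stitch the delegating agent and the specialist into one end-to-end view.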

Real-World Use: Enterprise Examples

  • KPMG and Clara AI: KPMG used a multi-agent design fabric built on these primitives to orchestrate auditing workflows in its Clara platform. The combination of Agent Framework patterns, Foundry orchestration, and unified tracing allowed KPMG to coordinate agents across audit tasks with minimal context switching and auditable trails. This is a practical example of how observability and governance are mission-critical in regulated professional services.
  • Cisco Webex & Device Edge Agents: Cisco’s agentic efforts in Webex (task agents, notetakers, meeting schedulers) benefit from device-level AI and Outshift’s agent orchestration thinking, including control-plane visibility through Webex Control Hub and AgenticOps. That device-edge/agent-cloud blend is a clear enterprise use case where discovery, identity, and observability are necessary for safe production use.

Strengths: What This Collaboration Gets Right

  • Standards-first interoperability: Microsoft’s support for MCP, A2A, OpenAPI connectors, and the OpenTelemetry extensions dramatically reduces friction for cross-framework agent fleets. This tilts the vendor landscape toward composability rather than lock-in.
  • End-to-end observability: Extending OpenTelemetry to cover agent reasoning, tool calls, and inter-agent messaging gives platform teams the forensic data necessary for debugging, SLA enforcement, and compliance audits. That traceability addresses one of the most serious enterprise adoption barriers.
  • Concrete governance primitives: Entra-backed Agent IDs, RBAC, and integrated policy gates in Foundry provide the administrative controls enterprises require before letting agents act on sensitive systems.
  • Ecosystem momentum: Cisco donating Agntcy to the Linux Foundation and Anthropic’s open MCP specification create an ecosystem effect that benefits interoperability and community scrutiny.

Risks and Trade-offs: What IT Leaders Must Consider

  • Expanded attack surface: Agents with the ability to call tools and perform actions become privileged service accounts. If a connector, MCP server, or agent identity is compromised, attackers gain programmatic access to systems that previously required human credentials. Robust secrets management, least-privilege connectors, short-lived tokens, and continuous monitoring are non-negotiable.
  • Prompt-injection and MCP risks: MCP simplifies tool access, but it can also be an abuse channel if server implementations are lax. Prompt-injection-style attacks or misconfigured MCP endpoints can lead to unintended data disclosure. Enterprises must treat MCP servers like any sensitive API and apply strict authentication, input validation, and policy enforcement. Independent reporting flags MCP-related security concerns that require operational safeguards.
  • Non-determinism and debugging complexity: Agentic models are probabilistic. While tracing helps diagnose behavior, non-determinism in reasoning and model outputs complicates reproducibility and root-cause analysis. Observability makes problems visible but does not remove the need for governance controls such as approval gates and task-adherence checks.
  • Cost and compute: Sustained multi-agent operations can be GPU- and inference-cost intensive. Observability itself adds telemetry costs and storage. Teams must budget for model compute, tracing ingestion, long-term retention of logs/traces, and the cost of human oversight. Cisco and Microsoft both emphasize efficiency, but the economic model of large-scale agent fleets is still an evolving constraint.
  • Legal & compliance complexity: Agents calling cross-border services, third-party LLMs, or external indices raise data residency and contractual concerns. When multiple vendors and clouds are involved, enterprises must map data flows and contractual protections carefully. Vendor claims about “non-training” or retention must be validated per connector.

Practical Recommendations for Enterprise Teams

  • Start narrow: run pilots in suggestion/read-only mode before granting write privileges to agents.
  • Inventory connectors: classify MCP endpoints and OpenAPI connectors by sensitivity; apply least-privilege and require human approval for risky operations.
  • Instrument ubiquitously: enable OpenTelemetry tracing for agents, tool calls, and approval gates, and forward to an OTLP backend with retention and access controls.
  • Harden MCP servers: require Entra-backed credentials or equivalent, implement input validation, and monitor for anomalous request patterns.
  • Test incident playbooks: simulate agent compromise and validate rollback procedures, particularly for agents that can change state in CRMs, billing systems, or identity systems.
  • Monitor cost: set budgets and alerts for model inference spend and telemetry ingestion; operate fail-safe modes that degrade agents to read-only when budgets hit thresholds.
  • Gate production rollout: require legal/compliance sign-off for connectors that touch regulated data, and maintain immutable audit trails of agent actions.
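
Several of these recommendations (read-only pilots, connector classification, human approval, and budget-triggered fail-safe modes) reduce to a single authorization check in front of every tool call. The sketch below is one hypothetical way to express that check. The connector names, sensitivity tiers, and the `authorize` function are illustrative assumptions, not part of Foundry or any product.

```python
from dataclasses import dataclass

# Hypothetical connector inventory; names and sensitivity tiers are illustrative.
SENSITIVITY = {"kb_search": "low", "crm_update": "high", "billing_refund": "high"}
WRITE_CONNECTORS = {"crm_update", "billing_refund"}

@dataclass
class Budget:
    limit_usd: float
    spent_usd: float = 0.0

    @property
    def exhausted(self) -> bool:
        return self.spent_usd >= self.limit_usd

def authorize(connector: str, budget: Budget, pilot_mode: bool,
              human_approved: bool = False) -> str:
    """Return 'allow', 'needs_approval', or 'deny' for a requested tool call."""
    if connector not in SENSITIVITY:
        return "deny"                      # unclassified connectors are blocked
    is_write = connector in WRITE_CONNECTORS
    if is_write and (pilot_mode or budget.exhausted):
        return "deny"                      # read-only pilot, or fail-safe degrade
    if SENSITIVITY[connector] == "high" and not human_approved:
        return "needs_approval"            # risky operations wait for a human
    return "allow"

pilot_read = authorize("kb_search", Budget(100.0), pilot_mode=True)
pilot_write = authorize("crm_update", Budget(100.0), pilot_mode=True)
prod_write = authorize("crm_update", Budget(100.0), pilot_mode=False)
over_budget = authorize("crm_update", Budget(100.0, spent_usd=100.0),
                        pilot_mode=False, human_approved=True)
```

Denying by default for unclassified connectors keeps the inventory honest: a connector cannot be used until someone has assigned it a sensitivity tier.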

How to Verify the Claims (Quick Checklist for Architects)

  • Confirm Foundry tracing works with your chosen agent frameworks (LangChain, LangGraph, OpenAI Agents SDK) by running a simple agent that calls an OpenAPI endpoint and verifying the trace tree. Microsoft docs provide step-by-step tracer setup and examples.
  • Validate MCP endpoints by running a minimal MCP server and ensuring that authorization, schema enforcement, and transport (HTTP + SSE) behave as expected. Anthropic’s MCP docs include server examples and SDKs.
  • Check agent identity flows: provision a test Agent ID in Entra, assign RBAC, and verify that Foundry logs show the same identity in the trace events. Microsoft’s enterprise guidance shows Entra integration patterns.
  • Confirm discovery and messaging semantics if you plan to use Agntcy capabilities; verify project maturity and Linux Foundation integration before depending on it for production-critical discovery.
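
The trace-tree and identity checks in this list can be automated once spans are exported. The sketch below assumes the spans have already been pulled from a backend into plain dictionaries. The field names `span_id`, `parent_id`, and `agent_id`, along with the sample data, are hypothetical and would need to be mapped to whatever the chosen backend actually returns.

```python
def check_trace(spans: list[dict], expected_agent_id: str) -> list[str]:
    """Return a list of problems; an empty list means the trace looks sound."""
    problems = []
    ids = {s["span_id"] for s in spans}
    roots = [s for s in spans if s["parent_id"] is None]
    if len(roots) != 1:
        problems.append(f"expected 1 root span, found {len(roots)}")
    for s in spans:
        if s["parent_id"] is not None and s["parent_id"] not in ids:
            problems.append(f"{s['name']}: parent {s['parent_id']} missing from export")
        if s.get("agent_id") != expected_agent_id:
            problems.append(f"{s['name']}: unexpected agent_id {s.get('agent_id')!r}")
    return problems

# Hypothetical export from a pilot run: one agent calling one OpenAPI endpoint.
good_trace = [
    {"span_id": "a", "parent_id": None, "name": "agent_run", "agent_id": "agent-123"},
    {"span_id": "b", "parent_id": "a", "name": "llm_call", "agent_id": "agent-123"},
    {"span_id": "c", "parent_id": "a", "name": "openapi_call", "agent_id": "agent-123"},
]
# Same run, but the tool span lost its parent link and carries another identity.
bad_trace = [
    good_trace[0],
    good_trace[1],
    {"span_id": "c", "parent_id": "zz", "name": "openapi_call", "agent_id": "agent-999"},
]

good_problems = check_trace(good_trace, "agent-123")
bad_problems = check_trace(bad_trace, "agent-123")
```

Running a check like this in a pilot pipeline turns "verify the trace tree" from a manual inspection into a repeatable gate.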

Longer-Term Outlook: Where This Is Likely to Go

  • Expect more standardization: MCP, A2A, and agent-aware OTel semantics have momentum. Broader adoption by major model providers, cloud vendors, and collaboration platforms will further reduce bespoke integration work.
  • Tooling will mature: richer observability UIs, cost-aware orchestration primitives, and declarative policy languages for agents will appear to manage fleets at scale.
  • Regulatory scrutiny will increase: auditability and provenance will be central to compliance frameworks covering agent-driven actions, especially in finance, healthcare, and government use cases.
  • Economic models will evolve: marketplaces for agent skills, per-action pricing, and bundled observability will create new operational trade-offs for platform teams.

Conclusion

The Microsoft–Cisco axis has turned agentic AI from exploratory demos into a practical enterprise stack by combining an open-source developer runtime with a managed cloud plane, agent-aware OpenTelemetry semantics, and standards-level protocol support like MCP and A2A. This makes it feasible to build auditable, discoverable, and interoperable multi-agent systems that can integrate with enterprise data and services at scale. However, the same advances that make agents powerful also make them riskier: expanded attack surfaces, new classes of integration vulnerabilities, cost and governance challenges, and the perennial non-determinism of model reasoning.
For IT and platform leaders, the right posture is pragmatic: adopt the standards and observability afforded by this new stack, but proceed with strict governance, phased rollouts, careful access control, and robust incident playbooks. When deployed with those safeguards, agent fleets can unlock meaningful automation and productivity gains — but without them, the operational and compliance costs will rapidly overshadow any benefits.
Key takeaways:
  • Microsoft Agent Framework + Azure AI Foundry provide the developer-to-cloud path for agentic apps.
  • OpenTelemetry extensions (joint work with Cisco Outshift) create the tracing semantics necessary for multi-agent observability.
  • MCP standardizes tool access and is being operationalized in Azure for enterprise tool discovery and secure calls.
  • Agntcy aims to solve discovery, identity, messaging, and observability at scale and is now part of the Linux Foundation ecosystem.
Enterprises that intentionally treat agents as first-class systems — with identity, audit, policy, and observability built in — will be best positioned to reap the productivity upside while containing the attendant risks.

Source: SDxCentral, “How Microsoft and Cisco agentified Azure”
 
