New Relic Azure Agentic AI Integrations for Observability and Faster MTTR

ChatGPT · Nov 20, 2025

New Relic's new agentic AI integrations with Microsoft Azure aim to fold intelligent observability directly into the workflows where developers, DevOps teams, and SREs already spend their time. Built on the company's new Model Context Protocol (MCP) Server and a set of Azure integrations, they promise faster mean time to resolution (MTTR), less context-switching, and the automation of routine troubleshooting tasks. The initiative stitches New Relic telemetry into the Azure SRE Agent and Microsoft Foundry, adds a tailored Azure Autodiscovery capability, and puts New Relic's SAP monitoring offering on the Microsoft Marketplace. It is a coordinated push to make agentic AI practical for production engineering teams, and it surfaces a new set of operational and security trade-offs that engineering leaders must actively manage.

Background

The observability market has shifted from dashboards and alerts to actionable intelligence: correlating traces, metrics, logs, and configuration changes, then using AI to prioritize what matters and suggest or orchestrate responses. New Relic’s announcement arrives in a broader moment of rapid enterprise AI spending and cloud acceleration. Analyst forecasts place global AI spending north of $2 trillion in 2026, driving hyperscaler investment in infrastructure and enterprise AI services. At the same time, Microsoft Azure — one of the primary cloud platforms for enterprise AI — reported materially elevated growth (33% year‑over‑year for the quarter ended March 31, 2025), underscoring why vendor integrations targeted at Azure customers carry both product and commercial momentum. This context helps explain why vendor ecosystems are racing to deliver “agentic observability” — the idea that AI agents (assistant bots, runbook automators, and policy agents) should not operate in isolation but instead have structured access to the telemetry that reflects true system state.

Overview of the announcement

What New Relic delivered

Model Context Protocol (MCP) Server — public preview: New Relic published its MCP Server to public preview in early November 2025. The MCP Server implements a standardized bridge that converts natural‑language or MCP‑formatted requests from AI agents into observability queries (NRQL) and returns structured, agent-ready diagnostic payloads. This server is explicitly positioned as a single integration point so multiple agent frameworks can reuse the same access layer to New Relic data.
Azure SRE Agent integration: When a New Relic alert fires or a deployment occurs, the Azure SRE Agent can call the New Relic MCP Server to retrieve a contextual diagnostic bundle. The intention is to enable the agent to perform automated incident detection, root cause analysis, and remediation actions based on New Relic telemetry (services, browser, mobile, traces, and metrics).
Microsoft Foundry telemetry ingestion: New Relic Monitoring for Microsoft Foundry ingests logs and metrics from Azure, delivering an observability view into AI apps and agents built across GitHub, Visual Studio, Copilot Studio, and Microsoft Fabric. The goal is to ensure developers and platform teams have performance telemetry surfaced inside the Foundry environments where they compose agentic applications.
New Relic Azure Autodiscovery: Aimed at SREs and platform engineers, this feature claims to automatically discover unmonitored Azure resources, map dependencies, and overlay configuration changes on performance graphs to accelerate root cause analysis.
New Relic Monitoring for SAP Solutions on Microsoft Marketplace: New Relic says its SAP monitoring solution is now available in the Microsoft Marketplace and delivers predictive insights without requiring SAP‑side agent deployment — a significant operational win if it performs as advertised.

New Relic framed this set of capabilities as an effort to “meet customers where they work” — feeding observability facts into agents and development surfaces to reduce manual context switches and accelerate remediation cycles.

Technical deep dive: how the pieces fit together

The Model Context Protocol (MCP) Server — mechanics and role

At its core, the MCP Server acts as a translator and security choke point:

It accepts MCP‑compliant requests from AI agents or development tools and converts those into New Relic Query Language (NRQL) or platform API calls.
It returns structured, agent‑ready payloads that may include time‑bounded traces, span excerpts, dependency graphs, ranked probable causes, and actionable runbook steps.
It exposes a tooling surface to agents so the agent can request a particular diagnostic view (e.g., “Give me top database errors in the last 10 minutes across service X”); the MCP server handles query formation, execution, and response normalization.
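
To make the translation step concrete, here is a minimal sketch of how a server-side handler might turn an MCP-style tool call into an NRQL query. MCP messages are JSON-RPC 2.0, and tool invocations use the `tools/call` method. Everything else here is an assumption for illustration: the tool name `top_errors_for_service`, its `service` and `minutes` arguments, and the choice of the `TransactionError` event type are hypothetical and are not taken from New Relic's documentation.

```python
import json
import re

# Hypothetical tool name; the tools a real MCP server exposes may differ.
TOOL_NAME = "top_errors_for_service"
_SAFE_NAME = re.compile(r"[\w .:-]+")


def translate_tool_call(request_json: str) -> str:
    """Turn an MCP-style `tools/call` request into an NRQL query string."""
    request = json.loads(request_json)
    if request.get("method") != "tools/call":
        raise ValueError("only tools/call requests are handled in this sketch")
    params = request["params"]
    if params["name"] != TOOL_NAME:
        raise ValueError(f"unknown tool: {params['name']}")
    args = params["arguments"]
    service = str(args["service"])
    minutes = int(args["minutes"])
    # Allowlist the service name so it cannot break out of the quoted literal.
    if not _SAFE_NAME.fullmatch(service):
        raise ValueError("service name contains unsupported characters")
    return (
        "SELECT count(*) FROM TransactionError "
        f"WHERE appName = '{service}' "
        f"FACET error.message SINCE {minutes} minutes ago LIMIT 10"
    )
```

In this sketch the agent never writes NRQL itself. It names a tool and passes structured arguments, which keeps query formation and input validation in one place.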

The MCP Server is now in public preview (enabled via New Relic's Previews & Trials UI), and New Relic documents supported clients, auth methods, and compliance restrictions for the preview program. Notably, the documentation lists a compliance restriction: FedRAMP and HIPAA accounts should not use the public preview MCP Server until New Relic declares it production-ready. This is a crucial operational caveat for regulated enterprises.

Azure SRE Agent interaction model

The integration model described by New Relic is event‑driven:

An event occurs (New Relic alert, deployment, or other telemetry trigger).
The Azure SRE Agent queries the New Relic MCP Server for a structured diagnostic bundle.
The agent presents the insights to the operator or, subject to governance rules, takes automated actions (runbook steps, scripted remediations, or escalation).
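
The three steps above can be sketched as a small handler. This is an illustration under stated assumptions, not the Azure SRE Agent's actual API: `DiagnosticBundle`, `Outcome`, `handle_event`, and the two injected callables are hypothetical names, and the bundle fields simply mirror the payload contents the article describes.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class DiagnosticBundle:
    probable_causes: List[str]  # ranked, most likely first
    runbook_steps: List[str]


@dataclass
class Outcome:
    mode: str  # "advisory" or "automated"
    items: List[str]


def handle_event(
    event_type: str,
    fetch_bundle: Callable[[str], DiagnosticBundle],
    automation_allowed: Callable[[str, DiagnosticBundle], bool],
) -> Outcome:
    """Step 1: an event arrives. Step 2: fetch context. Step 3: advise or act."""
    bundle = fetch_bundle(event_type)
    if automation_allowed(event_type, bundle):
        # Governance permits automation: hand back the runbook steps to execute.
        return Outcome("automated", bundle.runbook_steps)
    # Otherwise only present the ranked causes to the operator.
    return Outcome("advisory", bundle.probable_causes)
```

Passing the governance check in as a callable keeps the policy separate from the flow, so a pilot can start with a function that always returns `False` (advisory only).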

Operationally, that means the SRE Agent requires:

Network egress from Azure environments to New Relic MCP endpoints.
Authentication and authorization mappings (Azure identity vs New Relic API scoping).
Latency and SLA considerations for the agent to be useful in time‑sensitive incident windows.

New Relic's documentation and press materials emphasize the agent-ready structure of returned payloads (ranked probable causes, dependency overlays, etc.), which are designed to reduce human triage time. However, the joint operational contract (data retention, query rate limits, payload schemas, and latency SLAs) is not fully captured in a single public white paper and will require customer validation during procurement and onboarding.

Microsoft Foundry: telemetry inside the developer surface

Microsoft Foundry is Microsoft’s developer play for composing, testing, and deploying agentic applications across GitHub, Visual Studio, Copilot Studio, and Fabric. New Relic’s approach is dual:

Ingest logs and metrics from Azure into New Relic.
Surface parsed telemetry and observability insights back into Foundry tool windows and agent contexts.

For teams building production agents, this is intended to shorten the feedback loop between code/agent development and production telemetry. Actual UX and the degree to which Foundry surfaces actionable remediation suggestions will vary by tool and integration depth.

Azure Autodiscovery and dependency mapping

Azure Autodiscovery claims to:

Detect previously unmonitored Azure resources.
Build dependency maps and overlay configuration changes on time series and trace graphs.
Correlate infra events, config diffs, and telemetry to speed root cause analysis.
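
As a rough illustration of the change-correlation idea, the sketch below lists configuration changes that landed shortly before an anomaly, newest first. It is not how Azure Autodiscovery is implemented. The `ConfigChange` record, the `changes_before` helper, and the 30-minute lookback are all assumptions chosen for the example.

```python
from datetime import datetime, timedelta
from typing import List, NamedTuple


class ConfigChange(NamedTuple):
    resource: str
    at: datetime
    summary: str


def changes_before(
    anomaly_at: datetime,
    changes: List[ConfigChange],
    lookback: timedelta = timedelta(minutes=30),
) -> List[ConfigChange]:
    """Return changes inside the lookback window, most recent first."""
    window_start = anomaly_at - lookback
    candidates = [c for c in changes if window_start <= c.at <= anomaly_at]
    # Changes closest to the anomaly are usually the first ones worth checking.
    return sorted(candidates, key=lambda c: c.at, reverse=True)
```

Time proximity is only a heuristic. A real correlation engine would also need the dependency map to decide whether a changed resource can plausibly affect the failing one.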

If implemented as described, this reduces common blind spots teams face in large Azure estates. The heavy lifting lies in accurate topology inference, change attribution, and event correlation — functions that are notoriously sensitive to tagging discipline, naming conventions, and telemetry completeness.

Business value: where teams are likely to gain

Reduced MTTR: Packing prioritized diagnostics and probable causes into agent responses shortens the initial triage window. New Relic and Microsoft both cite reduced mean time to resolution as a key outcome. Early adopters should expect wins for typical incident classes like code regressions, misconfigurations, and resource saturation events.
Less context‑switching: Developers and SREs can stay in Foundry or the SRE Agent UI while receiving observability insights — cutting the time lost navigating consoles, dashboards, and tickets.
Automation of low‑risk remediations: With governance, simple remediation tasks (scale up, restart service, rotate key) can be automated, freeing engineers for higher‑value work.
Faster AI app iteration: For teams building agents and models inside Foundry, richer telemetry inside development surfaces speeds issue diagnosis, performance tuning, and instrumentation improvements.
Procurement and platform parity: Putting New Relic’s SAP monitoring on Microsoft Marketplace simplifies acquisition for Azure customers and helps ensure standardized procurement and billing paths.

Risks, limitations, and implementation caveats

The technical promise is strong, but the reality of agentic automation in production requires sober controls. The following risks and limitations merit explicit attention.

1) Data egress, privacy and compliance

Agent‑to‑MCP flows require telemetry to move from Azure and the customer environment to New Relic MCP endpoints. That raises data sovereignty, egress, and compliance questions — especially for regulated data (HIPAA, FedRAMP, or industry‑specific controls). New Relic’s MCP preview specifically calls out restricted use with FedRAMP/HIPAA accounts, and customers should validate compliance readiness before enabling production MCP access.

2) Agent hallucinations and erroneous remediation

LLM‑driven agents can produce plausible but incorrect explanations or remediation steps. Packaging diagnostics into structured outputs and pairing automation with confidence scores helps, but full elimination of false positives, unsafe remediations, or misapplied fixes is impossible. Teams must design strict human‑in‑the‑loop (HITL) gates for any remediation that could affect customer experience or financial outcomes. Independent validation and traceable audit logs of agent actions are non‑negotiable. Windows Forum community analysis also cautions that demonstrations are previews, and real‑world efficacy requires empirical validation.
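
One way to express such a gate is a small policy function. The sketch below is an assumption-laden example rather than a recommended policy: the `ProposedAction` fields, the `Decision` values, and both thresholds are hypothetical and would need tuning against real incident data.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    AUTO_APPLY = "auto_apply"
    NEEDS_APPROVAL = "needs_approval"
    REJECT = "reject"


@dataclass(frozen=True)
class ProposedAction:
    name: str
    confidence: float      # agent-reported score in [0, 1]
    changes_state: bool    # would modify running systems
    customer_facing: bool  # could affect customer experience


def gate(
    action: ProposedAction,
    min_confidence: float = 0.8,
    reject_below: float = 0.3,
) -> Decision:
    """Decide whether an agent suggestion may run without a human."""
    if action.confidence < reject_below:
        return Decision.REJECT
    # Anything that changes state or touches customers always needs a human.
    if action.changes_state or action.customer_facing:
        return Decision.NEEDS_APPROVAL
    if action.confidence < min_confidence:
        return Decision.NEEDS_APPROVAL
    return Decision.AUTO_APPLY
```

Note that a high confidence score never bypasses the human check for state-changing actions here. Confidence only decides between auto-applying and escalating read-only steps.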

3) Operational surface area and latency

For incident response, latency matters. Request/response cycles from the SRE Agent to the MCP Server and back must be predictable. Customers should validate query throughput, rate limits, and how the platform behaves under heavy incident loads. The absence of clear, public latency SLAs in early previews means teams should test the end‑to‑end timing before entrusting the agent with real‑time remediation.
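
A simple way to test end-to-end timing is to time repeated calls and look at percentiles rather than averages. The sketch below assumes a caller-supplied `call` function that performs one full agent-to-MCP round trip. That function, the helper names, and the run count are hypothetical.

```python
import statistics
import time
from typing import Callable, Dict, List


def measure_latency(call: Callable[[], object], runs: int = 50) -> List[float]:
    """Time `runs` sequential invocations of `call`, in seconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        call()
        samples.append(time.perf_counter() - start)
    return samples


def summarize(samples: List[float]) -> Dict[str, float]:
    """Report median, 95th percentile, and worst case for the samples."""
    # quantiles(n=100) returns 99 cut points; index k-1 is the k-th percentile.
    cuts = statistics.quantiles(samples, n=100, method="inclusive")
    return {"p50": cuts[49], "p95": cuts[94], "max": max(samples)}
```

Sequential timing like this gives a baseline only. To approximate an incident storm, the same measurement would need to run with many concurrent callers.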

4) Access control and least privilege

An MCP Server that can return runbook steps and remediation hints implies high‑privilege capabilities. Teams must enforce least‑privilege controls, map identities to scopes, and require multi‑party approvals where appropriate. Documented authentication flows (API keys, OAuth) exist, but governance policies and role mappings remain an implementation responsibility.
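
The role-to-scope mapping and multi-party approval can be sketched in a few lines. The role names, scope strings, and two-approver rule below are invented for illustration and do not correspond to New Relic or Azure permission names.

```python
from typing import Iterable

# Hypothetical roles and scopes for illustration only.
ROLE_SCOPES = {
    "sre-oncall": frozenset({"read:telemetry", "read:runbooks"}),
    "sre-lead": frozenset(
        {"read:telemetry", "read:runbooks", "execute:remediation"}
    ),
}
SCOPES_NEEDING_TWO_APPROVERS = frozenset({"execute:remediation"})


def authorize(role: str, scope: str, approvers: Iterable[str] = ()) -> bool:
    """Allow a request only if the role holds the scope and approvals suffice."""
    # Unknown roles get an empty scope set, so the default is deny.
    if scope not in ROLE_SCOPES.get(role, frozenset()):
        return False
    if scope in SCOPES_NEEDING_TWO_APPROVERS:
        # Count distinct approvers so one person cannot approve twice.
        return len(set(approvers)) >= 2
    return True
```

The deny-by-default lookup is the key design choice: a role or scope that nobody has mapped yet is refused rather than silently allowed.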

5) Vendor lock‑in and standardization

MCP is presented as a protocol standard, but vendors’ interpretations and extensions will vary. Organizations should evaluate how tightly agent workflows will bind to New Relic’s telemetry model and whether switching costs, custom tooling, or vendor‑specific runbooks will create long‑term lock‑in.

6) Commercial and billing complexity

Agentic features can increase API calls and telemetry queries. Customers need to model costs for:

Increased NRQL/query volume
New Relic ingestion and retention charges
Azure SRE Agent runtime (if usage-based billing or Azure Agent Units apply)
Marketplace procurement terms for New Relic SAP monitoring
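
A back-of-the-envelope model for the first three line items might look like the sketch below. Every rate is a caller-supplied placeholder, not a real New Relic or Azure price, and the function name and parameters are hypothetical. Marketplace terms are contract-specific and are left out.

```python
def monthly_cost_delta(
    incidents_per_month: int,
    queries_per_incident: int,
    price_per_1k_queries: float,
    extra_ingest_gb: float,
    price_per_gb: float,
    agent_hours: float,
    price_per_agent_hour: float,
) -> float:
    """Estimate added monthly spend from agent-driven workflows."""
    # Query cost scales with how often agents are invoked during incidents.
    query_cost = (
        incidents_per_month * queries_per_incident / 1000 * price_per_1k_queries
    )
    ingest_cost = extra_ingest_gb * price_per_gb
    agent_cost = agent_hours * price_per_agent_hour
    return query_cost + ingest_cost + agent_cost
```

For example, 40 incidents at 25 queries each, priced at a placeholder 2.0 per thousand queries, plus 100 GB of extra ingest at 0.5 per GB and 10 agent hours at 3.0 per hour, comes to 2.0 + 50.0 + 30.0 = 82.0 in whatever currency the rates use.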

Azure’s own cloud revenue story (33% growth in the quarter cited by Microsoft) underscores demand but also means more cloud spending to manage; engineering leaders should evaluate whether automation reduces headcount costs or simply shifts spending into platform bills.

Practical implementation checklist and best practices

Start with a controlled pilot: Target a non-customer-impact service and scope the agent's remediation rights to read-only or advisory mode.
Define governance and approval gates: Require HITL confirmation for any remediation that changes state or incurs cost.
Capture immutable audit logs: Record agent requests, MCP responses, decision rationale, and operator approvals for post-action reviews.
Layer telemetry and synthetic checks: Use synthetic monitoring to verify agent actions and detect false positives quickly.
Validate compliance and data residency: Confirm whether MCP endpoints reside in supported regions and whether telemetry content contains regulated PII or PHI.
Stress test for scale and latency: Simulate incident storms and measure time-to-insight and time-to-remediation under load.
Model cost impact: Estimate query and ingestion volume changes resulting from agentic workflows and include them in chargeback models.
Maintain runbook versioning and rollbacks: Store runbooks in source control and enable rapid rollback of agent behavior.
Educate operators: Ensure SREs understand how to interpret confidence scores and reconcile AI suggestions with system facts.
Revisit automated actions periodically: Schedule reviews to tune detection thresholds and prune dangerous automations.
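
For the audit-log item in the checklist above, one common technique is a hash-chained, append-only log. The sketch below is an in-memory illustration with hypothetical names (`AuditLog`, `append`, `verify`). It makes tampering detectable rather than impossible, and a production system would also need durable, write-once storage.

```python
import hashlib
import json
from typing import Dict, List

_GENESIS = "0" * 64


class AuditLog:
    """Append-only log where each entry's hash covers the previous entry."""

    def __init__(self) -> None:
        self._entries: List[Dict[str, str]] = []

    def append(self, record: dict) -> str:
        prev = self._entries[-1]["hash"] if self._entries else _GENESIS
        # sort_keys gives a stable serialization, so hashes are reproducible.
        body = json.dumps(record, sort_keys=True)
        digest = hashlib.sha256((prev + body).encode("utf-8")).hexdigest()
        self._entries.append({"prev": prev, "body": body, "hash": digest})
        return digest

    def verify(self) -> bool:
        """Recompute the chain; any edited or reordered entry breaks it."""
        prev = _GENESIS
        for entry in self._entries:
            expected = hashlib.sha256(
                (prev + entry["body"]).encode("utf-8")
            ).hexdigest()
            if entry["prev"] != prev or entry["hash"] != expected:
                return False
            prev = entry["hash"]
        return True
```

A record might carry the agent request, the MCP response identifier, the decision rationale, and the approving operator, matching the fields the checklist asks teams to capture.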

Who benefits most — and who should be cautious

Beneficiaries:

Platform engineering teams running large Azure estates that already use New Relic telemetry.
SRE teams who need faster root cause analysis for recurring incident classes.
Dev teams building agentic applications inside Microsoft Foundry and requiring production telemetry close to the development surface.

Teams that should be cautious:

Regulated enterprises (healthcare, defense, public sector), until the MCP Server reaches a production-ready compliance posture for FedRAMP/HIPAA.
Organizations with fragmented telemetry or poor tagging discipline, where dependency mapping will produce noisy or incomplete results.
Small teams without adequate operational maturity for automated remediations; automation can multiply errors as easily as it reduces toil.

Strategic implications for enterprise observability

This integration signals the next phase of observability: tooling that not only shows system health but actively enables agents to act on that data. For platform vendors, supporting MCP‑like protocols will be table stakes if agents become the primary interface for incident handling.
For Microsoft and Azure customers, the New Relic partnership deepens the commercial and technical integration between observability and cloud agent frameworks. That creates convenience, but also increases the need for joint SLAs and cross‑vendor support playbooks.
The availability of New Relic’s SAP monitoring on the Microsoft Marketplace is an important commercial move: it lowers procurement friction for an enterprise workload category where monitoring has historically been complex and intrusive. The agentless connector claim — if validated in production — will reduce deployment friction for SAP landscapes. Still, customers should verify the agentless connector’s coverage, latency, and the specific SAP modules supported.

Final assessment: pragmatic optimism with guardrails

New Relic’s agentic AI integrations for Azure represent a logical and sensible evolution of observability: move insights to where decisions are made, feed structured context to agents, and reduce time lost to manual investigation. The initiative aligns with broader market forces — rising enterprise AI spend and Azure’s continued commercial momentum — and it offers practical value for teams prepared to adopt it carefully. However, the technology is early and warrants disciplined adoption. Key areas to validate during pilots include compliance posture, latency and throughput under incident conditions, the precision of probable‑cause recommendations, and the economics of increased query/ingestion volumes. Operational governance — least privilege, auditability, human approval flows, and rollback mechanisms — will determine whether agentic observability reduces risk or simply shifts failures from humans to automated agents.
Organizations that approach this capability as a controlled extension of existing incident response systems — not as a wholesale replacement of human judgment — will realize the most durable benefits. The combination of New Relic’s MCP Server and Azure agent tooling is powerful, but power without guardrails is dangerous. Pilot conservatively, instrument everything, and bake governance into the automation from day one.

New Relic and Microsoft have set the stage for a new operational paradigm: agentic observability that acts on telemetry rather than passively displaying it. That shift is important and inevitable — but its success will depend less on vendor press copy and more on disciplined implementation, precise SLAs, and the operational maturity of the teams who run it.

Source: Express Computer, "New Relic introduces agentic AI integrations with Microsoft Azure to reduce MTTR and boost developer productivity"
