Hunting Entra ID Assistive Agent Abuse: Correlate Exchange, Graph, Entra Logs

Microsoft Entra ID agent logs are becoming a practical threat-hunting source in June 2026 because assistive AI agents can use delegated OAuth access to act for signed-in users, making malicious Graph and Exchange activity look deceptively human. The uncomfortable lesson is that “on behalf of” is not the same thing as “obvious to defenders.” As Copilot-style workflows normalize agent action, the identity plane is absorbing a new class of ambiguity. Security teams that still treat users, apps, and service principals as separate buckets are going to miss the story in the seams.
The latest reporting around suspicious assistive agent behavior in Entra ID is not a story about one exotic log field. It is a story about correlation becoming the new minimum standard for AI-era detection. A phish sent by an agent on behalf of an employee is not meaningfully explained by Exchange audit data alone, nor by Microsoft Graph activity alone, nor by sign-in logs alone. The trail exists, but Microsoft has scattered it across systems that reward patient investigators and punish dashboards that expect a single smoking gun.

[Infographic: “Chain of Custody,” correlating cloud logs across email, Graph API, and identity sign-in.]

The Agent Did Not Break Identity; It Exposed Identity’s Oldest Weakness

For years, enterprise identity has lived with a convenient fiction: that a user action, an application action, and an administrative action are cleanly distinguishable events. OAuth already blurred that line. AI agents simply make the blur operationally unavoidable.
An assistive agent is designed to do useful work in the context of a human. It can analyze a spreadsheet, draft a message, summarize data, or call Microsoft Graph APIs using delegated permissions. In Microsoft’s model, that delegation limits blast radius because the agent’s effective access is constrained by the user’s role and the agent’s own permissions.
That is a good security principle, but it is not a complete detection strategy. If a compromised or manipulated agent sends mail, reads mail, or touches files through a sanctioned Graph endpoint, the action may look like the output of a productivity tool doing exactly what it was designed to do. The attack does not need to smash the door open; it can walk through a consented delegation path.
The hard part for defenders is not that the logs are empty. The hard part is that the logs are semantically incomplete until stitched together. A mail event can say what was sent and by whom it appears to have been sent. A Graph event can show the API operation and network source. A non-interactive sign-in can reveal the service principal and token context. None of those is the whole incident by itself.

Consent Becomes the New Beachhead

The user consent step is where the modern agent story starts to look uncomfortably familiar. Microsoft Graph’s delegated model depends on users or administrators approving scopes that allow an application to operate on a user’s behalf. That pattern is essential to cloud productivity, but it has also been a favorite target for attackers because consent can become persistence with a friendly UI.
The Cyberpress write-up, citing Red Canary research, describes assistive agents requiring user consent to an access_agent scope tied to the agent’s underlying blueprint principal. Once consent is granted, the agent’s effective authorization is described as the intersection of the agent’s permissions and the human user’s assigned roles. That is a narrower model than granting a free-roaming application identity, but “narrower” does not mean “safe.”
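As a rough illustration of that intersection model, the sketch below treats both sides as plain sets of scope strings. This is a deliberate simplification and an assumption: Entra ID evaluates roles and permissions with far more nuance, and the function name and sample values are hypothetical.

```python
def effective_scopes(agent_scopes: set[str], user_allowed: set[str]) -> set[str]:
    """Return only the operations that BOTH the agent and the user are allowed."""
    return agent_scopes & user_allowed


# Hypothetical values for illustration only.
agent_scopes = {"Mail.Send", "Mail.ReadWrite", "Files.Read"}
user_allowed = {"Mail.Send", "Mail.ReadWrite", "Calendars.Read"}

effective = effective_scopes(agent_scopes, user_allowed)
print(sorted(effective))  # ['Mail.ReadWrite', 'Mail.Send']
```

The intersection drops Files.Read and Calendars.Read, yet the mail scopes survive. The narrower result is still enough to send a phish.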
Attackers have long understood that OAuth abuse often works best when it looks like user-approved convenience. Techniques such as authorization-code interception, malicious redirect handling, and consent phishing all exploit the gap between what the user thinks they approved and what the system now permits. Agentic workflows add another layer of indirection: the thing acting may not be the human, but it may still inherit just enough of the human’s authority to matter.
This is where many organizations will discover that their consent governance is less mature than their AI adoption plan. If security teams cannot quickly answer which agents have which scopes, who consented, when consent was granted, and whether the agent’s behavior changed afterward, they are not really governing agents. They are trusting them.
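Those four questions can be turned into a recurring review. The sketch below assumes consent grants have already been exported into simple dictionaries; the record shape (client, consented_by, granted_at, scope), the sample data, and the list of "high-impact" scopes are all hypothetical and would need to be mapped to whatever a tenant actually exports.

```python
from datetime import datetime, timezone

# Assumption: which scopes count as high impact is a local policy decision.
HIGH_IMPACT = {"Mail.Send", "Mail.ReadWrite", "Files.ReadWrite.All"}


def high_impact_grants(grants: list[dict]) -> list[dict]:
    """Return grants that include a high-impact scope, newest first."""
    flagged = []
    for grant in grants:
        risky = set(grant["scope"].split()) & HIGH_IMPACT
        if risky:
            flagged.append({
                "client": grant["client"],
                "consented_by": grant["consented_by"],
                "granted_at": grant["granted_at"],
                "risky_scopes": sorted(risky),
            })
    return sorted(flagged, key=lambda g: g["granted_at"], reverse=True)


# Hypothetical export of consent grants.
grants = [
    {"client": "Invoice Helper", "consented_by": "alex@contoso.example",
     "granted_at": datetime(2026, 6, 1, tzinfo=timezone.utc),
     "scope": "User.Read Mail.Send"},
    {"client": "Notes Summarizer", "consented_by": "sam@contoso.example",
     "granted_at": datetime(2026, 5, 20, tzinfo=timezone.utc),
     "scope": "User.Read"},
]

review_queue = high_impact_grants(grants)
```

Only the grant carrying Mail.Send lands in the review queue, along with who consented and when.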

The Phishing Email Is Only the First Frame of the Film

The scenario reported in the research is simple enough: an assistive agent sends an email with a suspicious invoice-themed subject to an external recipient on behalf of a human employee. That is not a science-fiction attack. It is an ordinary phishing move wearing an AI-era identity costume.
A first look in Microsoft Purview Exchange logs gives investigators the visible shape of the event. The records can identify the operation, the recipient, the agent identity, and the human user being represented. That is valuable, but it can also lull teams into stopping too soon.
Exchange sees mail. It does not necessarily explain the true origin of the API call that caused the mail to be sent. If the service path runs through Microsoft infrastructure, the apparent IP context may be a Microsoft proxy rather than the attacker-controlled host that initiated the chain. For incident responders, that difference is not academic; it determines whether the next search pivots to a benign cloud address or to a real network indicator.
The invoice email, then, is not the incident. It is the artifact that tells defenders where to start. The important investigative question is not only “who sent this?” but “which token, which agent, which API call, and which network origin made this possible?”

Microsoft Graph Holds the Missing Network Truth

The most important pivot in the reported hunt is from Exchange audit data into Microsoft Graph Activity Logs. By extracting identifiers such as UniqueTokenId and ClientRequestId from the Purview record, analysts can line up the email event with the Graph operation that actually dispatched it.
That is where the story sharpens. The Graph logs can reveal the actual source IP that initiated the POST request, the user-agent string, and the endpoint involved. In the reported example, the relevant operation is microsoft.graph.sendMail, which is exactly the sort of API call an attacker would want if the goal is to turn trusted identity into trusted-looking mail.
This is the point at which agent abuse stops being a vague AI risk and becomes ordinary detection engineering. You are hunting POST requests. You are matching request identifiers. You are looking at user agents and IP addresses. You are asking whether a mail-sending Graph call originated from infrastructure that makes sense for the user, the agent, and the organization.
That pragmatism matters. AI security often gets buried under speculative language about autonomous behavior and emergent risk. Here, the risk is legible: an agent with delegated access can be used to call a mail-sending API, and the evidence required to trace it is split across telemetry sources. The novelty is not magic; it is attribution.
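The checks described in this section can be sketched as a small filter over exported Graph activity records. The dictionary keys (method, uri, ip, user_agent), the sample records, and the "expected" network range are assumptions made for illustration; the real Graph Activity Logs schema uses its own column names, which should be confirmed before adapting this.

```python
import ipaddress

# Hypothetical corporate egress range (a documentation-only address block).
EXPECTED_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]


def is_suspicious_send(record: dict) -> bool:
    """Flag POST requests to a sendMail endpoint from outside expected networks."""
    if record.get("method", "").upper() != "POST":
        return False
    if not record.get("uri", "").lower().endswith("/sendmail"):
        return False
    source = ipaddress.ip_address(record["ip"])
    return not any(source in net for net in EXPECTED_NETWORKS)


# Hypothetical Graph activity records.
graph_records = [
    {"method": "POST", "uri": "https://graph.microsoft.com/v1.0/me/sendMail",
     "ip": "203.0.113.10", "user_agent": "agent-runtime/1.0"},
    {"method": "POST", "uri": "https://graph.microsoft.com/v1.0/me/sendMail",
     "ip": "198.51.100.7", "user_agent": "python-requests/2.32"},
    {"method": "GET", "uri": "https://graph.microsoft.com/v1.0/me/messages",
     "ip": "198.51.100.7", "user_agent": "python-requests/2.32"},
]

suspicious = [r for r in graph_records if is_suspicious_send(r)]
```

An allowlist of networks is a blunt instrument, and in practice "makes sense for the user, the agent, and the organization" would likely need per-agent baselines rather than one static range.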

Non-Interactive Sign-Ins Tell the Identity Story Microsoft Does Not Label for You

After Exchange tells investigators what happened and Graph tells them where the API call came from, Entra ID’s non-interactive sign-in logs help confirm what kind of identity flow produced the token. This is the subtle but crucial part of the hunt. Microsoft does not simply hand defenders a single field that says, in plain English, “this was an assistive agent using an on-behalf-of flow.”
Instead, analysts infer the flow from a combination of attributes. The reported fields include the agent identity name, the service principal object ID, and the parent blueprint ID associated with the agent. That combination can show that the agent identity received a token to act on behalf of the user with granted OAuth scopes such as Mail.Send or Mail.ReadWrite.
This is both powerful and frustrating. It is powerful because the evidence exists, and defenders can build repeatable hunting logic around token IDs and service principal metadata. It is frustrating because the burden is on customers to derive meaning from fields that were not designed as a narrative incident report.
Microsoft is hardly alone in this. Cloud security telemetry often exposes raw ingredients before it exposes intent. But in the agent context, that gap matters more, because the difference between legitimate assistance and malicious delegation may be a narrow behavioral deviation rather than a forbidden API call.
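Because no single field labels the flow, the inference has to be written down as a rule. The heuristic below is a minimal sketch of that idea, not Microsoft's logic: the key names (agent_name, service_principal_id, parent_blueprint_id, user_principal_name, scopes) are hypothetical stand-ins for the fields the research describes, and the sample records are invented.

```python
MAIL_SCOPES = {"Mail.Send", "Mail.ReadWrite"}
AGENT_KEYS = ("agent_name", "service_principal_id", "parent_blueprint_id")


def looks_like_agent_obo(signin: dict) -> bool:
    """Heuristic: agent metadata present, a user present, and a mail scope granted."""
    has_agent_identity = all(signin.get(key) for key in AGENT_KEYS)
    has_user = bool(signin.get("user_principal_name"))
    granted = set(signin.get("scopes", "").split())
    return has_agent_identity and has_user and bool(granted & MAIL_SCOPES)


# Hypothetical non-interactive sign-in records.
agent_signin = {
    "agent_name": "Invoice Helper",
    "service_principal_id": "11111111-1111-1111-1111-111111111111",
    "parent_blueprint_id": "22222222-2222-2222-2222-222222222222",
    "user_principal_name": "alex@contoso.example",
    "scopes": "Mail.Send User.Read",
}
plain_app_signin = {
    "service_principal_id": "33333333-3333-3333-3333-333333333333",
    "user_principal_name": "alex@contoso.example",
    "scopes": "User.Read",
}

print(looks_like_agent_obo(agent_signin))      # True
print(looks_like_agent_obo(plain_app_signin))  # False
```

A rule like this only says the flow looks like an agent acting for a user with mail scopes. Whether that activity is malicious is a separate, behavioral question.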

The Logs Are There, but the Product Boundary Is the Trap

Microsoft’s security ecosystem is rich, but richness can become fragmentation. Purview knows about mail activity. Graph Activity Logs know about API calls. Entra ID knows about sign-ins and service principals. Defender products may know about endpoint context, alerts, or identity risk. The attacker does not care where Microsoft draws product boundaries.
This is why single-pane-of-glass rhetoric ages badly in real investigations. A suspicious agent action is not naturally a single-pane event. It is a chain of consent, token issuance, API invocation, and resource operation. Each layer has a different owner, retention profile, schema, licensing dependency, and query habit.
For smaller IT teams, this is the real operational challenge. It is not enough to buy Copilot, enable agents, and assume the Microsoft cloud will make abuse obvious. Someone must decide which logs are enabled, where they are retained, whether they land in a SIEM, and how identity investigators can pivot across request IDs without opening five portals and a spreadsheet.
For large enterprises, the problem becomes scale. Hundreds or thousands of service principals already create noise. Add agent blueprints, delegated scopes, user consent, and automated workflows, and the attack surface becomes less about obvious misconfiguration and more about normal-looking combinations. That is a harder problem than blocking a malicious executable.

Delegated Permissions Limit Damage, but They Also Borrow Trust

Microsoft’s argument for delegated access is sensible: an app or agent acting on behalf of a user should not automatically get more power than that user has, and the application’s own permissions should constrain what can be done. That model is far better than letting every productivity helper become an all-powerful tenant actor.
But delegated permissions also borrow the user’s trust relationship. If a user regularly sends external email, then an agent sending external email for that user may not stand out. If an executive assistant workflow reads mail and drafts replies, the same primitives that power convenience can power fraud. Least privilege narrows the blast radius but does not eliminate the need to detect abuse inside the radius.
This is especially important for WindowsForum readers managing Microsoft 365 tenants where business pressure to adopt Copilot and related agent features is accelerating. The security review cannot stop at “what can the agent access?” It must also ask “what would malicious use of that access look like, and where would we see it?”
In practical terms, Mail.Send is not just a permission string. It is the ability to create organizationally trusted messages. Mail.ReadWrite is not just inbox convenience. It is access to sensitive communications, potential pretexting material, and opportunities to manipulate evidence. The danger is not that every agent is dangerous; it is that ordinary delegated scopes become higher-value when attached to automation.

Agent Identity Needs Its Own Inventory Discipline

Most organizations already struggle with service principal sprawl. Old app registrations linger. Test integrations become production dependencies. API permissions accumulate like attic furniture. Agent identities will make that problem more visible, not less.
Security teams need to treat agents as identities with lifecycle, ownership, scope, and behavior baselines. That means naming conventions that survive incident response, not just product demos. It means mapping agent identities back to business owners and use cases. It means knowing which agents can send mail, read mail, access files, create calendar events, or touch administrative surfaces.
The reported fields around service principal name, service principal ID, and parent blueprint ID point toward that inventory model. If an investigation has to begin by asking what a service principal even is, the response is already late. Identity metadata should make it obvious whether an agent belongs to a sanctioned workflow, an abandoned proof of concept, or something no one recognizes.
This is where governance and detection meet. A detection rule that flags an unusual sendMail call by an agent is useful. A detection rule that can also enrich the alert with the agent owner, approved scopes, normal user population, and expected network locations is much better. The difference is asset management, not artificial intelligence.
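To make the enrichment idea concrete, here is a minimal sketch of an alert being joined against an agent inventory. The inventory structure, the AgentRecord fields, the service principal IDs, and the alert shape are all hypothetical; a real deployment would source them from a CMDB or an identity governance export.

```python
from dataclasses import dataclass, field


@dataclass
class AgentRecord:
    owner: str
    approved_scopes: set[str] = field(default_factory=set)


# Hypothetical inventory keyed by service principal ID.
INVENTORY = {
    "sp-1234": AgentRecord(owner="finance-ops@contoso.example",
                           approved_scopes={"Mail.Send"}),
}


def enrich_alert(alert: dict, inventory: dict[str, AgentRecord]) -> dict:
    """Attach owner and scope drift to an alert, or mark the agent as unknown."""
    record = inventory.get(alert["service_principal_id"])
    if record is None:
        return {**alert, "known_agent": False}
    used = set(alert.get("scopes", []))
    return {
        **alert,
        "known_agent": True,
        "owner": record.owner,
        "unapproved_scopes": sorted(used - record.approved_scopes),
    }


known = enrich_alert(
    {"service_principal_id": "sp-1234", "scopes": ["Mail.Send", "Files.Read"]},
    INVENTORY,
)
unknown = enrich_alert({"service_principal_id": "sp-9999", "scopes": ["Mail.Send"]},
                       INVENTORY)
```

The first alert comes back with an owner to call and a scope that was never approved. The second comes back flagged as an agent nobody has inventoried, which is arguably the more urgent finding.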

Attackers Will Aim for the Gray Zone Between Helpful and Authorized

The best attacks against agentic systems may not look like a rogue robot doing absurd things. They may look like a helpful assistant doing a plausible task at the wrong time, for the wrong recipient, with the wrong prompt history, or from the wrong network path.
That gray zone is where prompt injection, consent abuse, token theft, and business email compromise begin to converge. An attacker does not necessarily need to steal a user’s password if they can manipulate an agent that already has delegated access. They do not need to compromise Exchange directly if they can trigger a Graph mail operation through a trusted workflow.
The defensive answer is not to panic about agents. It is to stop assuming that agent activity inherits innocence from the user it represents. The phrase “on behalf of” should become a detection prompt in its own right. On behalf of whom? Through which service principal? With which scopes? From which IP? Against which API? At what frequency? Compared with what baseline?
That investigative posture will become increasingly important as Microsoft and others push agents deeper into office productivity, data analysis, workflow automation, and administration. The more useful agents become, the more valuable their delegated actions become to attackers.

Where Windows and Microsoft 365 Admins Should Tighten the Screws

For Windows-heavy organizations, the endpoint is still part of the story. If a user’s workstation is compromised, an attacker may be able to influence sessions, steal tokens, manipulate browser flows, or exploit local redirect assumptions. Agent logs can show the cloud-side result, but endpoint telemetry may explain how the attacker got there.
This is why identity and endpoint teams cannot treat AI agent abuse as somebody else’s problem. The Entra trail may identify the agent and token. Graph may reveal the API call. Defender for Endpoint or other EDR tools may reveal the process, browser extension, script, or remote session that initiated the chain. The useful answer is rarely confined to one console.
Administrators should also revisit user consent policies. In many tenants, user consent has historically been treated as a productivity convenience rather than a security boundary. Agent adoption should force a more conservative posture: high-impact scopes need administrative review, suspicious publishers need blocking, and consent grants need recurring audit.
Finally, retention matters. If Graph Activity Logs or sign-in logs are not retained long enough to support an investigation that begins days after a suspicious email, the perfect correlation workflow is useless. The hunt described in the reporting depends on being able to line up identifiers across logs. If one leg has aged out, the chain breaks.
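A quick way to test that risk is to ask, for a given event age, which legs of the chain would still be queryable. In the sketch below the retention periods are placeholders rather than Microsoft defaults, and the source names are hypothetical labels; actual values depend on licensing and on whether logs are exported to a SIEM or storage account.

```python
from datetime import datetime, timedelta, timezone

# Placeholder retention periods in days. Replace with the tenant's real settings.
RETENTION_DAYS = {"exchange_audit": 180, "graph_activity": 30, "signin": 30}


def available_sources(event_time: datetime, now: datetime,
                      retention: dict[str, int] = RETENTION_DAYS) -> dict[str, bool]:
    """Report which log sources still hold data for an event of this age."""
    age = now - event_time
    return {name: age <= timedelta(days=days) for name, days in retention.items()}


now = datetime(2026, 6, 9, tzinfo=timezone.utc)
email_sent = now - timedelta(days=45)

coverage = available_sources(email_sent, now)
chain_intact = all(coverage.values())
```

With these placeholder numbers, a suspicious email reported 45 days late still has its Exchange record, but the Graph and sign-in legs are gone and the chain cannot be completed.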

The New Hunt Starts With a Token, Not a Hunch

The concrete lesson from the Red Canary-style approach is that token-centric investigation is becoming more important than user-centric investigation. A username can be a misleading anchor when the action was delegated through an agent. A service principal alone can also mislead when the agent is acting in a user context. The token ties the story together.
Investigators should therefore become comfortable with pivots that begin in one system and end in another. An Exchange event yields a token ID and request ID. Those identifiers lead to Graph activity. The Graph record reveals the source IP, user-agent string, and endpoint. The token ID then maps back into non-interactive sign-in evidence that identifies the agent and flow characteristics.
That workflow is not glamorous, but it is what modern cloud detection increasingly looks like. The indicators are not only domains and hashes. They are request IDs, OAuth scopes, service principal object IDs, and API method names. In a Microsoft 365 estate, those are first-class security artifacts.
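The pivot chain can be sketched end to end over exported records. UniqueTokenId and ClientRequestId are the identifiers named in the reporting; assuming that all three exports spell them identically is a simplification, and every other key (ip, user_agent, uri, agent_name) plus the sample data is hypothetical.

```python
def trace_mail_event(exchange_event: dict, graph_logs: list[dict],
                     signin_logs: list[dict]) -> dict:
    """Follow one Exchange mail event into Graph activity and sign-in records."""
    token_id = exchange_event["UniqueTokenId"]
    request_id = exchange_event["ClientRequestId"]

    # Pivot 1: the request ID leads to the Graph call that dispatched the mail.
    graph_hit = next(
        (g for g in graph_logs if g.get("ClientRequestId") == request_id), None)
    # Pivot 2: the token ID leads to the sign-in that issued the token.
    signin_hit = next(
        (s for s in signin_logs if s.get("UniqueTokenId") == token_id), None)

    return {
        "token_id": token_id,
        "source_ip": graph_hit["ip"] if graph_hit else None,
        "user_agent": graph_hit["user_agent"] if graph_hit else None,
        "endpoint": graph_hit["uri"] if graph_hit else None,
        "agent_name": signin_hit["agent_name"] if signin_hit else None,
        "complete": graph_hit is not None and signin_hit is not None,
    }


# Hypothetical records from the three sources.
exchange_event = {"UniqueTokenId": "tok-abc", "ClientRequestId": "req-001"}
graph_logs = [{"ClientRequestId": "req-001", "ip": "198.51.100.7",
               "user_agent": "python-requests/2.32",
               "uri": "https://graph.microsoft.com/v1.0/me/sendMail"}]
signin_logs = [{"UniqueTokenId": "tok-abc", "agent_name": "Invoice Helper"}]

story = trace_mail_event(exchange_event, graph_logs, signin_logs)
```

The complete flag is the useful part for triage: when it is false, one leg of the chain is missing and the investigation should say so explicitly rather than guess.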
Security leaders should resist the temptation to frame this as a niche AI problem. The arrival of agents is simply making delegated access more common, more automated, and more business-critical. The same OAuth mechanics that power normal applications now power assistants that users may trust even more than traditional apps.

The Invoice Email Leaves a Map for the Next Investigation

The practical message for defenders is that agent abuse can be hunted today, but only if teams deliberately wire together the telemetry that Microsoft exposes in separate places. That demands preparation before an incident, not heroic reconstruction afterward.
  • Security teams should correlate Microsoft Purview Exchange logs, Microsoft Graph Activity Logs, and Entra ID non-interactive sign-in logs when investigating assistive agent activity.
  • Analysts should preserve and pivot on identifiers such as UniqueTokenId and ClientRequestId because they can connect a mail event to the underlying Graph request and token issuance.
  • Agent identities should be inventoried like privileged applications, including their service principal IDs, parent blueprint IDs, owners, approved scopes, and expected behavior.
  • User and admin consent policies should be reviewed before broad agent deployment, especially for scopes that allow mail sending, mailbox reading, file access, or write operations.
  • Detection logic should treat “on behalf of” activity as a distinct class of identity behavior rather than folding it into ordinary user activity or ordinary application activity.
  • Log retention and SIEM ingestion should be tested against real investigation paths, because missing Graph or sign-in data can make agent attribution impossible after the fact.
The larger point is that Microsoft’s agent future will not be secured by treating agents as a novelty category. It will be secured by treating them as identities that issue, receive, and exercise tokens in ways defenders can prove. As Copilot and assistive workflows move from pilot projects into daily work, the organizations that thrive will be the ones that make agent action visible enough to trust, challenge, and, when necessary, shut down.

References

  1. Primary source: cyberpress.org (published 2026-06-09)
 
