CVE-2026-29181: OpenTelemetry-Go Baggage Headers DoS—Update to 1.41.0

Microsoft has listed CVE-2026-29181 as a high-severity denial-of-service flaw in OpenTelemetry-Go. It affects versions 1.36.0 through 1.40.0 and is fixed in 1.41.0. In the affected versions, a request that repeats the baggage HTTP header many times can trigger excessive CPU work and memory allocation in instrumented Go services. The bug is not a Windows vulnerability in the classic sense, but it matters to WindowsForum readers because modern Windows estates increasingly depend on Go services, Kubernetes workloads, observability agents, and cloud-native glue code. This is the kind of issue that rarely produces dramatic screenshots, yet can still take down real production systems. It is a reminder that telemetry is now part of the attack surface, not merely a tool for watching it.

[Infographic: Kubernetes service mesh observability risk from repeated multi-value headers exhausting CPU and memory]

The Vulnerability Hides in the Plumbing Everyone Forgot Was Exposed

OpenTelemetry has become the default vocabulary for tracing and metrics across distributed systems. It is the thing many teams add after the first outage, when logs are no longer enough and the service map has become too complicated to reason about manually. That ubiquity gives it a strange security profile: it is infrastructure, but it often enters the codebase as a library dependency rather than as a consciously defended perimeter component.
CVE-2026-29181 sits in that uncomfortable middle ground. The affected OpenTelemetry-Go packages process the W3C-style baggage header, a mechanism for carrying contextual key-value pairs alongside a request as it moves across services. In ordinary use, baggage can help correlate tenant IDs, feature flags, routing decisions, or other diagnostic hints across a trace.
The problem is that HTTP allows repeated header fields, and the vulnerable extraction path treated multiple baggage field values as work to be parsed independently before aggregating the result. Each individual header value could remain within the documented per-value parsing limit, while the total request still forced the server to allocate and parse far more than defenders expected. The attacker does not need credentials, user interaction, or a clever exploit chain; the input is just headers on a network request.
That is why the official severity lands as high. Confidentiality and integrity are not the story here. Availability is.

“Just Headers” Is No Longer a Comforting Phrase

There was a time when application teams could mentally divide inputs into dangerous payloads and boring metadata. JSON bodies, uploaded files, XML parsers, deserializers, and SQL parameters got attention. Headers were treated as routing dust: necessary, verbose, and mostly someone else’s problem.
That distinction has been obsolete for years. Headers now carry authentication tokens, trace context, feature routing, device posture, language preferences, proxy chains, cache validators, and observability metadata. A single request line and a handful of header fields can steer behavior across an API gateway, a service mesh, an application server, and a collector pipeline before the application handler ever sees a business object.
CVE-2026-29181 exploits that modern reality. The baggage header is designed to move with a request. It is supposed to cross boundaries. In an estate built around microservices, the same malicious pattern may not merely hit one process; it can be propagated, normalized, copied, logged, sampled, or re-emitted by components that were added to make the system more observable.
The subtlety is important. This is not a buffer overflow with a cinematic crash. It is an amplification bug: the attacker supplies a relatively cheap request and causes the target to perform disproportionately expensive work. That makes it especially relevant to internet-facing APIs, ingress controllers, developer portals, partner integrations, and any internal service reachable from a less-trusted network segment.

The Patch Fixes a Library, but the Lesson Belongs to the Architecture

The affected version range is refreshingly clear: OpenTelemetry-Go 1.36.0 through 1.40.0 are the versions to worry about, and 1.41.0 is the patched release. The impacted packages include the baggage and propagation components, which is exactly where defenders should expect context extraction logic to live.
For developers, the direct action is simple: update the relevant Go modules and rebuild. For platform teams, the work is messier. Go’s module system makes direct dependencies easy to inspect, but production exposure may arrive through a framework, an internal platform library, a sidecar-adjacent service, or a vendor-supplied binary that embeds OpenTelemetry-Go. The vulnerable code path matters only where inbound request headers are extracted by the affected propagator, but in modern services that can be a common pattern.
The security advisory’s proof-of-concept figures show why this deserves attention even without data theft. Under Go’s default net/http header budget (DefaultMaxHeaderBytes, about 1 MiB), a crafted request with many baggage values reportedly produced large per-request allocations and measurable latency increases compared with a single-value baseline. The interesting detail is not the exact number from one machine; it is the shape of the failure. The service accepts input that appears bounded in one dimension, but the implementation multiplies work across another.
That is a familiar security pattern. Rate limits can fail when they count users but not tokens. Upload limits can fail when they cap file size but not decompressed size. Parser limits can fail when they cap a single field but not the cumulative cost of many fields. Here, the per-value cap was not enough because the expensive operation repeated across many values.

OpenTelemetry’s Success Makes Its Bugs Operationally Serious

OpenTelemetry won mindshare because it solved a real problem. Developers needed a vendor-neutral way to instrument applications without rewriting every trace pipeline each time the backend changed. Cloud providers, SaaS observability vendors, open-source collectors, and platform teams all had incentives to converge on it.
That success changes how we should read an advisory like CVE-2026-29181. A bug in a niche library might affect a few services. A bug in a common telemetry SDK can sit inside payment APIs, internal admin panels, synthetic monitoring endpoints, service mesh demos that became production, and “temporary” Go utilities that now quietly handle critical workflows.
Windows shops are not exempt because the vulnerable code is written in Go. Many Windows-heavy enterprises run Linux containers on Kubernetes, host Go-based control-plane tools, deploy cross-platform agents, or consume vendor appliances built with Go. Even where the desktop estate is Windows and the identity layer is Microsoft-heavy, the application estate often contains Go services stitched into Azure, GitHub Actions, container registries, and CI/CD systems.
This is why Microsoft’s listing of the CVE is notable for the audience here. The issue is not that Windows Update will deliver a patch to every affected component. It will not. The issue is that the Microsoft ecosystem now includes a vast amount of open-source, cloud-native, and developer-supplied code that administrators still have to inventory, update, and defend.

The DoS Is “Remote,” but the Blast Radius Depends on Your Topology

The CVSS vector is straightforward: network attack vector, low complexity, no privileges required, no user interaction, and high availability impact. That does not mean every deployment is equally exposed. Security severity is a property of the bug; operational risk is a property of where you put the affected code.
An internet-facing Go API that extracts baggage from every request deserves urgent attention. A private service behind a gateway that strips unknown or oversized headers has a different risk profile. A workload that only accepts traffic from a tightly controlled mesh may be safer, though not immune if one compromised internal client can spray crafted requests laterally.
The important architectural question is whether untrusted clients can influence baggage headers that reach OpenTelemetry-Go extraction. Many organizations already sanitize headers at an edge proxy, but “sanitize” often means removing spoofable identity headers while leaving observability headers intact. Trace propagation headers are commonly allowed because breaking them makes troubleshooting harder.
That trade-off is understandable. It is also where the vulnerability lives. Observability headers are operationally useful precisely because they are allowed to cross boundaries. If they are not bounded, normalized, or dropped at trust transitions, they become a convenient path for resource attacks.

Availability Bugs Are Easy to Underestimate Until They Meet Autoscaling

Denial-of-service vulnerabilities rarely receive the same executive attention as remote code execution or credential theft. No database is dumped. No shell is spawned. No ransom note appears. The service simply gets slow, expensive, or unavailable.
In cloud environments, that can be more than an outage. Resource amplification can drive autoscaling events, noisy-neighbor pressure, elevated garbage collection, request queue buildup, and cascading retries. A small number of attacking clients can cause downstream clients to retry, health checks to fail, pods to churn, and load balancers to redistribute pain across a wider pool.
Go’s runtime is efficient, but it is not magic. Excess allocations increase garbage collector pressure. Increased parsing work consumes CPU. Latency spikes can become timeout storms if upstream services retry aggressively. The result is not always a clean crash; it may look like a brownout, the sort of degraded state that burns incident response time because dashboards show symptoms everywhere and root cause nowhere.
This is where telemetry-related vulnerabilities become ironic. The very machinery used to understand production behavior can be part of the behavior that needs explaining. If baggage extraction happens early in request handling, the system may spend significant work before application-level limits, authentication checks, or business logic protections meaningfully engage.

The Header Budget Was the Wrong Mental Model

One of the most interesting details in the advisory is the 8192-byte per-value parse limit. On paper, that sounds like a responsible guardrail. In practice, it guarded the wrong unit of work.
Attackers love mismatches between what defenders count and what software actually does. If the software parses each header value separately, then a limit on each value does not necessarily limit the total parse cost. If aggregation appends members across values, then memory use can scale with the number of values as well as the size of each one. If the surrounding HTTP server allows a much larger total header block, the attacker can operate inside the broader protocol budget while defeating the narrower parser assumption.
The fix direction is therefore unsurprising: normalize multi-value baggage into a single effective value or enforce a global budget across all values before parsing. That aligns the limit with the actual work performed. It also matches the principle administrators already apply elsewhere: limit total request cost, not just individual fields.
This is not merely a Go lesson. It applies to any component that accepts repeated headers, repeated query parameters, nested JSON arrays, multi-part form sections, compressed inputs, or chained metadata. Security limits must follow the cost model, not the syntax model.

Windows Administrators Inherit the Dependency Graph

Traditional Windows administration was built around products, patches, and machines. You inventoried endpoints, tracked KBs, tested cumulative updates, and watched Group Policy or Intune do its work. That model still matters, but it no longer covers the full risk surface of a modern Microsoft-centered environment.
Today’s Windows administrator may also own Azure workloads, GitHub-hosted pipelines, containerized internal tools, Entra-integrated SaaS apps, and Kubernetes clusters running services written in Go. A CVE in OpenTelemetry-Go will not necessarily announce itself through WSUS, Configuration Manager, or a familiar Patch Tuesday workflow. It may appear in Dependabot alerts, container scanner output, SCA dashboards, vendor advisories, or not at all until someone asks the right question.
That shift is uncomfortable because accountability has spread faster than tooling maturity. The application team may own the Go module. The platform team may own the base image. The security team may own the scanner. The operations team may own the outage. CVE-2026-29181 is exactly the sort of vulnerability that falls between those chairs unless organizations have a dependency response process that reaches beyond operating system packages.
The practical response starts with software composition analysis, but it cannot end there. Teams need to know whether affected versions are actually reachable in request paths. They need to know which services extract baggage. They need to know what the edge does with repeated headers. And they need to know whether a vendor appliance or third-party binary contains the vulnerable library even if no source repository is available.

Edge Controls Can Buy Time, but They Are Not a Substitute for Updating

The clean fix is to move to OpenTelemetry-Go 1.41.0 or later. That should be the default recommendation for any service using affected versions. But production patching takes time, and availability vulnerabilities invite a second track of mitigation while rebuilds move through testing.
Ingress layers can help. Proxies and gateways can cap total header size, limit the number of repeated baggage header fields, or strip baggage at untrusted boundaries where cross-service context is not required. Some organizations may choose to allow trace context while dropping baggage from the public internet, preserving basic request correlation while reducing exposure to arbitrary propagated key-value data.
That choice has operational consequences. Dropping baggage may reduce diagnostic fidelity or break workflows that intentionally use baggage for routing or experimentation. But those uses should be explicit. If no one can explain why external clients are allowed to send arbitrary baggage into production, the safest default is to constrain or remove it at the edge.
Application-level defenses also matter. Services should avoid extracting and processing propagation headers earlier than necessary, especially before cheap request validation. Rate limiting should account for expensive header patterns, not merely request counts. Monitoring should include allocation spikes, garbage collection pressure, and unusual header cardinality, because a DoS that hides in metadata may not show up as a dramatic increase in request body size.

The Supply Chain Signal Is Louder Than the CVE Score

The score of 7.5 is useful, but it is not the most important part of the story. The real signal is that mature cloud-native libraries are now large enough, common enough, and close enough to the request path that their parser edge cases become production security events.
OpenTelemetry is not unusual in this respect. The same can be said for logging frameworks, JSON libraries, API gateways, service mesh components, authentication middleware, and SDKs that talk to cloud services. The modern application is assembled from layers of code that each make reasonable assumptions. Vulnerabilities often emerge where those assumptions overlap badly.
In this case, one layer assumed a per-value parsing cap provided safety. Another layer, HTTP itself, permitted repeated values within a larger header budget. The result was a gap big enough for remote resource amplification. Nobody needed to break cryptography or bypass authentication; they only needed to understand how the server counted work.
That is why dependency hygiene needs to be treated as operational resilience, not only as security compliance. A scanner finding affected OpenTelemetry-Go versions is not just producing a checkbox item. It is identifying code that may sit on the hot path of every request. The cost of leaving it unpatched is measured in outage probability, incident noise, and cloud spend as much as in breach risk.

The Fix Should Prompt a Look at Baggage Policy

Most teams adopted trace propagation before they adopted a policy for what should be propagated. That sequence made sense during early observability rollouts: first get traces working, then refine the metadata. Years later, many environments still have the default posture: accept, forward, and hope the headers are sane.
CVE-2026-29181 is a good excuse to revisit that posture. Baggage is not inherently dangerous, but it is more sensitive than many teams realize. It can contain user-related attributes, tenant context, experiment identifiers, or other values that become visible across services and sometimes across vendor boundaries. OpenTelemetry’s own guidance has long warned that baggage travels in headers and can be inspected by parties along the path.
A sensible policy distinguishes between trace identity and arbitrary baggage. Trace IDs and span context are core to distributed tracing. Baggage is optional context, and optional context should have stricter rules at trust boundaries. If a public client can set baggage that internal services later read, log, or propagate, that is a design decision worth documenting rather than an accident worth discovering during an incident.
The same review should include cardinality. High-cardinality metadata is already a cost problem in observability backends. This CVE shows it can also be a request-processing problem before data even reaches the backend. The cheapest metadata is the metadata you never accept from an untrusted source.

The Concrete Work Starts in Repositories and Ends at the Edge

For engineering teams, the first pass is mechanical. Search Go module manifests for go.opentelemetry.io/otel versions in the affected range. Update to 1.41.0 or later. Rebuild containers and redeploy, rather than assuming a base image refresh will touch statically linked Go binaries.
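
The manifest search can be scripted. The sketch below is a deliberately simple go.mod scanner, assuming the usual one-requirement-per-line layout; the helpers affected and otelVersion are hypothetical, and the range check treats every 1.36.x through 1.40.x version as affected. Because go.mod records minimum requirements, running go list -m go.opentelemetry.io/otel inside the module remains the more reliable way to see the version actually selected for a build.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// affected reports whether a version such as "v1.38.0" falls in the
// advisory's range of 1.36.0 through 1.40.0. Patch numbers, pre-release
// tags, and pseudo-version suffixes are ignored, which is a simplification.
func affected(version string) bool {
	parts := strings.SplitN(strings.TrimPrefix(version, "v"), ".", 3)
	if len(parts) < 2 {
		return false
	}
	major, err1 := strconv.Atoi(parts[0])
	minor, err2 := strconv.Atoi(parts[1])
	if err1 != nil || err2 != nil {
		return false
	}
	return major == 1 && minor >= 36 && minor <= 40
}

// otelVersion scans go.mod text for the root OpenTelemetry-Go module and
// returns the required version, if one is listed.
func otelVersion(gomod string) (string, bool) {
	sc := bufio.NewScanner(strings.NewReader(gomod))
	for sc.Scan() {
		f := strings.Fields(sc.Text())
		if len(f) > 0 && f[0] == "require" {
			f = f[1:] // single-line form: require <module> <version>
		}
		if len(f) >= 2 && f[0] == "go.opentelemetry.io/otel" && strings.HasPrefix(f[1], "v") {
			return f[1], true
		}
	}
	return "", false
}

func main() {
	gomod := "module example.com/svc\n\nrequire (\n" +
		"\tgo.opentelemetry.io/otel v1.38.0\n" +
		"\tgo.opentelemetry.io/otel/trace v1.38.0\n)\n"
	v, ok := otelVersion(gomod)
	fmt.Println(v, ok, affected(v))
}
```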
Then comes exposure mapping. Identify services that use propagation.Baggage or composite propagators including baggage extraction. Confirm whether those services are reachable from external clients, partner networks, lower-trust internal zones, or message-to-HTTP bridges. A vulnerable package in a batch job with no inbound HTTP path is a lower priority than the same package in an API gateway-adjacent service.
Finally, check the network layer. If the edge accepts repeated baggage headers without normalization or limits, that is worth changing regardless of this specific CVE. Header count, total header bytes, and repeated-field behavior should be explicit controls, not defaults inherited from whichever proxy or framework was convenient at the time.
This is also a moment to test incident visibility. Can your dashboards show a spike in request header sizes or repeated header counts? Can your runtime metrics show allocation increases per route? Can your WAF or ingress logs reveal abnormal baggage usage without logging sensitive values? If the answer is no, a future metadata-layer attack may look like generic slowness until it has already done its work.

The Small Header That Should Change Patch Triage This Month

The immediate response is not complicated, but it should be disciplined. Treat the advisory as a dependency and exposure exercise, not as a Microsoft-only patch event or a generic open-source alert.
  • Services using OpenTelemetry-Go 1.36.0 through 1.40.0 should be rebuilt with version 1.41.0 or later.
  • Internet-facing and partner-facing Go services that extract baggage headers should be prioritized ahead of isolated internal jobs.
  • Edge proxies and ingress controllers should limit repeated baggage headers or strip baggage at untrusted boundaries where it is not required.
  • Security teams should verify statically linked Go binaries and vendor-supplied components, because package managers may not reveal every embedded dependency.
  • Observability teams should review whether baggage is truly needed from external clients and whether propagated metadata has size and cardinality limits.
  • Incident responders should watch for allocation spikes, garbage collection pressure, latency increases, and abnormal header patterns rather than waiting for crashes.
The broader lesson is that telemetry code is production code, and production code needs threat modeling even when it arrives in the name of visibility. The next outage may not come from the business endpoint everyone reviewed; it may come from the context header everyone allowed through because the dashboards looked better with it enabled. CVE-2026-29181 is fixable, bounded, and well described, which makes it a useful warning shot: as distributed systems become more observable, the machinery of observation must be defended with the same seriousness as the services it watches.

References

  1. Primary source: MSRC
    Published: 2026-06-03T01:47:52-07:00
  2. Related coverage: securityvulnerability.io
  3. Related coverage: advisories.gitlab.com
  4. Related coverage: cert.kenet.or.ke
  5. Related coverage: thewindowsupdate.com
  6. Related coverage: mondoo.com
  7. Related coverage: service.securitm.ru
  8. Related coverage: cisco.com
  9. Related coverage: docs.redhat.com
  10. Related coverage: opentelemetry.io
  11. Official source: github.com
  12. Related coverage: resolvedsecurity.com
 
