CVE-2025-40311 Linux Kernel Fix: VM_MIXEDMAP Guard for Habanalabs DMA

CVE-2025-40311, a recently registered Linux kernel CVE, covers a subtle but real kernel-mapping bug in the Habanalabs accelerator driver. On IOMMU-enabled systems, user-requested coherent DMA buffers can be allocated from the vmalloc range, and mapping them could crash the kernel. The upstream patch forces the mapping VMA to include VM_MIXEDMAP whenever such vmalloc-backed coherent memory is mapped, which prevents vm_insert_page from hitting a BUG_ON tied to VM_PFNMAP and avoids host instability.

Background / Overview

The habanalabs driver family (drivers/accel/habanalabs) supports Habana Labs accelerators (Gaudi and Gaudi2), which are widely used for on-prem and cloud AI workloads. These drivers allocate coherent DMA buffers for device command/control paths and export them to user space via mmap-like interfaces.

On systems with the IOMMU enabled, dma_alloc_coherent() called with GFP_USER can return a CPU-side pointer that lives in the kernel’s vmalloc address range instead of linear (physically contiguous) memory. Mapping such vmalloc-backed CPU addresses into a process VMA requires a special VMA flag (VM_MIXEDMAP) because the memory is not represented by plain PFNs; otherwise the kernel helper vm_insert_page will hit a VM_PFNMAP restriction and trigger a BUG_ON, crashing the kernel. The patch for CVE-2025-40311 detects vmalloc addresses and sets VM_MIXEDMAP on the VMA before mapping; the memory remains driver-managed and is not directly user-accessible.

This is a correctness/availability bug: a locally triggered mapping path can produce a kernel oops or panic (denial of service). There is no authoritative public evidence that this CVE enables remote code execution or has been weaponized in the wild as of the disclosure; the fix is a stability and safety measure.

What went wrong: technical anatomy

dma_alloc_coherent, GFP_USER and vmalloc

  • dma_alloc_coherent is the kernel API drivers use to obtain memory that is coherent for DMA and CPU access.
  • With GFP_USER and an enabled IOMMU, the allocator may place returned CPU virtual addresses into the vmalloc region, because the DMA mapping path and IOMMU translations permit non-linear backing while still providing device-accessible addresses.
  • vmalloc memory lives in a different virtual address range than kmalloc (linear-map) memory and is not guaranteed to be physically contiguous, so mapping helpers cannot assume it is representable as a simple contiguous PFN sequence.

VM_PFNMAP, VM_MIXEDMAP and vm_insert_page

  • VM_PFNMAP is a VMA flag used when a mapping is represented by page frame numbers (PFNs) rather than ordinary page-backed mappings. The kernel enforces strict constraints for VM_PFNMAP VMAs.
  • vm_insert_page inserts a struct-page-backed page into a user VMA. If VM_MIXEDMAP is not already set on the VMA, it asserts via BUG_ON that VM_PFNMAP is not set either; vmalloc-backed memory therefore cannot be inserted into a VMA flagged VM_PFNMAP unless VM_MIXEDMAP has been set first.
  • The habanalabs mmap flow previously set VM_PFNMAP and VM_IO but did not detect the vmalloc-backed coherent case; mapping that memory could therefore hit vm_insert_page’s check and crash. The change sets VM_MIXEDMAP when cpu_addr is in vmalloc range to make the subsequent mapping safe.

The actual code change (high level)

  • The upstream patch inserts an is_vmalloc_addr(cpu_addr) check in the gaudi and gaudi2 mmap handlers and calls vm_flags_set(vma, VM_MIXEDMAP) before invoking dma_mmap_coherent (or the remap path for older configs).
  • The patch is small (26 insertions across two files in the stable review), surgical in intent, and designed to be easy to backport to stable kernels. The commit was signed and reviewed in the kernel stable series.
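The flag interaction described above can be illustrated with a toy user-space model. This is a sketch in Python, not kernel code: the flag bit values, the `Vma` class, the `KernelBug` exception and the `mmap_coherent` function are all hypothetical stand-ins, and the model assumes (as a simplification) that vmalloc-backed buffers are mapped page by page through vm_insert_page.

```python
"""Toy model of the VM_PFNMAP / VM_MIXEDMAP check; not kernel code."""
from dataclasses import dataclass

# Illustrative bit values only; the real constants are defined in the kernel headers.
VM_IO = 0x1
VM_PFNMAP = 0x2
VM_MIXEDMAP = 0x4


class KernelBug(Exception):
    """Stands in for a BUG_ON firing in the kernel."""


@dataclass
class Vma:
    flags: int = 0


def vm_insert_page(vma: Vma) -> None:
    # Modeled check: without VM_MIXEDMAP, a VM_PFNMAP VMA is rejected.
    if not vma.flags & VM_MIXEDMAP:
        if vma.flags & VM_PFNMAP:
            raise KernelBug("VM_PFNMAP set without VM_MIXEDMAP")
        vma.flags |= VM_MIXEDMAP


def mmap_coherent(vma: Vma, cpu_addr_is_vmalloc: bool, patched: bool) -> None:
    # The driver marks the VMA as an IO/PFN mapping, as in the pre-patch flow.
    vma.flags |= VM_IO | VM_PFNMAP
    # Patched behavior: flag vmalloc-backed buffers before mapping them.
    if patched and cpu_addr_is_vmalloc:
        vma.flags |= VM_MIXEDMAP
    # Simplifying assumption: only vmalloc-backed buffers go through vm_insert_page.
    if cpu_addr_is_vmalloc:
        vm_insert_page(vma)
```

In this model, calling `mmap_coherent(Vma(), cpu_addr_is_vmalloc=True, patched=False)` raises `KernelBug`, while the same call with `patched=True` completes, which mirrors the before/after behavior the patch description reports.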

Confirming the fix and the public record

Independent trackers and the NVD record the same technical summary: mapping vmalloc-backed coherent memory without VM_MIXEDMAP can lead to vm_insert_page triggering a BUG_ON due to VM_PFNMAP; the upstream remedy is to detect vmalloc addresses and set VM_MIXEDMAP in the VMA before mapping. The NVD entry and several downstream trackers (OSV, SUSE security pages and distribution security trackers) contain the CVE text and map the upstream commit(s). The patch appears in stable-kernel review threads (6.12 / 6.17 stable series posting) and is included in the stable update queues for backporting.

Microsoft’s public vulnerability attestation machinery (VEX/CSAF) has started publishing product-level mappings for upstream kernel CVEs; Microsoft’s inventory-first attestation pattern is to identify the initial product (Azure Linux) that includes the upstream component and update the attestation set as further artifacts are verified. That operational approach is reflected in internal advisories discussing how Microsoft maps kernel CVEs to their product artifacts; operators should not treat a single product attestation as an exhaustive statement about every Microsoft image or kernel build. Those operational caveats matter if you run Microsoft-provided kernels, WSL kernels, or Azure Marketplace images that may use different kernel configurations.

Who is affected: scope and risk model

  • Affected component: Linux kernel habanalabs driver code paths that mmap driver-allocated coherent buffers (Gaudi/Gaudi2 code paths). The patch touched drivers/accel/habanalabs/gaudi and gaudi2 sources.
  • Preconditions for triggering: IOMMU enabled on the host, dma_alloc_coherent called with GFP_USER, and the allocator returning a CPU pointer in the vmalloc range; the user-space process then attempts to mmap that buffer without the kernel VMA being marked VM_MIXEDMAP.
  • Impact: Availability — kernel warnings, oops or panic. In shared multi-tenant hosts (cloud or CI runners), repeated or reliable triggering can create denial-of-service conditions. There is no canonical remote attack path — exploitation requires local or host-adjacent access to trigger the mapping path.
  • Confidentiality/integrity: the upstream patch description explicitly notes the memory remains driver-allocated and cannot be directly accessed by user space; the primary concern is host stability rather than information leakage. That reduces the immediate risk of data disclosure, though kernel instability in multi-tenant environments is operationally severe.
Caveat: public trackers and vendor advisories as of the disclosure do not report confirmed wild exploitation. Treat claims of in-the-wild exploitation as unverified unless vendors or telemetry channels confirm incidents.

Strengths of the upstream fix

  • Surgical and minimal: The change is a small VMA-flag set that preserves existing driver semantics and does not redesign the memory or DMA interfaces. This small scope reduces regression risk when backporting into stable kernel trees.
  • Easy verification: The change is simple to audit (an is_vmalloc_addr check and vm_flags_set) and straightforward for downstream maintainers to map into distribution kernel packages. The stable-kernel review thread records the exact diff for backporters.
  • Restores correctness without enlarging user-space access: The fix does not alter which buffers are accessible from user space and keeps the memory driver-managed, mitigating the risk of enlarging the attack surface.

Potential residual risks and caveats

  • Distribution and vendor lag: Not all distributions or embedded vendors ship immediate backports. Long-tail kernels and appliance images may remain vulnerable until vendor-specific updates are released and deployed. Operators should map upstream commit IDs to vendor changelogs to confirm presence of the fix.
  • Artifact-level variance: Kernel builds vary by CONFIG options and module lists. A vendor or cloud image that does not build the habanalabs driver into its kernel is not affected; conversely, Marketplace images, partner-provided kernels, or WSL custom kernels might include the driver unexpectedly. Operators must inspect the kernel config and installed modules in their artifacts to determine exposure. Practical commands: check uname -r, /boot/config-$(uname -r), lsmod/modinfo for habanalabs.
  • Operational detection challenges: The bug manifests as kernel oops traces referencing vm_insert_page/VM_PFNMAP or as a panic; such traces can appear like other VM mapping bugs. Effective detection requires correlating dmesg/journalctl kernel traces with recent device/ioctl activity involving accelerator device nodes.
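The log correlation mentioned in the last item could be sketched roughly as follows. This is a minimal illustration, assuming the kernel log has already been saved as text (for example from dmesg or journalctl); the function name, the regular expressions and the 30-line window are assumptions chosen for the example, and real traces vary by kernel version and configuration.

```python
import re
from typing import List

# Markers that commonly open a kernel crash report (an assumption, not exhaustive).
OOPS_PATTERN = re.compile(r"kernel BUG at|BUG:|Oops")
# Symbols named in the text as typical for this bug.
CONTEXT_PATTERN = re.compile(r"vm_insert_page|habanalabs|gaudi", re.IGNORECASE)


def find_suspect_traces(log_text: str, window: int = 30) -> List[int]:
    """Return indexes of crash-marker lines that are followed, within
    `window` lines, by a mapping or habanalabs-related symbol."""
    lines = log_text.splitlines()
    hits = []
    for i, line in enumerate(lines):
        if OOPS_PATTERN.search(line):
            following = lines[i:i + window]
            if any(CONTEXT_PATTERN.search(entry) for entry in following):
                hits.append(i)
    return hits
```

A hit is only a lead: it shows a crash report that mentions the relevant symbols, and it still needs to be correlated with device or ioctl activity before attributing it to this CVE.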

Actionable guidance for administrators and engineers

Immediate triage checklist

  • Inventory artifact exposure:
      • Identify VMs and hosts that run kernels with habanalabs support. Inspect kernel configs for the habanalabs option (CONFIG_DRM_ACCEL_HABANALABS in kernels that carry the driver under drivers/accel), or search for habanalabs/gaudi/gaudi2 in /lib/modules and /boot. If the driver is absent, the artifact is not affected.
  • Inspect kernel packages and vendor advisories:
      • Map your kernel package to distribution advisories (Debian/Ubuntu, SUSE, RHEL/Alma/Oracle) and check whether upstream stable commit IDs (the stable review entries) appear in the package changelog. If the kernel package lists the upstream stable commit or the CVE, the backport is present.
  • If you see kernel oops or panic traces:
      • Preserve dmesg and vmcore before rebooting. Capture kernel logs, backtraces, and the time of the event to correlate with device activity. These artifacts help vendors and maintainers triage and confirm a CVE-triggered crash.

Remediation and rollout

  • Primary remediation: Install kernel packages that include the upstream stable commit(s) addressing the issue and reboot hosts into the patched kernel. Kernel fixes require a reboot to remove the vulnerable code from running memory.
  • If immediate patching is impossible:
      • Restrict access to device nodes used by Habana drivers (udev rules, group permissions).
      • Avoid exposing accelerator device nodes to untrusted containers or tenants.
      • Move sensitive workloads to patched hosts or isolate untrusted tenants until patches are applied.
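As a rough aid for the device-node restriction above, the following sketch reports nodes that are readable or writable by "other" users. The function names are hypothetical, and the device paths you pass in are your own to determine; a path such as /dev/accel/accel0 is an assumption about how the driver exposes its nodes and should be checked on the actual host.

```python
import os
import stat
from typing import Iterable, List


def is_world_accessible(path: str) -> bool:
    """True if users outside the owner and group can read or write the node."""
    mode = os.stat(path).st_mode
    return bool(mode & (stat.S_IROTH | stat.S_IWOTH))


def audit_nodes(paths: Iterable[str]) -> List[str]:
    """Return the existing paths that are world-readable or world-writable."""
    return [p for p in paths if os.path.exists(p) and is_world_accessible(p)]
```

For example, `audit_nodes(["/dev/accel/accel0"])` (hypothetical path) returns an empty list when the node is missing or limited to owner and group. The check covers classic permission bits only; ACLs and container device allowlists need separate review.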

Verification after patch

  • Confirm the kernel package changelog references the upstream commit(s) or CVE.
  • Reproduce previously failing mapping flow in a controlled staging environment; it should not trigger kernel oops.
  • Monitor kernel logs for 7–14 days following the patch for recurrence or other side effects.
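The changelog check in the first item can be sketched as a simple text search. This is a minimal example, assuming you have already exported the kernel package changelog to a string with your package manager; the function name and the `commit_ids` parameter are hypothetical, and the commit IDs themselves must come from the upstream stable announcements because none are listed here.

```python
import re
from typing import Iterable


def changelog_mentions(changelog_text: str,
                       cve_id: str = "CVE-2025-40311",
                       commit_ids: Iterable[str] = ()) -> bool:
    """True if the changelog names the CVE or any of the given commit IDs."""
    if re.search(re.escape(cve_id), changelog_text, re.IGNORECASE):
        return True
    lowered = changelog_text.lower()
    # Commit IDs are compared case-insensitively; empty strings are ignored.
    return any(c and c.lower() in lowered for c in commit_ids)
```

A positive match supports the presence of the backport. A negative result is weaker evidence, since not every vendor lists each CVE or upstream commit in its changelog.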

Developer and vendor considerations

  • For driver maintainers: This bug is a reminder that dma_alloc_coherent’s behavior differs depending on IOMMU and GFP flags; explicit handling for vmalloc-backed allocations is necessary wherever mappings may be reified into process VMAs.
  • For distribution maintainers: Backport the minimal change rather than attempting larger reworks; the surgical nature of this patch is intended to ease stable-series backports and reduce regressions.
  • For cloud vendors and image publishers: Inventory your images and publish attestation artifacts (CSAF/VEX) mapping which artifacts include the habanalabs driver and whether they have been patched. Microsoft’s phased attestation approach shows the operational value of product-level mappings, but operators must verify the actual artifact content rather than assume absence from attestation equals absence in their stack.

Practical detection recipes

  • Log patterns to hunt for:
      • Kernel oops lines involving vm_insert_page, VM_PFNMAP, or explicit BUG_ON tracebacks during mmap of accelerator buffers.
      • Device driver stack traces in dmesg rooted at drivers/accel/habanalabs/gaudi or gaudi2 functions.
  • Host-level commands to check the presence of the driver:
      • uname -r
      • grep -i habanalabs /boot/config-$(uname -r) or zgrep -i habanalabs /boot/config-*
      • lsmod | grep habanalabs
      • modinfo habanalabs (if the module exists)
  • If you run Marketplace or partner images: extract the image to inspect /lib/modules and /boot for evidence of habanalabs and kernel config flags.
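When checking many extracted images, the config grep above may be easier to script. The sketch below parses kernel config text and reports any enabled option whose name contains HABANA; the function names are hypothetical, and matching on the substring rather than one exact symbol is a deliberate assumption so that differently named options across kernel versions are still caught.

```python
import re
from typing import Dict

# Matches lines such as CONFIG_<...>HABANA<...>=m; "# ... is not set" lines do not match.
_HABANA_LINE = re.compile(r"^(CONFIG_\w*HABANA\w*)=(\S+)$")


def habana_config_state(config_text: str) -> Dict[str, str]:
    """Map each HABANA-related config symbol to its value ('y', 'm', ...)."""
    found = {}
    for line in config_text.splitlines():
        match = _HABANA_LINE.match(line.strip())
        if match:
            found[match.group(1)] = match.group(2)
    return found


def driver_enabled(config_text: str) -> bool:
    """True if any HABANA-related option is built in ('y') or modular ('m')."""
    return any(v in ("y", "m") for v in habana_config_state(config_text).values())
```

On a running host the text could be read from the file at /boot/config-<release>, where the release string matches `uname -r` (available in Python as `platform.release()`). A config without the option does not rule out an out-of-tree module, so the lsmod and modinfo checks remain useful.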

Why this matters operationally

Accelerator drivers increasingly appear in cloud and enterprise stacks as specialized hardware for AI workloads becomes mainstream. Small kernel correctness bugs that produce deterministic host-level crashes become high-impact in multi-tenant or orchestrated environments: a local crash in a shared host can disrupt many tenants, CI/CD pipelines, or automated workloads, and is operationally equivalent to denial-of-service. Even when the immediate vulnerability vector is local, the practical consequences in shared infrastructure are material. The habanalabs fix is not a flashy privilege-escalation mitigation; it’s exactly the kind of surgical correctness patch that prevents avoidable host outages in production environments.

Final assessment: strengths, risks, and closing recommendations

  • Strengths:
      • The upstream fix is correct, minimal, and straightforward to backport.
      • The patch explicitly documents and enforces the VM_MIXEDMAP requirement for vmalloc-backed coherent mappings, eliminating the root cause of the BUG_ON.
      • Multiple independent trackers (NVD, OSV, stable-kernel review threads) and vendor advisories have recorded and distributed the patch, enabling straightforward remediation.
  • Residual risks:
      • Distribution and appliance lag means some environments will remain exposed until vendors publish backports.
      • Artifact variance (custom kernels, Marketplace images, WSL kernels) requires artifact-level verification rather than trusting a single vendor attestation.
      • Detection can be noisy: kernel oopses need careful correlation to prove CVE-triggered behavior.
  • Clear action items:
      • Inventory your deployed kernels and images for presence of habanalabs/gaudi code.
      • Apply vendor/distribution kernel updates that include the upstream stable commit(s) and reboot hosts.
      • If patching is delayed, restrict access to accelerator device nodes and isolate sensitive workloads.
      • Preserve crash logs for triage if you observe kernel oopses referencing VM_PFNMAP or vm_insert_page.
CVE-2025-40311 is a textbook kernel correctness fix: small, precise, and impactful for availability in production environments that expose accelerator drivers. Treat it as a high-priority stability update for hosts that include habanalabs support, verify vendor backports by commit ID or package changelog, and patch and reboot in a controlled, staged manner.
Source: MSRC Security Update Guide - Microsoft Security Response Center
 
