ARM64 MTE Patch Removes Spurious copy_highpage Warn CVE-2025-40353

The Linux kernel received a small but important patch that removes an unnecessary warning in the ARM64 MTE codepath when copy_highpage copies into a page that may already carry an MTE tag — a fix tracked as CVE-2025-40353 and already merged into the stable trees to prevent spurious WARNs during certain migrate/copy sequences.

Surgical patch removing a noisy diagnostic in the Linux kernel memory management path.

Background

Virtual memory correctness in the Linux kernel depends on many small invariants and flags that must remain consistent across layers: VMA flags, page-table bits, and per-page metadata. On ARM64 platforms with Memory Tagging Extension (MTE) support, pages can carry tag-related flags (for example, the PG_mte_tagged internal page flag) that record whether the page’s memory carries hardware tags and whether software has marked the page accordingly. A recent upstream fix clarified one such invariant: copy_highpage should not emit a warning if it finds the destination page already marked as tagged, because upstream migration changes can cause a legitimate re-copy into the same destination page.
This behavior was reported via kernel testing infrastructure (syzbot) and discussed on the memory-management mailing list; the patch replaced a WARN_ON_ONCE with a comment to avoid noisy kernel diagnostics while preserving correctness. The change has been accepted and applied to the ARM64 tree and stable backports.

What changed technically​

The root cause (in plain language)​

  • The ARM64 copy_highpage implementation assumed the destination page would be newly allocated and therefore not already tagged (i.e., PG_mte_tagged unset).
  • Later upstream changes to folio migration introduced a sequence where a copy can be attempted, then the migration attempt (__folio_migrate_mapping) can fail with -EAGAIN, and the copy may be retried on the same destination page.
  • Because copy_highpage sets PG_mte_tagged during the first copy, a subsequent copy into the same destination page triggers WARN_ON_ONCE(page already tagged), producing a spurious warning even though the sequence is legitimate.
  • The upstream fix removes the WARN_ON_ONCE and replaces it with a clarifying comment, eliminating the false-positive warning while keeping semantics intact.
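The sequence described above can be illustrated with a toy model. This is a minimal Python sketch, not kernel code: `Page`, `first_time_tagging`, `copy_highpage`, and `migrate` are hypothetical stand-ins that only mimic the ordering of events (copy, failed mapping step with -EAGAIN, second copy into the same destination). The `warn_if_tagged` switch is an assumption used to compare the pre-fix and post-fix behavior side by side.

```python
"""Toy model of the copy/retry sequence; all names are illustrative."""
from dataclasses import dataclass

EAGAIN = 11  # errno value on Linux, used here only as a label


@dataclass
class Page:
    data: bytes = b""
    tags: bytes = b""
    mte_tagged: bool = False  # stands in for the PG_mte_tagged page flag


def first_time_tagging(page):
    """Hypothetical helper: True only if the page has not been tagged yet."""
    return not page.mte_tagged


def copy_highpage(dst, src, warn_if_tagged, warnings):
    """Copy data and, for tagged sources, tags; optionally warn on re-tagging."""
    dst.data = src.data
    if src.mte_tagged:
        fresh = first_time_tagging(dst)
        if warn_if_tagged and not fresh:
            warnings.append("WARN: destination page already tagged")
        dst.tags = src.tags
        dst.mte_tagged = True


def migrate(src, dst, mapping_results, warn_if_tagged):
    """Copy first, then attempt the mapping step; copy again after -EAGAIN."""
    warnings = []
    for result in mapping_results:
        copy_highpage(dst, src, warn_if_tagged, warnings)
        if result != -EAGAIN:
            break
    return warnings


src = Page(b"payload", b"\x0a\x0b", True)
pre_fix = migrate(src, Page(), [-EAGAIN, 0], warn_if_tagged=True)
post_fix = migrate(src, Page(), [-EAGAIN, 0], warn_if_tagged=False)
print(len(pre_fix), len(post_fix))  # prints "1 0"
```

In both runs the destination ends up with the same data and tags; only the diagnostic differs, which is the point the bullets make about the fix leaving semantics intact.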

The patch itself, concisely​

  • Replace the WARN_ON_ONCE on an already-tagged destination page in the ARM64 copy_highpage implementation (arch/arm64/mm/copypage.c) with a comment explaining why the retry path can legitimately encounter an already-tagged page.
  • No change in semantics to how pages are tagged or how MTE is enforced — the patch only removes an unnecessary diagnostic that could be triggered during a valid retry path.
  • The change is small, localized, and explicitly targeted at reducing noisy kernel WARNs that complicate operations and debugging.

Why this matters: operational impact and real-world significance​

At first glance this is a cosmetic change — removing a warning — but it matters for a number of operational and engineering reasons:
  • Noise reduction in kernel logs. Spurious WARNs complicate monitoring and alerting. In cloud and multi-tenant environments, frequent benign WARNs can trigger incident handling workflows, causing wasted effort and masking real issues.
  • Avoid false positives during automated testing and fuzzing. Test harnesses (including syzbot) can produce many spurious warnings that obscure real bugs. Removing a false warning improves signal-to-noise for both automated and human triage.
  • Low regression risk. The fix is surgical (no behavior change to tagging or copy logic) and therefore easy to backport and verify.
  • No evidence of exploitation. This patch is a correctness/diagnostic change, not a memory-corruption fix or a new attack surface. Public vulnerability records and the OSV/NVD entries treat this as a resolved correctness issue; there is no known in-the-wild exploit tied to this CVE at publication.

Affected code and versions​

  • The change applies to the ARM64 MTE copy_highpage code path in upstream Linux kernel sources and has been merged into the ARM64 tree and the relevant stable backports.
  • The issue emerged after or in conjunction with commit 060913999d7a (the commit that adjusted migration sequencing so folio copies can precede mapping completion); the regression / warning surfaced when the copy-and-retry sequence hit the already-tagged destination. The upstream patch references and stable commit IDs were posted around the time the CVE was published.
Because kernel trees and vendor backport schedules vary, affected platforms are those that include the upstream commit(s) prior to the fix. Distributions and vendors typically publish which kernel package versions include the stable fix; operators should consult their vendor security trackers or package changelogs to verify remediation on their kernels.

Exploitability and severity: a pragmatic assessment​

  • Exploitability: Low/none. This CVE resolves a diagnostic warning that appeared under a legitimate retry path. The change does not remove checks that prevent unauthorized memory access nor does it alter the enforcement of MTE. There is no credible public scenario showing this leads to memory corruption or privilege escalation.
  • Impact: Primarily operational — reducing log noise and avoiding spurious WARNs that could otherwise trigger escalations or conceal real kernel problems.
  • CVSS / metrics: At the time of publication there were no high-severity CVSS vectors attached and no EPSS (exploit prediction) scores indicating active exploitation. Public vulnerability trackers record the fix as correctness-oriented and not exploitable remotely.
Caveat: while this change itself is low-risk from a security perspective, any kernel warning path can be significant in highly constrained production contexts. Frequent or unexpected WARNs can indicate timing or state races that, in the worst case when combined with other bugs, might contribute to instability. Treat this as a correctness patch with operational benefits rather than a critical security emergency.

What maintainers and operators did (and should do)​

Upstream maintainers​

  • Reviewed the syzbot report and mailing-list discussion.
  • Applied a focused patch on the ARM64 tree replacing the WARN_ON_ONCE with an explanatory comment.
  • Merged the change into applicable branches and folded the fix into stable backports where appropriate.

Distribution vendors and device OEMs​

  • Vendors will map (and in many cases already have mapped) the upstream stable commits to their own kernel package versions and publish advisories or package updates.
  • Embedded/OEM kernels may lag upstream and require manual backporting. These long-tail kernels are the typical area of residual exposure for any kernel-level fix. Operators relying on vendor-frozen kernels should open vendor support tickets if the fix has not been provided.

Practical mitigation and verification steps (for sysadmins and engineers)​

  • Identify affected hosts:
      • Run uname -r across the fleet to enumerate kernel versions.
      • Map installed kernels to vendor advisories or upstream commit ranges to see whether a kernel includes the pre-fix code.
  • Apply vendor-supplied kernel updates:
      • Install kernel packages that explicitly reference the upstream stable commit or CVE-2025-40353 in the changelog.
      • Reboot hosts into the new kernel to activate the change.
  • For custom or in-house kernels:
      • Pull the upstream patch or cherry-pick the commit(s) from the ARM64 tree into your kernel branch.
      • Build, test, and roll out according to your kernel lifecycle processes.
  • Validate:
      • After patching and rebooting, monitor kernel logs (dmesg / journalctl -k) for the warning that previously appeared. WARN_ON_ONCE prints a generic WARNING line naming the source location and function rather than a custom message, so look for WARNING entries that mention copy_highpage, or compare against the trace that syzbot reported.
      • Run representative migration and memory-copy workloads (or the syzbot reproducer if available in your test harness) to confirm the absence of the spurious WARN.
  • If you cannot patch immediately:
      • Suppress the known benign WARN in your monitoring pipeline after careful verification, but only as a temporary measure.
      • Prioritize patching of multi-tenant hosts, CI runners, and shared infrastructure where spurious WARNs may cause operational churn.
These steps follow standard kernel remediation practice: apply vendor updates where available, backport upstream fixes for custom kernels, test in staging, and verify in production.
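The first two steps (enumerating kernels and checking changelogs) can be partly automated. The following is a minimal sketch under stated assumptions: the host names and release strings are made-up placeholders, `group_hosts_by_release` and `changelog_mentions` are hypothetical helper names, and the changelog text is assumed to have been collected separately (for example from `rpm -q --changelog` output on RPM-based systems). A changelog that mentions the CVE is a hint, not proof, that the fix is present.

```python
import re
from collections import defaultdict


def group_hosts_by_release(releases):
    """Group host names by the kernel release string (uname -r) each reported."""
    groups = defaultdict(list)
    for host in sorted(releases):
        groups[releases[host].strip()].append(host)
    return dict(groups)


def changelog_mentions(changelog_text, cve_id="CVE-2025-40353"):
    """Return True if the CVE identifier appears as a whole token in the text."""
    pattern = r"(?<![A-Za-z0-9])" + re.escape(cve_id) + r"(?!\d)"
    return re.search(pattern, changelog_text, re.IGNORECASE) is not None


# Placeholder inventory: host -> output of `uname -r` (values are invented).
fleet = {
    "build-01": "6.12.0-example\n",
    "build-02": "6.12.0-example\n",
    "edge-07": "6.6.0-example\n",
}
for release, hosts in group_hosts_by_release(fleet).items():
    print(release, hosts)

print(changelog_mentions("placeholder entry mentioning CVE-2025-40353"))
```

Grouping by release string first keeps the changelog check to one lookup per distinct kernel rather than one per host.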

Detection and monitoring guidance​

  • Search kernel logs for the warning. WARN_ON_ONCE prints a generic WARNING line that names the source location and function rather than a custom message, and the exact format varies by kernel version and build:
      • sudo journalctl -k | grep -E 'WARNING.*copy_highpage'
      • sudo dmesg | grep -i 'mte'
  • Tune SIEM alerts to avoid flooding on a known benign message once you have confirmed that it is the same pre-fix warning.
  • Preserve crash logs and dmesg output before rebooting if you see unexpected WARNs; those artifacts help kernel maintainers and vendors correlate behavior with upstream commits during triage.
Because this CVE removes a warning, the principal detection signal is a reduction of false-positive WARNs after patching rather than detection of a new exploit technique.
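For pipelines that post-process logs rather than grepping interactively, the same check can be written as a small filter. This is a sketch, assuming the warning appears as a single line containing both "WARNING:" and the function name copy_highpage; the exact line format depends on the kernel version, and `find_copy_highpage_warnings` and `read_kernel_log` are hypothetical helper names.

```python
import re
import subprocess

WARNING_RE = re.compile(r"\bWARNING:")


def find_copy_highpage_warnings(lines, symbol="copy_highpage"):
    """Return log lines that look like a kernel WARNING mentioning `symbol`."""
    return [
        line.rstrip("\n")
        for line in lines
        if WARNING_RE.search(line) and symbol in line
    ]


def read_kernel_log():
    """Return kernel log lines from `journalctl -k` (may require privileges)."""
    result = subprocess.run(
        ["journalctl", "-k", "--no-pager"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()
```

On a host with systemd, `find_copy_highpage_warnings(read_kernel_log())` returns the candidate lines; an empty list after patching and running representative workloads is the expected outcome.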

Security analysis: strengths, limits, and residual risks​

Strengths of the upstream response​

  • Surgical fix: The patch makes the smallest possible change to remove a false-positive diagnostic without altering MTE enforcement or copy semantics.
  • Low regression risk: Minimal code churn and clear rationale make vendor backports straightforward.
  • Fast upstream acceptance: The patch was reviewed on the appropriate memory-management list and merged into the ARM64 tree and stable backports promptly.

Residual risks and operational caveats​

  • Long-tail kernels remain vulnerable to noise. Embedded devices and vendor forks that do not follow upstream stable backports can continue to emit the spurious WARN until a vendor-provided fix is applied.
  • Understating the broader picture. Although this CVE is a low-risk correctness fix, multiple small correctness issues in kernel subsystems can collectively complicate debugging and incident response in production systems; reducing noisy warnings is part of good operational hygiene.
  • Possible misclassification. If monitoring systems automatically escalate certain WARNs to emergency pages, then even a benign warning can produce costly operational impact; ensure alert rules distinguish between known benign diagnostics and true regressions.
On exploitation and security impact, the NVD/OSV entries and the kernel mailing-list patch thread are consistent: this is a correctness/diagnostic fix, not a vulnerability that directly enables memory corruption or privilege escalation.

Recommended timeline and prioritization​

  • High priority (within days): Multi-tenant hosts, hypervisors, CI runners, and any infrastructure that runs untrusted workloads. Spurious kernel WARNs in these environments can cause escalations, mask other issues, and complicate incident response.
  • Medium priority (1–2 weeks): Single-tenant servers and desktops used for development or testing — apply vendor kernel updates in a staged rollout.
  • Low priority (scheduled maintenance window): Non-critical embedded devices or appliances. For long-tail devices, open support tickets with OEMs; where possible, isolate these devices until a vendor fix is available.
This is not a critical-security emergency; prioritize by operational exposure rather than an arbitrary CVSS value.

Conclusion​

CVE-2025-40353 is a textbook example of a surgical upstream correction: it removes a misleading WARN that occurred when copy_highpage retried a legitimate copy into an already-tagged destination page on ARM64 MTE platforms. The fix reduces log noise, lowers the burden on monitoring and incident-response teams, and is safe and low-risk to backport.
Administrators should verify that their kernels include the upstream stable commit(s) addressing this change, apply vendor-supplied updates or backport the patch for custom kernels, and validate that the spurious WARN no longer appears under representative workloads. While not a security emergency, this change improves overall system robustness — and in complex, multi-tenant environments even small reductions in diagnostic noise produce outsized operational benefits.
Source: MSRC Security Update Guide - Microsoft Security Response Center
 
