CVE-2024-35853: Mellanox mlxsw ACL TCAM memory leak and patch guidance

A subtle defect in Mellanox's mlxsw Spectrum ACL TCAM code, tracked as CVE-2024-35853, can leak kernel memory during the driver’s background “rehash” work. An attacker who can repeatedly trigger that path can gradually exhaust system resources and produce sustained or persistent denial-of-service conditions on affected Linux hosts. The bug has been fixed in upstream kernel patches and is addressed in vendor kernel updates, but operators should treat it as an operationally important availability risk until all affected systems are patched.

Background / Overview

The mlxsw driver family supports Mellanox/NVIDIA Spectrum switch silicon and offload-capable NICs that implement hardware ACLs (TCAM-based filters) for high-performance packet classification. The Spectrum ACL implementation in the Linux kernel organizes filter entries in regions and chunks, and it performs background maintenance — notably rehash work — to migrate entries between regions as usage patterns change or hardware constraints mandate rebalancing.
The rehash mechanism is asynchronous: a worker routine iterates chunks, migrates filters, and cleans up the old region when migration completes. That deferred work model avoids long pauses in the data plane, but it requires careful synchronization because multiple worker paths (for migration, activity updates, and region teardown) may operate concurrently. The CVE-2024-35853 defect arises in the rehash worker’s migration and rollback code paths and leads to memory that should be released not being freed — in other words, a kernel memory leak that can accumulate under repeated triggers of the rehash path.
Why this matters to administrators: leaks that occur inside the kernel are harder to reclaim and can, over time or with repeated triggering, force the kernel to invoke the out-of-memory killer, induce performance collapse, or create persistent failure modes in packet processing on network hosts and appliances that rely on the affected driver. The vendor and CVE records classify the issue as medium severity but explicitly note the availability impact (A:H in the NVD CVSS vector).

What the bug actually does — technical breakdown​

Data structures and the rehash flow​

  • The ACL code stores rules in virtual chunks (vchunks). Each vchunk maintains pointers to two chunk objects: vchunk->chunk (the currently used chunk) and vchunk->chunk2 (a backup reference used during migration). During a migration, vchunk->chunk points at the destination chunk that filters are being moved to, and vchunk->chunk2 points at the source chunk they are being moved from.
  • The rehash delayed work iterates over regions and chunks, migrates filters from one chunk/region to another, and when the worker decides migration is done it will tear down the old region (and free associated resources). The migration code has complex rollback behaviour: if a migration attempt fails, the code tries to move filters back; if rollback itself fails, the worker may attempt subsequent migrations. That “ping‑pong” of migration/rollback complicates assumptions about which pointers and helpers are live at any time.

Where the leak appears​

  • The code assumed, without verifying, that if a vchunk’s current chunk did not reference the target region, then the backup (chunk2) would not exist (i.e., it would be NULL). That assumption is incorrect when migrations and rollbacks interleave or fail. In those failure sequences the backup pointer can be overwritten while it still references a live chunk, and hints and auxiliary objects allocated for migration work can be left referenced by pending worker items or simply orphaned without being freed. The result: memory that should have been freed remains allocated and accumulates.
  • The leak can occur in at least two closely related ways that were recorded as separate tracking items by distributors and the kernel team: (a) a memory leak produced when migration/rollback ping‑pong leaves chunk2 live unexpectedly (CVE-2024-35853), and (b) additional leak paths when rehash work is canceled or rescheduled leaving allocated "hints" associated with pending work (CVE-2024-35852). These defects were fixed in a small cluster of upstream commits that adjust the rehash teardown and rollback logic and that explicitly free hints when work is canceled.
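The pointer assumption described above can be illustrated with a toy model. This is a minimal sketch in Python, not the kernel's C code: the class names, the `verify_backup` flag, and the two-step failure sequence are all hypothetical simplifications chosen to show how overwriting a live backup pointer orphans an object.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(eq=False)
class Chunk:
    """Hypothetical stand-in for a chunk object that lives in one region."""
    region: str
    freed: bool = False


class VChunk:
    """Toy vchunk: one current chunk plus one backup used during migration."""

    def __init__(self, region: str) -> None:
        self.allocated: List[Chunk] = []      # every chunk ever created, for leak accounting
        self.chunk: Chunk = self._alloc(region)
        self.chunk2: Optional[Chunk] = None   # backup reference

    def _alloc(self, region: str) -> Chunk:
        chunk = Chunk(region)
        self.allocated.append(chunk)
        return chunk

    def migrate_start(self, target: str, verify_backup: bool) -> None:
        if self.chunk.region == target:
            return  # already pointing at the target region
        if verify_backup and self.chunk2 is not None:
            raise RuntimeError("backup chunk still live; refusing to overwrite it")
        new_chunk = self._alloc(target)
        self.chunk2 = self.chunk  # a previous backup loses its last reference here
        self.chunk = new_chunk

    def leaked(self) -> List[Chunk]:
        """Chunks that are neither freed nor reachable from this vchunk."""
        return [c for c in self.allocated
                if not c.freed and c is not self.chunk and c is not self.chunk2]


def rollback_of_failed_migration(verify_backup: bool) -> int:
    vchunk = VChunk("A")
    vchunk.migrate_start("B", verify_backup)  # migration A -> B starts, then fails part-way
    vchunk.migrate_start("A", verify_backup)  # move back while the backup is still set
    return len(vchunk.leaked())
```

With `verify_backup=False` the model orphans the original chunk; with `verify_backup=True` the second call raises instead of silently overwriting the backup. The real driver handles more state (filters, regions, hints), so treat this only as an aid to reading the description above.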

Evidence in the kernel tree and KASAN traces​

  • Public CVE descriptions and distro advisories include KASAN-style backtraces and diagnostic output showing unreferenced objects and slab-use traces during the rehash work path; those traces were used by kernel developers to reproduce the memory leak and to validate the upstream fixes. Administrators should look for similar messages in dmesg or journal logs when triaging suspected leaks: “unreferenced object” reports (the format kmemleak uses), KASAN reports, or worker-thread stack traces referencing mlxsw_sp_acl_tcam_vregion_rehash_work are strong indicators.

Impact and exploitability — what operators need to know​

Availability impact (real and concrete)​

  • The vulnerability’s explicit impact is loss of availability: memory allocated by the kernel driver is not released, and repeated triggering of the rehash path can exhaust kernel memory. NVD and vendor advisories classify the availability impact as significant: full denial-of-service is possible when memory exhaustion leads to OOM, kernel crashes, or persistent resource starvation. That is consistent with practical attack narratives where small repeated leaks accumulate into a complete service failure.
  • Unlike immediate-crash bugs (use-after-free leading to kernel panic), a leak is often slow-burning and can be exploited to create long-running outages that survive momentary pauses in attacker activity — the condition can either be sustained (while the attacker continues to trigger the leak) or persistent (the system remains impaired even after the attack ends because resources were irreversibly lost). This is the type of availability loss many operators find harder to diagnose because the system does not necessarily produce a single, obvious error event.

Exploitability and attacker model​

  • Public reporting does not include an active exploit or public proof-of-concept at the time of disclosure. That said, the code path in question is triggered by ACL rehashing operations; whether an unauthenticated remote attacker can reliably drive the vulnerable code depends on who can create or modify TC/ACL entries that cause the device to rehash. In many environments ACL programming requires administrative privileges or local control of the host’s networking stack. Some CVE records characterize the vector as network-accessible (AV:N) while other vendor entries treat it as local (AV:L); the inconsistent classification reflects ambiguity in practical exploit scenarios and differences in distribution scoring. Administrators should therefore assume potential remote impact in environments where network configuration APIs are accessible or where unprivileged users can influence offloaded ACLs.
  • The CVSS vectors published by different databases vary; NVD uses CVSSv3: AV:N/AC:H/PR:L/UI:N/S:U/C:L/I:L/A:H (medium, with network vector and high attack complexity), while some distributor advisories treat the issue as local. Because those distinctions matter operationally, teams must evaluate exposure in the context of their own provisioning, orchestration, and control-plane interfaces. Treat the discrepancy as an indicator that threat modeling is required for each deployment.
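To see how much the attack-vector disagreement moves the score, the CVSS v3.1 base-score equations can be applied to the vector quoted above. The sketch below assumes Scope: Unchanged only. The "local" variant is a hypothetical vector made by swapping AV:N for AV:L, not a vector published by any distributor.

```python
import math

# Metric weights from the CVSS v3.1 specification (PR values are for Scope: Unchanged).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value: float) -> float:
    """Round up to one decimal place using the integer method from the v3.1 spec."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    metrics = dict(part.split(":") for part in vector.split("/"))
    if metrics.get("S") != "U":
        raise ValueError("this sketch only handles Scope: Unchanged")
    w = {name: WEIGHTS[name][metrics[name]] for name in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


NVD_VECTOR = "AV:N/AC:H/PR:L/UI:N/S:U/C:L/I:L/A:H"   # as quoted in the text
LOCAL_VARIANT = NVD_VECTOR.replace("AV:N", "AV:L")    # hypothetical, for comparison only

print(base_score(NVD_VECTOR), base_score(LOCAL_VARIANT))  # prints "6.4 5.8"
```

Both results fall in the medium band, so the attack-vector choice matters more for exposure modeling than for the severity label.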

How the issue was fixed (and what the patches change)​

  • Upstream kernel commits adjust the rehash worker and teardown logic so that:
      • the code no longer assumes the backup chunk pointer is NULL in rollback sequences; it verifies pointer state before freeing or reusing objects;
      • the teardown path no longer destroys regions if migration failed (avoiding dangling references and use-after-free scenarios);
      • canceled rehash work frees any hints and auxiliary allocations associated with pending worker items, preventing leaks when work is canceled or rescheduled.
  • The fixes were small, surgical changes in the ACL TCAM logic and were landed in the kernel stable trees; multiple distributions rolled those commits into vendor kernel releases. Operators should rely on vendor-supplied kernels rather than attempting to cherry-pick or rebase patches themselves in production.

Detection, monitoring, and triage guidance​

What to watch for​

  • Kernel logs (dmesg / journalctl) showing KASAN or slab allocation messages referencing mlxsw or acl/tcam worker threads: lines containing kasan, slab-use-after-free, unreferenced object, or stack traces with mlxsw_sp_acl_tcam_vregion_rehash_work or mlxsw_sp_acl_tcam_region_destroy. Those messages mirror the diagnostic traces published with the CVE and are immediate indicators of a problem in the rehash path.
  • Gradual increase in kernel memory usage on hosts using the mlxsw driver, accompanied by degraded packet processing, higher CPU usage from kworker threads, or out-of-memory (OOM) killer invocations directed at userland processes (an indirect sign the kernel is under memory pressure). A slow-but-steady memory trendline correlated with network configuration churn is an operational red flag.
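The log indicators listed above lend themselves to a simple scan. The following is a minimal sketch that searches already-captured log text for those strings; the function name is hypothetical, and the indicator list is taken from the text rather than from any vendor detection rule.

```python
import re

# Substrings named in the text as indicators; matching is case-insensitive.
INDICATORS = (
    "kasan",
    "slab-use-after-free",
    "unreferenced object",
    "mlxsw_sp_acl_tcam_vregion_rehash_work",
    "mlxsw_sp_acl_tcam_region_destroy",
)
PATTERN = re.compile("|".join(re.escape(s) for s in INDICATORS), re.IGNORECASE)


def find_indicators(log_text: str):
    """Return (line_number, line) pairs for log lines containing any indicator."""
    return [
        (number, line)
        for number, line in enumerate(log_text.splitlines(), start=1)
        if PATTERN.search(line)
    ]
```

The function takes text rather than running a command, so it can be fed the saved output of dmesg or journalctl -k from any host. Generic hits such as kasan are not specific to mlxsw and should be read together with the surrounding stack trace.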

Practical triage steps​

  • Check dmesg and journalctl for KASAN/worker traces that reference mlxsw or spectrum ACL functions. If present, collect a full kernel log and open a support case with your distribution/vendor.
  • Correlate kernel memory usage trends (free, slabinfo, and top output) against recent ACL or tc configuration changes; look for kworker threads with sustained CPU and memory allocations.
  • If you must stop an ongoing leak immediately and cannot apply a vendor kernel update right away, consider isolating the affected host from production traffic, restricting management plane access, or unloading the mlxsw module if your deployment permits it — but be aware that unloading mlxsw can disrupt switching/forwarding and may not be permitted on all platforms. Any such mitigations are environment-specific and should be validated against vendor guidance before application.
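For the memory-trend correlation step, one low-effort approach is to sample the slab counters in /proc/meminfo over time. The sketch below assumes that leaked driver objects show up in the Slab and SUnreclaim fields; the function names and the first-to-last growth estimate are illustrative choices, not a vendor-recommended procedure.

```python
def parse_meminfo(text: str, fields=("Slab", "SUnreclaim")) -> dict:
    """Extract selected counters (in kB) from the text of /proc/meminfo."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        if name in fields and rest.split():
            values[name] = int(rest.split()[0])
    return values


def growth_kb_per_hour(samples) -> float:
    """Estimate growth from (unix_seconds, kB) samples using the first and last points."""
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("samples must span a positive time interval")
    return (v1 - v0) * 3600.0 / (t1 - t0)
```

A caller would read /proc/meminfo on a schedule, append (time.time(), parse_meminfo(text)["SUnreclaim"]) to a list, and compare the growth rate against periods of ACL or tc configuration churn. A steadily positive rate that tracks configuration changes is the pattern described above; a single high reading is not.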

Patching and operational mitigations​

Primary remediation: install vendor kernel updates​

  • Vendor packages and advisories confirm fixes were included in stable kernel updates and in distribution-specific advisories. Examples include Amazon Linux ALAS advisories, Ubuntu USNs, Red Hat security updates, SUSE advisories, and stable kernel tree commits that landed the fixes. Operators should install the vendor-provided kernel update that corresponds to their distribution and kernel line, and reboot hosts in a maintenance window to move to the patched kernel.
  • Where a distribution provides a fixed kernel package, rely on that package (for example, Red Hat’s RHSA updates, Ubuntu kernel security updates, Amazon Linux security errata). RHEL packaging guidance published via Snyk and LWN identifies fixed build versions for RHEL 9 kernels; consult your vendor advisory for exact package names and minimum fixed versions before upgrading.

Temporary mitigations (if patches cannot be applied immediately)​

  • Restrict the network management and control plane: limit who can program ACLs and offloaded TCAM rules. If only privileged administrators can alter ACLs today, enforce least privilege and limit access to the network management plane to reduce the chance an attacker can trigger repeated rehashes. This is a risk-reduction measure pending patching, not a replacement for updates.
  • Disable or limit hardware ACL offload in use-cases where the mlxsw driver is not essential to data-plane performance. This may be practical for some server roles but disruptive for switch devices; exercise caution and test before rolling out. Vendor documentation should be consulted for safe driver/module configuration.
  • If the system is in a lab or test environment and you can safely unload the mlxsw kernel module, that will stop the problematic code path. In production, unloading mlxsw is likely to break networking and is therefore only a stopgap and emergency action. Document any such interventions for follow-up.

Vendor and community context​

  • Upstream kernel commits for the related fixes were authored and landed in April–May 2024; distribution vendors moved the fixes into stable kernel releases and advisories over the following weeks. Multiple related CVE identifiers (CVE-2024-35852, CVE-2024-35853, CVE-2024-35854, CVE-2024-35855) were published because the rehash code exhibited several distinct but related memory and synchronization defects; the collection of fixes together closes the broader class of rehash teardown and activity‑update races.
  • Community and security-forum analyses echoed the kernel diagnostics and the fix rationale: posts summarizing the use-after-free and leak patterns in the mlxsw ACL code appeared in public Linux security threads as part of standard post-patch discussion. Those community write-ups are useful for engineers who want an operational account of triggering conditions and log evidence when triaging incidents.

Prioritized checklist for sysadmins and SRE teams​

  • Inventory: locate all hosts with Mellanox/NVIDIA Spectrum adapters or appliances that use the mlxsw driver, and identify their kernel version and distribution kernel package. Use your asset management system and package-manager queries.
  • Patch: apply vendor kernel updates that include the mlxsw ACL fixes; schedule reboot windows as required. Prioritize edge routers, bare-metal NIC hosts, network appliances, and any system that actively programs TCAM/ACL offload rules.
  • Monitor: enable and centralize collection of kernel logs; create alerts for KASAN messages, slab/alloc warnings, and worker-thread traces mentioning mlxsw, as well as for gradual kernel memory growth trends.
  • Harden: restrict access to interfaces that can program ACLs and TC filters; require authentication and least privilege for network management APIs. Consider network segmentation for management-plane traffic.
  • Validate: after patching, run configuration-change exercises that trigger rehash work under controlled load and monitor for any residual warnings in the kernel logs. If problems persist, open a vendor support case with logs and reproducible steps.
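The inventory step in the checklist can be partly automated per host. This is a minimal sketch that reports the running kernel release and any loaded modules whose names start with mlxsw, based on the text of /proc/modules. The function names are hypothetical, and the check will not detect a driver that is built into the kernel rather than loaded as a module.

```python
import platform


def loaded_mlxsw_modules(proc_modules_text: str):
    """Return sorted names of loaded modules that start with 'mlxsw'."""
    names = []
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields and fields[0].startswith("mlxsw"):
            names.append(fields[0])
    return sorted(names)


def host_report(proc_modules_text: str) -> dict:
    """Combine the running kernel release with the mlxsw module list."""
    return {
        "kernel_release": platform.release(),
        "mlxsw_modules": loaded_mlxsw_modules(proc_modules_text),
    }
```

The kernel release still has to be compared against the fixed package versions in your vendor's advisory; this sketch only gathers the inputs for that comparison.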

Risks, open questions, and guidance for risk-tolerant environments​

  • The CVE disclosures and vendor advisories do not show evidence of active exploitation in the wild at the time of publication, and no public proof-of-concept code has been widely circulated. That reduces near-term urgency for threat containment compared with an actively exploited RCE, but it does not eliminate the operational risk: a resource-exhaustion DoS against core network hosts is a credible attack vector for an opportunistic adversary. Monitor your threat feeds and treat this as a higher-priority availability patch for network-facing infrastructure.
  • There is some inconsistency in CVSS vector assignments among vendors and databases (network vs local vector differences). This inconsistency reflects legitimate uncertainty about whether the vulnerable code path is reachable without local privilege in every deployment. Therefore, security teams must make a local, environment-specific determination of exposure: if network APIs, orchestration tools, or multi-tenant control planes can program ACLs or otherwise cause rehash activity, treat the host as exposed. If only privileged administrators can change filters and management access is tightly controlled, the exposure is different but not zero.
  • If your environment depends on vendor devices where kernel updates are not available or are delayed (for example, embedded appliances or vendor-provided firmware images), engage your vendor for a remediation timeline or support workaround; do not attempt to run an unsupported kernel on production appliances without explicit vendor guidance.

Conclusion​

CVE-2024-35853 is a pragmatic, real-world reminder that asynchronous background maintenance in kernel drivers, especially code that migrates state between hardware-offloaded regions, can rest on fragile assumptions and produce surprising resource leaks. The flaw is not a dramatic remote-code-execution, worm-style risk, but it is a concrete availability hazard for hosts that use mlxsw Spectrum ACL offload. Kernel developers pushed targeted fixes that correct pointer assumptions, free orphaned hints, and harden teardown and rollback paths. Operators should apply vendor kernel updates without delay, monitor kernel logs for KASAN/slab traces and kernel memory trends, and limit who can program ACLs in their environment while updates are rolled out. The combination of patching, detection, and management-plane controls will close the gap between discovery and operational mitigation.

Source: MSRC Security Update Guide - Microsoft Security Response Center
 
