Ubuntu Extends Lead in Zen‑5 CPU Tests: Ryzen 9950X vs 9950X3D on Linux and Windows

Phoronix’s side‑by‑side testing of the AMD Ryzen 9 9950X and the Ryzen 9 9950X3D across Windows 11 25H2 and Ubuntu Linux delivers a clear, actionable result: for heavy, CPU‑bound creator workloads on Zen‑5 silicon, a modern Linux kernel and toolchain still tend to extract more throughput from this hardware than a stock Windows 11 25H2 installation, while running Ubuntu inside WSL2 on Windows carries a measurable penalty versus bare‑metal Linux.

[Infographic: Linux vs. Windows benchmark comparison highlighting the 9950X CPUs and cross-platform render stats]

Background

The test series compares two closely related 16‑core Zen‑5 desktop parts: the AMD Ryzen 9 9950X (non‑3D) and the Ryzen 9 9950X3D, which adds 3D V‑Cache to increase last‑level cache capacity for workloads that are sensitive to latency and cache residency. The hardware platform was kept identical across runs: an ASRock X870E Taichi motherboard, 2 × 16 GB DDR5‑6000 memory, AMD Radeon RX 9070 GPU present (but not the bottleneck for CPU tests), and a 1 TB Crucial T705 NVMe SSD. Tests were executed on Windows 11 25H2 with updates and on Ubuntu — both Ubuntu 24.04.3 LTS (with SRUs applied) and a development snapshot of Ubuntu 25.10 running the Linux 6.17 series kernel to capture leading‑edge kernel behavior.
Phoronix intentionally focused the suite on CPU‑heavy, cross‑platform workloads rather than gaming, choosing tasks that reveal scheduler, frequency scaling, and compiler/toolchain effects: Blender CPU renders, multi‑threaded encoders, compression, denoisers, scientific/synthetic kernels, and other producer‑oriented workloads. Where possible, the same binaries or comparable cross‑platform builds were used to minimize build‑time bias. Clean installs with default OS power/performance settings were used to reflect the experience an enthusiast or creator would encounter out of the box.

What the numbers show — headline takeaways​

  • Ubuntu 25.10 daily snapshots often outperformed Windows 11 25H2 across this test set. On the geometric mean of the cross‑platform CPU workloads, Phoronix reported roughly a 15% lead for the modern Ubuntu snapshot in this specific test profile (a minimal sketch of how such a geometric‑mean composite is computed follows this list).
  • Windows 11 25H2 produced no meaningful throughput improvement over Windows 11 24H2 in these same runs; the 25H2 build was being distributed as an enablement package, not a kernel rewrite, which explains why it didn’t materially change scheduler or core‑scaling behavior. The lack of a large OS‑side uplift aligns with Microsoft’s published delivery model for 25H2.
  • Running Ubuntu inside WSL2 on Windows 11 25H2 is convenient but not free: the tested Ubuntu 24.04 image under WSL2 delivered about 87% of the throughput of a native Ubuntu 24.04.3 LTS install on the same hardware (roughly a 13% composite penalty), with the gap widening substantially for I/O‑heavy workflows.
  • The chip‑level difference between the 9950X and 9950X3D is workload‑dependent: 3D V‑Cache provides clear advantages for cache‑sensitive workloads and certain game/latency scenarios, but across long‑running, heavily multi‑threaded rendering and encoding jobs the OS scheduler, kernel, and toolchain behavior often dominate the observed wall‑clock time gains. In the Phoronix snapshot, OS selection (and kernel/toolchain recency) frequently produced larger deltas than the chip variant alone.
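To make the composite figure concrete, the sketch below shows how a geometric mean over per‑test speedups is typically computed. The ratios are illustrative placeholders, not Phoronix’s actual per‑test data.

```python
# Minimal sketch: computing a geometric-mean composite over per-test speedups.
# The ratios below are illustrative placeholders, not Phoronix's measured data.
from math import prod

# Per-test ratio of Ubuntu result to Windows result (>1.0 means Ubuntu faster).
illustrative_ratios = [1.22, 1.08, 1.31, 1.05, 1.12]

geomean = prod(illustrative_ratios) ** (1.0 / len(illustrative_ratios))
print(f"Composite advantage: {(geomean - 1.0) * 100:.1f}%")
```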

Why Ubuntu edged Windows on Zen‑5 in these runs​

Kernel and scheduler timeliness​

Linux distributions — particularly development snapshots like Ubuntu 25.10 — pick up new upstream kernels well before comparable low‑level changes land in Windows. Those kernels (6.16/6.17 in the Ubuntu 25.10 snapshots) include scheduler and power‑management improvements that scale Zen‑5 cores more effectively under high‑thread‑count loads. That matters for long, sustained jobs, where thread placement, core affinity, and frequency behavior compound into measurable time savings.
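Before comparing runs, it is worth confirming exactly which kernel and CPU frequency governor a Linux testbed is using, since those details drive the behavior described above. A minimal sketch, assuming the standard procfs/sysfs paths on a bare‑metal Linux install:

```python
# Minimal sketch: confirm the running kernel and CPU frequency governor on a
# Linux testbed before benchmarking. These are the usual procfs/sysfs paths;
# they may be absent on some systems (e.g., inside containers).
from pathlib import Path

def read_first_line(path: str) -> str:
    p = Path(path)
    return p.read_text().strip() if p.exists() else "unavailable"

print("Kernel:  ", read_first_line("/proc/version"))
print("Governor:", read_first_line("/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor"))
```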

Toolchain and compiler effects​

Where code is sensitive to vectorization and code generation, the compiler used to build the workload matters. Ubuntu 25.10 snapshots were shipping newer toolchains (GCC 15 family, newer glibc builds in some test configurations) that can produce tighter code for modern microarchitectures. When cross‑platform binaries are not identical, Linux builds in these snapshots sometimes delivered better vectorized throughput in the tested renderers and encoders.

Leaner defaults for heavy compute​

Out‑of‑the‑box Linux installs typically run fewer background services and telemetry processes than default consumer Windows installs. For CPU‑bound workloads that run for minutes or hours, fewer background tasks translate into more predictable CPU availability and less noise in measurements. Phoronix intentionally used stock settings to reflect the experience most users will encounter, and in that real‑world posture Linux’s lean defaults gave it an edge in aggregate throughput for many tests.

Ryzen 9 9950X vs 9950X3D — where 3D V‑Cache helps and where it doesn’t​

The architectural case for 3D V‑Cache​

The 9950X3D adds stacked L3 capacity to increase cache residency for hot data sets. In workloads where working sets fit or partly fit into a much larger last‑level cache — certain game engines, database hotspots, and some latency‑sensitive kernels — the 3D V‑Cache variant can produce meaningful gains through fewer cache misses and reduced main‑memory traffic.
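For readers who want to see the cache‑residency effect on their own machine, a rough illustration is to sweep the working‑set size and watch effective bandwidth fall once the data no longer fits in the last‑level cache. The sketch below assumes NumPy is installed and only shows the qualitative trend; a proper cache microbenchmark would use a compiled language and pointer‑chasing access patterns.

```python
# Illustrative sketch (assumes NumPy is installed): sweep working-set size and
# report effective read bandwidth. When the array no longer fits in last-level
# cache, throughput drops -- the behavior a larger L3 (3D V-Cache) targets.
import time
import numpy as np

for mib in (4, 16, 64, 256, 1024):
    data = np.ones(mib * 1024 * 1024 // 8, dtype=np.float64)  # mib MiB of float64
    data.sum()                          # warm-up pass to populate caches
    reps = 10
    start = time.perf_counter()
    for _ in range(reps):
        data.sum()                      # repeatedly stream the working set
    elapsed = time.perf_counter() - start
    gib_per_s = (mib / 1024) * reps / elapsed
    print(f"{mib:5d} MiB working set: {gib_per_s:6.1f} GiB/s effective read bandwidth")
```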

Observed behavior in Phoronix’s runs​

Phoronix’s cross‑platform suite shows mixed results for the CPU variant difference because the suite intentionally stresses both cache‑sensitive and throughput‑sensitive patterns. In many long, highly parallel rendering and batch encoder jobs, the overall throughput advantage tended to track OS and kernel/toolchain differences more than the additional cache alone. In short: the 9950X3D is not a universal win for every CPU job; its benefits are contingent on whether the workload is L3‑bound and whether the OS and compiler make efficient use of the cache improvements.

Practical illustration (non‑numeric summary)​

  • Cache‑bound, low‑latency kernels and some game engines: 9950X3D likely benefits, sometimes substantially.
  • Very high‑thread‑count, sustained throughput renderers/encoders: OS scheduler, power policy, and toolchain may determine more of the result than the extra L3 alone.
Because Phoronix’s reporting focused on a broad cross‑section of producer workloads, the net effect of the 3D cache in aggregate was smaller than the net OS effect for this testbed. Readers should inspect per‑test charts for specific workloads they care about; the per‑test variance is where the real decision signal lives.

WSL2: convenience vs. native throughput​

The bottom line​

If your daily workflow requires a Linux environment but you’re constrained to Windows, WSL2 is an excellent productivity tool — but it is not equivalent to bare‑metal Linux for throughput‑sensitive production workloads. Phoronix found WSL2 delivered about 87% of native Linux throughput on this Ryzen testbed; the performance delta is workload‑dependent and particularly pronounced for filesystem‑heavy tasks.

Why WSL2 loses ground​

  • Filesystem and I/O arbitration: cross‑filesystem access (Windows files accessed from the Linux guest) incurs extra cost compared to native ext4 root performance; database workloads and heavy file‑I/O patterns suffered the most (a small timing sketch follows this list).
  • Interposition by host services: Windows Defender, antivirus and integration layers can add latency to file operations and process interactions when using mounted Windows volumes.
  • VM boundary overhead: WSL2 runs a lightweight utility VM; while CPU compute can approach native performance in many pure compute kernels, mixed IO/CPU workloads reveal the VM boundary costs.
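A quick way to see the filesystem penalty on a given machine is to time the same small‑file workload on the distro’s native ext4 filesystem and on a mounted Windows volume from inside WSL2. The sketch below assumes the default /mnt/c mount and an existing C:\Temp directory; adjust paths for your setup.

```python
# Minimal sketch (run inside a WSL2 shell): time small-file writes on the
# distro's native ext4 filesystem vs. a mounted Windows volume. The /mnt/c/Temp
# path assumes the default mount and an existing C:\Temp directory.
import os
import time
import tempfile

def time_small_writes(base_dir: str, count: int = 2000) -> float:
    with tempfile.TemporaryDirectory(dir=base_dir) as tmp:
        start = time.perf_counter()
        for i in range(count):
            with open(os.path.join(tmp, f"f{i}.txt"), "w") as fh:
                fh.write("x" * 256)
        return time.perf_counter() - start

for label, base in (("native ext4 (~)", os.path.expanduser("~")),
                    ("Windows mount (/mnt/c/Temp)", "/mnt/c/Temp")):
    if os.path.isdir(base):
        print(f"{label}: {time_small_writes(base):.2f} s for 2000 small files")
    else:
        print(f"{label}: path not found, skipping")
```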

Practical WSL2 mitigations​

  • Keep builds, repositories and heavy working sets inside the WSL2 VM’s native filesystem (ext4) instead of on mounted Windows drives.
  • If company policy allows, exclude WSL2 directories from real‑time AV scanning; otherwise expect measurable I/O slowdowns.
  • For CI or nightly large builds, prefer native Linux runners or dedicated Linux VMs rather than relying on developer workstations.

Strengths, caveats, and risk analysis​

Strengths of Phoronix’s approach​

  • Transparent, reproducible methodology using OpenBenchmarking logs and clean installs gives readers the trail to reproduce the runs.
  • Focus on cross‑platform, CPU‑heavy workloads is the correct lens to expose scheduler and toolchain effects on modern multi‑core silicon.
  • The use of both LTS and leading‑edge Ubuntu snapshots (24.04.3 and 25.10 with Linux 6.17) shows the range of outcomes users may see across distribution cadences.

Important caveats​

  • These are first‑look snapshot results: Ubuntu 25.10 was still a development stream and Windows 25H2 was a preview enablement package at test time. Final shipping releases, driver updates, BIOS microcode, and firmware revisions can change the balance. Treat the numbers as directional, not definitive.
  • Benchmarks are workload specific: a 5–15% geomean advantage for Linux in one test mix doesn’t mean Linux will be faster for every single application or user. Single‑threaded or Windows‑native drivers/apps can reverse the trend for specific tasks.
  • Vendor drivers and proprietary stacks matter: where workflows rely on proprietary GPU drivers, specialized accelerators, or vendor‑certified toolchains, Windows may remain the safer path despite raw throughput differences.

Risks for adopters​

  • Enterprises with vendor‑certified pipelines may be unable to switch OSes even if raw performance favors Linux. Certification, management and security policy constraints add real friction.
  • Small driver or microcode updates can flip individual test results; procurement and capacity planning should rely on repeated, reproducible tests rather than single snapshots.

Practical recommendations — how to decide and what to test​

If you manage hardware or build workstations, follow a short, pragmatic decision path:
  • Identify representative workloads (renderers, encoders, compilers) and create a reproducible test script that exercises the same data and build options across OSes (see the harness sketch after this list).
  • Run matched hardware tests: same CPU model (9950X vs 9950X3D), same memory, same SSD, same GPU presence. Capture BIOS/UEFI, microcode, kernel/driver versions, and compiler/toolchain versions.
  • For developer convenience: measure WSL2 runs and native Linux runs for the same workloads; if you rely on heavy IO, plan CI on Linux runners.
  • If you’re evaluating the 9950X3D specifically: include cache‑sensitive microbenchmarks and game engine tests in addition to long render and encoder runs to measure where the 3D cache yields the largest gains.
  • Re‑test after firmware/driver updates and before large procurement decisions — small updates can materially alter results.
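A minimal harness along these lines might look like the sketch below; the Blender command and output filename are placeholders for whatever representative workload you choose, and real runs should also pin down BIOS, microcode, and driver versions as noted above.

```python
# Minimal harness sketch: run a fixed workload command several times and record
# wall-clock times plus basic environment metadata. The command and output path
# below are placeholders -- substitute your own representative workload.
import json
import platform
import subprocess
import time

WORKLOAD = ["blender", "-b", "scene.blend", "-f", "1"]   # placeholder command
RUNS = 5

results = {"platform": platform.platform(),
           "python": platform.python_version(),
           "command": WORKLOAD,
           "wall_times_s": []}

for _ in range(RUNS):
    start = time.perf_counter()
    subprocess.run(WORKLOAD, check=True, capture_output=True)
    results["wall_times_s"].append(round(time.perf_counter() - start, 3))

with open("run_log.json", "w") as fh:
    json.dump(results, fh, indent=2)
print("Median wall time:", sorted(results["wall_times_s"])[RUNS // 2], "s")
```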

Tuning checklist — practical steps to narrow gaps or optimize throughput​

  • Place build artifacts on native Linux filesystems (ext4/XFS) rather than mounted Windows volumes.
  • Use the most recent, stable Linux kernel available for your distro if throughput is critical; test a maintenance channel and a leading‑edge kernel to understand sensitivity.
  • Match compiler flags between platforms where possible (e.g., enable identical vectorization and optimization levels), and record the exact toolchain versions used (see the sketch after this list).
  • For Windows: disable unnecessary background services for benchmarking and confirm virtualization‑based security settings’ impact on throughput.
  • Keep vendor GPU drivers and firmware updated; run tests before and after driver updates to measure impact.
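To help with matching flags and attributing differences, it is useful to log the exact compiler and OS/kernel versions alongside each run. A small sketch, assuming gcc or clang is on the PATH (substitute whichever compiler is actually used on Windows):

```python
# Sketch: record the compiler and kernel/OS versions used for a run so results
# can be compared apples-to-apples across operating systems. Assumes gcc or
# clang is on PATH; adjust for the compiler actually used on each platform.
import platform
import shutil
import subprocess

compiler = shutil.which("gcc") or shutil.which("clang")
if compiler:
    version_line = subprocess.run([compiler, "--version"], capture_output=True,
                                  text=True).stdout.splitlines()[0]
    print("Compiler: ", version_line)
else:
    print("Compiler:  gcc/clang not found on PATH")
print("Kernel/OS:", platform.platform())
```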

Final analysis and conclusion​

Phoronix’s cross‑platform snapshot on a matched Ryzen 9 9950X/9950X3D testbed is a useful, reproducible data point for workstation builders and creators: modern Ubuntu snapshots with newer kernels and toolchains continue to show measurable advantages in heavily parallel, CPU‑bound workloads on Zen‑5 silicon, and the convenience of WSL2 involves a non‑trivial throughput trade‑off for I/O‑heavy tasks. The impact of 3D V‑Cache remains workload‑dependent: it shines in cache‑bound, latency‑sensitive scenarios but can be less decisive when OS‑level scheduling and compiler differences dominate long‑running multi‑threaded throughput.
For readers making real‑world choices: treat these results as an invitation to test, not as a final verdict. Reproduce the small set of representative workloads you actually use, capture raw logs and toolchain versions, and base procurement or OS migration decisions on reproducible measurements run in your environment. Small percentage gains compound over time in rendering farms and CI fleets; they are worth validating empirically. And when convenience (Windows + WSL2) is essential, architect your pipelines so that heavy I/O and production CI run on native Linux where throughput matters most.
Caveat: the Phoronix runs were a first‑look snapshot using preview or development builds of the operating systems and therefore reflect a point‑in‑time measurement. Final release builds, driver updates, BIOS/microcode revisions, or alternative workload mixes may produce different results; those outcomes should be validated within each organization’s operating envelope.

In short: if maximum CPU throughput for parallel producer workloads is your top priority, pilot a modern Linux kernel (Ubuntu 25.10 or a tuned LTS kernel) on matching hardware and compare real runs; if compatibility with Windows‑only tools or certified stacks is required, Windows 11 remains the pragmatic choice, but expect that, for Zen‑5 multi‑threaded workloads, Linux’s current kernel and toolchain cadence often translates into measurable wall‑clock savings.

Source: Phoronix, "AMD Ryzen 9 9950X vs. 9950X3D On Windows 11 & Ubuntu Linux"
 
