Linux Leads CPU Throughput in Phoronix Tests vs Windows 25H2 on Ryzen 9950X

Early independent testing shows that Windows 11 version 25H2 delivers no measurable CPU throughput gains over 24H2, while modern Linux snapshots continue to outperform Windows on multi‑threaded creator workloads. On Phoronix's Ryzen 9 9950X testbed, Ubuntu 25.10 averaged roughly a 15% advantage across 41 CPU‑focused benchmarks, and running Ubuntu under WSL2 on Windows 11 carried a measurable ~13% composite penalty versus native Linux.

Background / Overview

Windows 11 25H2 is being distributed as an enablement package (eKB) that flips features already delivered via monthly cumulative updates to enabled state on systems running 24H2. That delivery model intentionally avoids a full binary rebase and therefore is not expected to produce sweeping kernel‑ or scheduler‑level changes out of the box.
Independent benchmarking that focused on CPU‑bound creator workloads — renderers, encoders, denoisers and other multi‑threaded jobs — compared Windows 11 25H2 (preview builds) with Windows 11 24H2 and Ubuntu variants (24.04.3 LTS and 25.10 daily snapshots). The headline results: Windows 25H2 essentially matched 24H2, and Ubuntu snapshots frequently led the suite, often by double‑digit percentages on geometric‑mean composites.
This article dissects those findings, explains why the deltas appear, evaluates methodology and practical implications for enthusiasts, creators, and administrators, and offers an actionable testing and migration checklist for teams that depend on raw throughput.

What the tests actually measured

Test hardware and software profile

  • CPU: AMD Ryzen 9 9950X, a Zen 5‑class desktop part with 16 cores / 32 threads and a maximum boost of roughly 5.7 GHz at stock settings.
  • Memory and storage: High‑end DDR5 and PCIe Gen5 NVMe media were used to avoid I/O bottlenecks in CPU‑heavy tests. The test articles list 32 GB of DDR5 and a Crucial T705 NVMe drive as the platform's memory and storage.
  • OS builds: Windows 11 25H2 preview / release‑preview builds, Windows 11 24H2, Ubuntu 25.10 daily snapshots (Linux 6.16/6.17 series kernels depending on the snapshot), and Ubuntu 24.04.3 LTS.
  • Workload focus: 41 cross‑platform CPU‑bound tests including Blender CPU renders, LuxCoreRender, Embree, OSPRay, Intel OIDN, IndigoBench, various encoders and compressors — deliberately chosen to emphasize scheduler, frequency scaling, and toolchain effects rather than gaming or GPU‑bound scenarios.

Key numerical takeaways

  • Ubuntu 25.10 (a development snapshot in these runs) achieved roughly a 15% lead over Windows 11 25H2 on the geometric mean for the selected CPU workloads on this Ryzen testbed.
  • Windows 11 25H2 showed effectively no net throughput improvement compared with Windows 11 24H2 across the measured suite; the enablement package approach explains the lack of a system‑wide uplift.
  • Running Ubuntu inside WSL2 on Windows 11 25H2 delivered roughly 87% of native Ubuntu’s throughput in the same hardware context — a ~13% composite penalty, with I/O‑heavy workloads showing the largest gaps.

Why Linux led in these CPU workloads

1) Kernel and scheduler recency

Development snapshots of Ubuntu ship newer upstream kernels well before comparable low‑level changes become visible in Windows. Kernel revisions in the Linux 6.16/6.17 timeframe included scheduler and power‑management improvements that can improve scaling on high‑core‑count Zen 5 silicon, especially for sustained multi‑threaded jobs where thread placement and frequency behavior compound over long runtimes.
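Thread placement can also be constrained from user space when isolating variables between runs. Below is a minimal sketch, assuming a Linux host with Python 3 (os.sched_setaffinity is Linux‑only), that pins a process to a fixed set of logical CPUs so repeated runs see the same core topology:

```python
import os

# Linux-only: the set of logical CPUs this process may currently run on.
allowed = sorted(os.sched_getaffinity(0))
print(f"schedulable CPUs: {allowed}")

# Pin this process (and any threads it spawns) to the first eight logical
# CPUs. On an SMT part like the 9950X, logical CPUs 0-7 are not guaranteed
# to be eight distinct physical cores; check the sysfs topology nodes first.
os.sched_setaffinity(0, allowed[:8])
print(f"pinned to: {sorted(os.sched_getaffinity(0))}")
```

Pinning does not make the two schedulers comparable by itself, but it reduces placement variance when you are trying to attribute a delta to some other variable.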

2) Toolchain and build differences

Many creator workloads are sensitive to compiler vectorization and code generation. Current Linux toolchains (GCC/Clang) can produce binaries with different optimization profiles than Windows builds compiled with MSVC or linked against different runtimes. Phoronix used native cross‑platform builds where possible, yet compiler differences still sometimes shifted hot‑path performance in Linux's favor.

3) Leaner default userland and lower noise

Out‑of‑the‑box Linux installs (especially daily snapshots or performance‑oriented distros) typically run fewer background services and telemetry processes compared with a stock consumer Windows image. For long‑running CPU jobs, that reduced “system noise” results in more predictable CPU availability and measurable throughput gains.

4) Filesystem/IO stack and WSL2 architecture effects

WSL2 runs a Linux environment inside a lightweight utility VM and uses a different I/O surface than native ext4 roots. The largest WSL2 penalties were I/O related: large compile jobs, database operations and file‑heavy build steps suffered most when hosted in WSL2 using Windows‑side mounts. That’s why the composite WSL2 result was ~87% of native in these particular runs.
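The mount‑path effect is easy to measure on your own machine. Below is a minimal sketch, assuming it runs inside a WSL2 Ubuntu shell with Python 3; the home directory and the /mnt/c/Temp Windows‑side mount are illustrative paths, not prescriptions:

```python
import os
import tempfile
import time

def time_small_files(base_dir: str, count: int = 500, size: int = 4096) -> float:
    """Create, write, and delete `count` small files under base_dir; return seconds."""
    payload = b"x" * size
    with tempfile.TemporaryDirectory(dir=base_dir) as tmp:
        start = time.perf_counter()
        for i in range(count):
            path = os.path.join(tmp, f"f{i}")
            with open(path, "wb") as f:
                f.write(payload)
            os.remove(path)
        return time.perf_counter() - start

# Native ext4 inside the WSL2 VM vs. a Windows-side mount (example paths).
for base in (os.path.expanduser("~"), "/mnt/c/Temp"):
    print(f"{base}: {time_small_files(base):.2f}s")
```

On typical WSL2 installs the Windows‑side mount goes through a 9P file server, so small‑file churn like this tends to show the gap most clearly.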

What these numbers do — and don't — prove

Strengths of the testing approach

  • The test authors used identical hardware across runs and prioritized native binaries where possible to reduce toolchain bias. That makes observed deltas meaningful for the stated scenario: out‑of‑the‑box CPU throughput for creator workloads.
  • Geometric means across many long‑running jobs reduce the impact of single‑test noise and provide a fairer aggregate of sustained throughput differences (a minimal computation sketch follows this list).
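For readers reproducing the aggregation, here is a minimal sketch of how a geometric‑mean composite over per‑test performance ratios is typically computed; the ratio values below are invented placeholders, not Phoronix's data:

```python
import math

def geomean(values):
    """Geometric mean; the usual way to average performance ratios."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

# Hypothetical per-test throughput ratios (Linux / Windows); >1.0 means Linux led.
ratios = [1.22, 1.08, 0.97, 1.31, 1.15]
composite = geomean(ratios)
print(f"composite: {composite:.3f} (~{(composite - 1) * 100:.0f}% advantage)")
```

Because the geometric mean multiplies rather than adds, one outlier test cannot dominate the composite the way it would in an arithmetic mean.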

Important caveats and limitations

  • These results are workload‑specific. The tested suite emphasized CPU‑bound producer tasks; gaming and many GPU‑accelerated professional apps were deliberately excluded and frequently favor Windows due to DirectX and finer vendor GPU driver tuning. Extrapolating the CPU result to “overall platform superiority” is incorrect.
  • Preview and snapshot builds shift: the Windows 25H2 numbers were gathered on preview / release‑preview builds and Ubuntu numbers came from daily snapshots. Final GA builds, ship‑level driver updates, or vendor firmware microcode revisions can change marginal outcomes later. Treat the numbers as a snapshot, not a permanent verdict.
  • Toolchain parity is imperfect: where identical cross‑platform binaries are unavailable, build differences can amplify or suppress platform advantages. The best cross‑platform comparisons use identical artifacts where practical.

The practical implications

For creators and content‑production pipelines

  • If raw multi‑threaded throughput is the priority and your toolchain and software stack are supported, Linux (or a performance‑oriented distro) can reduce wall‑clock times on batch rendering and encoding jobs. The ~5–15% turnaround improvements reported in these runs can directly translate to hours saved on long projects.
  • For Windows‑only production apps (certain Adobe workflows, proprietary capture/codec toolchains), Windows remains the required platform. In such hybrid setups, one pragmatic pattern is to keep Windows workstations for interactive work and offload batch rendering/compile farms to Linux servers.

For developers and sysadmins

  • WSL2 is extremely convenient for dev workflows and “good enough” for local testing, but it’s not a drop‑in replacement for native Linux for I/O‑heavy production builds. Expect a measurable penalty (Phoronix’s runs put it at ~13% on the composite) and test actual workloads before adopting WSL2 for high‑throughput pipelines.
  • 25H2’s enablement‑package model makes the update low‑risk from an operational perspective, but administrators should still inventory legacy tooling such as PowerShell 2.0 and WMIC before broad rollout because 25H2 removes or deprecates certain legacy pieces.

For gamers

  • These CPU‑focused tests do not contradict the prevailing ecosystem reality: Windows remains the better gaming platform because of DirectX, Proton’s maturity constraints, and vendor driver prioritization on Windows. GPU and gaming tests are a separate battleground.

Deep dive: Windows 11 LTSC and the "clean Windows" hypothesis

A recurring forum claim is that a “clean” Windows install such as Windows 11 IoT LTSC (or other LTSC variants) will outperform consumer Windows because it trims bloat and background services. Practical reality is nuanced:
  • Reducing background services and telemetry can reclaim CPU and memory headroom on constrained systems, and LTSC images can be slightly friendlier on low‑resource machines or when storage is slow.
  • LTSC shares the same kernel and scheduler behavior as retail Windows builds, so the deeper scheduling and core‑scaling differences seen against Linux are unlikely to be resolved merely by removing consumer features. In short: LTSC can help in specific constrained scenarios, but it is not a silver bullet for the multi‑threaded throughput deltas measured here.
Forum commentary captured the practical tradeoff bluntly: LTSC may be faster for a few targeted jobs or in RAM/storage‑constrained cases, but it suffers the same scheduling characteristics and support tradeoffs as broader Windows variants, and it is not a universal performance fix. That view aligns with the independent testing outcomes: the significant deltas stem from kernel/scheduler/toolchain and I/O stack differences — not merely userland background services.

Testing and migration checklist (practical steps)

  • Build a representative test set: identify 5–10 real jobs that best represent your workload (e.g., a full Blender render, your largest compile target, a full video transcode).
  • Lock hardware and firmware: record BIOS/UEFI versions, microcode, and driver versions, and run identical firmware and driver stacks across OS runs wherever possible.
  • Use identical input artifacts: where you can, run the same binary builds on each platform (static cross‑compiled builds or containerized artifacts) rather than comparing differently compiled artifacts.
  • Run multiple iterations: execute each job 5–10 times, discard outliers, and use the geometric mean for aggregated throughput numbers (a minimal harness sketch follows this list).
  • Document power/performance settings: default OS profiles are valid for "out‑of‑the‑box" comparisons, but also track tuned runs (power‑plan changes, disabled services) against practical deployment targets.
  • Compare WSL2 cautiously: if WSL2 is attractive for developer convenience, test native Linux and WSL2 side by side on your own build patterns, since WSL2's I/O model can be the largest variable.
  • Pilot and stage: for any platform switch, run a small pilot group that exercises real workflows under production loads before wholesale migration.
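The harness referenced in the iteration step might look like the following minimal sketch, assuming Python 3.9+ and a workload you can launch from the command line; the Blender invocation is a placeholder for your own job:

```python
import statistics
import subprocess
import time

def run_once(cmd: list[str]) -> float:
    """Run one iteration of the workload and return wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True, capture_output=True)
    return time.perf_counter() - start

# Placeholder workload: substitute your real render/encode/compile command.
cmd = ["blender", "-b", "scene.blend", "-f", "1"]
times = sorted(run_once(cmd) for _ in range(7))
trimmed = times[1:-1]  # drop the fastest and slowest runs as outliers
print(f"median {statistics.median(trimmed):.1f}s, "
      f"geomean {statistics.geometric_mean(trimmed):.1f}s")
```

Run the same script, unchanged, on each OS image so the measurement overhead cancels out of the comparison.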

Risks, unknowns, and unverifiable claims

  • Any claim that a single OS is “always faster” is wrong: performance depends heavily on workload composition, compiler choices, driver versions, kernel revisions and firmware. The Phoronix snapshot is consistent and repeatable within its stated methodology, but it is still a specific snapshot in time.
  • Results using preview builds and daily snapshots can shift as final releases and vendor driver updates arrive; small percentage points can move after GA drivers and microcode updates. Treat the numbers as directional, not immutable.
  • Forum anecdotes about LTSC or personalized “clean Windows” rigs are useful as hypothesis generators but do not substitute for controlled comparative testing. When claims are not accompanied by hardware/firmware/driver detail they should be treated with caution.

Verdict and practical recommendations

  • For most end users: there is no practical reason to rush to 25H2 purely for speed; it is an enablement and manageability update rather than a performance rework. Test for compatibility and policy impacts first.
  • For creators with heavy, CPU‑bound batch workloads: consider piloting Linux for render/back‑end servers — the measured 5–15% throughput gains in these tests can be economically meaningful at scale. If migration is impractical, consider hybrid architectures that offload batch jobs to Linux while keeping Windows for interactive tasks.
  • For developers in Windows‑centric environments: WSL2 is excellent for convenience and most development tasks but expect measurable penalties on I/O‑heavy builds — test your real workloads and consider a remote native Linux builder for heavy CI jobs.
  • For IT managers: treat 25H2 as a low‑risk enablement update, but use the rollout window to clean up deprecated tooling (PowerShell 2.0, WMIC) and validate agent/management compatibility before mass deployment.

Closing assessment

The Phoronix snapshot is consistent with a broader trend observed in recent years: when workloads are heavily multi‑threaded and CPU‑bound, modern Linux kernels plus current toolchains often extract more throughput from high‑core‑count silicon than an out‑of‑the‑box Windows 11 image. That gap is not a single‑issue indictment of Windows; it’s a composite result of kernel scheduler policy, toolchain optimization, filesystem/I/O behavior and default userland noise. Windows 11 25H2’s engineering choice to ship as an enablement package made the outcome predictable: good for operational stability and fast rollout, not a performance revolution.
Administrators, developers, and content teams who care about throughput should validate real workloads on representative hardware and consider hybrid deployments: keep Windows where it’s functionally necessary and use Linux where raw batch throughput is a priority. For the enthusiast bench‑pressing every minute of render time, the evidence says: test, measure, and choose the environment that matches your workload — the numbers in this round favor Linux for sustained CPU work, but a measured migration plan is the prudent path forward.

Source: [H]ard|Forum https://hardforum.com/threads/new-w...-benchmarks-no-games.2044242/post-1046212541/
 

Early independent benchmarking has a clear headline: Windows 11 version 25H2, delivered as an enablement package, does not materially improve raw CPU throughput versus 24H2 — and in a focused, no‑games comparison on high‑core Zen‑5 hardware, a modern Ubuntu snapshot running recent Linux kernels outperformed Windows by a measurable margin in sustained, multi‑threaded creator workloads.

Background / Overview

Microsoft released Windows 11, version 25H2 as an enablement package that flips features already present (but dormant) on the 24H2 servicing branch to enabled state. That delivery model prioritizes quick, low‑risk rollouts rather than sweeping kernel rewrites or scheduler overhauls. The result: functional parity with 24H2 in most runtime behaviors unless deep changes were already shipped via cumulative updates. Microsoft’s documentation and release notes make this mechanism explicit.
Independent testing by a long‑running benchmarking outlet compared clean installs of:
  • Windows 11 24H2 (baseline),
  • Windows 11 25H2 (preview/release‑preview),
  • Ubuntu 24.04.3 LTS (baseline Linux),
  • Ubuntu 25.10 daily snapshots (development Linux, using Linux 6.16/6.17 kernels).
Hardware for the headline runs was a high‑end Ryzen 9 testbed (16 cores / 32 threads), DDR5 memory, and PCIe Gen5 NVMe storage to avoid I/O bottlenecks. The test suite deliberately focused on CPU‑bound, multi‑threaded creator workloads — Blender CPU renders, LuxCoreRender, Embree, OSPRay, Intel Open Image Denoise, IndigoBench, encoders, compressors and related sustained throughput kernels — rather than gaming or GPU‑bound tests.

What the numbers actually show

  • Out of 41 cross‑platform CPU‑focused tests, Windows 11 captured only a handful of first‑place finishes while Ubuntu variants took the majority of wins.
  • On the geometric mean across the selected test set, the Ubuntu 25.10 development snapshot was reported to be roughly 15% faster than Windows 11 25H2 on that Ryzen 9 testbed. This is a composite, workload‑dependent figure, not a promise of universal or single‑threaded gains.
  • Windows 11 25H2 provided no meaningful throughput improvement over Windows 11 24H2 in these tests — a direct consequence of the enablement package approach.
  • Running Ubuntu under WSL2 on Windows 11 also incurred a non‑trivial penalty: the tested Ubuntu 24.04 image under WSL2 delivered about 87% of the throughput of native Ubuntu in the same hardware context (roughly a 13% composite penalty) — with I/O‑heavy workflows showing larger gaps.

Read this carefully

Those headline percentages are specific to the exact hardware, OS builds, kernel versions, compilers, driver sets, and the exact chosen workloads. They are representative of this testing profile (long, sustained, multi‑threaded CPU jobs) and do not translate directly to interactive desktop latency, gaming, or Windows‑only professional applications.

Why Linux led in these runs — the technical picture

Several interacting factors explain why a modern Linux snapshot edged Windows in this specific profile:
  • Newer kernel and scheduler optimizations: Ubuntu 25.10 daily snapshots were shipping newer upstream kernels (Linux 6.16/6.17 series) and scheduler/power heuristics that can better exploit high core‑count Zen‑5 behavior for sustained throughput. Those kernel changes affect thread placement, frequency scaling, and cache affinity — all critical for long, parallel jobs.
  • Toolchain and compiler recency: Development snapshots often include newer GCC/Clang toolchains and library builds that produce more aggressive vectorization and codegen for modern ISAs. When cross‑platform workloads are built natively with these toolchains, throughput can improve without any OS scheduler magic.
  • Leaner default userland: A stock Linux desktop or server tends to run fewer persistent legacy services and less telemetry than a default Windows installation. For sustained, CPU‑bound workloads, the result is less “system noise” and more predictable core time.
  • Filesystem and I/O stack choices: For mixed CPU/I/O workloads, Linux offers a wider variety of throughput‑focused filesystem options and tunables which can matter in certain encoder/packaging jobs. The tested suite intentionally used high‑bandwidth storage to minimize I/O bias, but these subtleties still matter for some tests.
  • Binaries and ABI differences: Where identical cross‑platform binaries exist, the comparison is more direct. Where builds diverge (MSVC vs GCC/Clang, different runtime libraries), observed performance may reflect build choices rather than pure OS scheduling. Phoronix tried to use the same binaries where possible, but perfect parity is often impossible.

Methodology caveats and reproducibility

Benchmarks are useful signals, but they must be read with care. The following experimental variables are often load‑bearing:
  • Firmware and microcode versions (BIOS/UEFI settings, microcode updates).
  • Power management settings (Windows power plan, Linux CPU governor / schedutil vs performance).
  • Background services, telemetry, and virtualization features (e.g., Windows VBS/HVCI).
  • Driver versions (chipset, storage, CPU microcode) and vendor firmware.
  • Compiler and runtime versions used to build test binaries; even compiler flags matter.
  • Measurement technique: repeated runs, geometric mean vs arithmetic mean, and noise‑reduction strategies.
Good benchmarking practice for reproducibility:
  • Record BIOS/UEFI settings and microcode versions for all runs.
  • Use clean installs with identical driver sets where possible.
  • Run each test multiple times and use geomean or median to reduce outliers.
  • Prefer native cross‑platform binaries; if not available, document build flags.
  • Capture system telemetry and background process lists during runs (see the sketch below).
Phoronix’s coverage is explicit that these were “first look” runs designed to reflect out‑of‑the‑box behavior on a specific testbed; the authors warned that driver, firmware, or different OS tuning could change outcomes.
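To capture the background‑process and clock telemetry flagged in the list above, here is a minimal Linux‑side sketch that reads procfs directly (a Windows run would lean on tools such as Get-Process and powercfg instead):

```python
import pathlib
import time

def snapshot() -> dict:
    """One telemetry sample: load average, per-core MHz, and process names (Linux)."""
    load = pathlib.Path("/proc/loadavg").read_text().split()[:3]
    mhz = [line.split(":")[1].strip()
           for line in pathlib.Path("/proc/cpuinfo").read_text().splitlines()
           if line.startswith("cpu MHz")]
    procs = []
    for entry in pathlib.Path("/proc").iterdir():
        if entry.name.isdigit():
            try:
                procs.append((entry / "comm").read_text().strip())
            except OSError:
                pass  # the process exited between listing and reading
    return {"time": time.time(), "load": load, "mhz": mhz, "procs": sorted(procs)}

sample = snapshot()
print(f"load={sample['load']}, {len(sample['procs'])} processes, "
      f"cpu0={sample['mhz'][0]} MHz")
```

Sampling this once a minute alongside a long benchmark run makes it much easier to explain outlier iterations after the fact.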

Practical implications — who should care and what to do

For creators, studios, and CI/build farms

  • If your primary workload is long, CPU‑bound rendering or encoding jobs, the headline 10–15% turnaround improvement is material at scale: multiplied across thousands of render hours, it yields real cost and time savings.
  • Recommendation: run a short pilot on a representative subset of your pipeline using a native Linux image (same compilers, same binaries) and measure end‑to‑end throughput and operational fit before making platform decisions. If most of your rendering tools are cross‑platform and reproducible on Linux, a staged migration or hybrid model (Linux backends, Windows desktops) can be effective.

For developers who must stay on Windows

  • If corporate policy constrains desktop OS, consider using native Linux servers for heavy batch jobs, or evaluate whether moving builds into containerized Linux CI runners yields faster iteration.
  • If using WSL2 for convenience, be aware there is a measurable cost versus bare metal — Phoronix’s WSL2 runs delivered roughly 87% of native Ubuntu’s throughput in similar tests. For IO‑heavy builds or high throughput needs, native Linux still wins.

For gamers and Windows‑only professionals

  • These CPU‑bound creator tests do not overturn Windows’ advantages in gaming and many pro Windows‑only toolchains. GPU driver maturity, DirectX support, and vendor‑tuned optimizations still favor Windows for game and many graphics workloads. The testing explicitly excluded gaming.

For enterprise admins and IT

  • 25H2’s enablement package model simplifies deployment and reduces reboot windows: if you’re already on 24H2 and fully patched, installing 25H2 is a lightweight operation that flips features on. But do not expect a performance windfall from 25H2 alone. Test third‑party management agents, imaging, and scripts in a pilot ring before broad rollout.

For enthusiasts who asked about LTSC / ultra‑clean Windows images

  • Community commentary (Hard|Forum and similar threads) notes that LTSC / IoT / minimal Windows images can sometimes trim background services and squeeze extra performance in constrained scenarios (low RAM, slow storage), but they generally do not change core scheduling behavior — the underlying Windows scheduler and some default runtime policies remain the same. LTSC can help in specific edge cases, but it isn't a universal performance cure for high‑throughput creator workloads.

Critical analysis — strengths, risks, and what the numbers don't say

Notable strengths of the findings

  • The tests are timely and targeted: they reveal how kernel recency and toolchain choices can matter as much as silicon or single‑thread turbo speeds.
  • Using clean installs and out‑of‑the‑box defaults provides a realistic “what most users would see” baseline, valuable for operational decisions.
  • The inclusion of WSL2 comparisons is practically useful for teams who prefer Windows desktops but rely on Linux build tools.

Key risks and limitations

  • These tests reflect one class of workload (sustained CPU throughput) on one class of hardware (high‑core‑count Zen 5). They are not universal: different CPUs, especially hybrid architectures such as Intel's Alder Lake, have shown opposite results in past comparisons where scheduler differences favored Windows or required Linux fixes. Results are sensitive to architecture and kernel version.
  • The Ubuntu 25.10 snapshot used development kernels and toolchains; those are subject to change and not identical to LTS behavior. Development snapshots can improve performance but may also introduce instability or regressions. Treat the numbers as an informed signal, not a final verdict.
  • Cross‑platform parity is hard: binary differences and build flags can skew results. Where vendor‑tuned Windows binaries exist (or where drivers are proprietary and optimized for Windows), the balance can shift back.

Unverifiable or provisional claims (flagged)

  • Any claim framed as “Linux is X% faster in all cases” is not verifiable from these runs alone. The ~15% geomean is accurate to the reported Phoronix snapshot for that hardware and workload, but it should be treated as workload‑specific. If your production jobs differ, you need your own tests.
  • Community posts alleging “Windows scheduler is irredeemably broken” are opinionated and not proven universally; scheduler behavior varies strongly by CPU microarchitecture and kernel version, and Linux has historically had its own scheduler issues on hybrid designs. Use measured data from your workloads.

Actionable testing and tuning checklist (for reproducible comparison)

  • Inventory: record the exact CPU model, BIOS/UEFI version, microcode, RAM configuration, and storage model (a minimal capture sketch follows this checklist).
  • Baseline images: prepare clean installs of each OS with identical drivers for chipset/storage where possible.
  • Power policy: normalize power settings (Windows power plan; Linux CPU governor or tuned kernel cmdline).
  • Build parity: use the same compiler/runtime versions or document build flags; prefer official cross‑platform builds when available.
  • Repeatability: run each test 5–10 times, discard warm‑up anomalies, and report geomean and median.
  • Logging: collect temperature, clocks, and background process snapshots to help explain outliers.
  • WSL2 caveat: if using WSL2, test both native and WSL2 paths — don’t assume parity.
  • Long‑term runs: include long sustained runs (minutes to hours) to expose scheduling/thermal/power behavior.
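For the inventory step at the top of this checklist, a minimal capture sketch assuming a Linux host; the sysfs nodes shown exist on most modern distributions, but verify them on your own system before relying on the output:

```python
import pathlib
import platform

def read(path: str) -> str:
    """Return a sysfs/procfs value, or 'n/a' if the node is absent."""
    try:
        return pathlib.Path(path).read_text().strip()
    except OSError:
        return "n/a"

inventory = {
    "kernel": platform.release(),
    "machine": platform.machine(),
    "cpu_model": next((line.split(":")[1].strip()
                       for line in pathlib.Path("/proc/cpuinfo").read_text().splitlines()
                       if line.startswith("model name")), "n/a"),
    "bios_version": read("/sys/class/dmi/id/bios_version"),
    "board": read("/sys/class/dmi/id/board_name"),
    "cpu0_governor": read("/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor"),
}
for key, value in inventory.items():
    print(f"{key}: {value}")
```

Saving this output next to each result file gives every benchmark number the provenance the checklist asks for.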

Verdict — what this means for Windows enthusiasts and creators

These benchmarks underline a straightforward practical truth: operating system choice still matters for certain classes of workloads, and small system‑level differences (kernel version, scheduler tweaks, toolchain recency, and default services) can add up into a meaningful advantage for sustained, multi‑threaded jobs.
  • For most desktop users, everyday productivity, and gaming, Windows remains the pragmatic choice because of application and driver ecosystems.
  • For heavy batch creators, render farms, or CI systems where every percentage point of throughput scales into dollars and time, evaluating Linux as a native platform is worth the effort — but do the math against your toolchain and operational constraints.
  • For administrators, 25H2 is an operational win (smaller, faster enablement installs) — not a performance revolution. Plan rollouts around compatibility and manageability wins, not raw speed.

The Phoronix snapshot (and subsequent coverage) provides a practical, workload‑specific signal: 25H2 ≠ performance uplift. Run your own tests under controlled conditions if throughput matters to you; if you can afford to pilot Linux backends for heavy batch jobs, you'll likely reap measurable benefits, but don't assume those gains will automatically translate to every CPU or every workload.


Source: [H]ard|Forum https://hardforum.com/threads/new-w...inux-6-17-benchmarks-no-games.2044242/latest/
 
