A wave of reproducible reports and a parallel burst of misinformation have combined to create one of the more consequential Patch Tuesday headaches in recent memory: Microsoft’s August 12, 2025 cumulative for Windows 11 (KB5063878) has been linked to SSDs disappearing under sustained, heavy write workloads, and Phison — a major SSD controller designer — has publicly acknowledged industry‑wide effects while simultaneously denouncing a circulated falsified advisory and saying it is pursuing legal remedies.

Background​

Microsoft shipped the combined servicing stack and cumulative update identified as KB5063878 (OS Build 26100.4946) on August 12, 2025. The update shipped as part of Patch Tuesday and, being cumulative, also rolled up prior quality fixes. Microsoft’s official KB initially listed no known storage regressions, even as independent testers began reporting a clear failure fingerprint within days of the rollout.
Independent reproductions — starting with an enthusiast test bench and amplified across specialist outlets and forums — converged on a scenario that is narrowly specific, but potentially serious: sustained sequential writes in the tens of gigabytes (commonly mentioned around the ~50 GB mark) against drives that are moderately to heavily filled (commonly reported near or above ~50–60% capacity) can cause some NVMe SSDs to stop responding and disappear from Windows. In many cases the device reappears after a reboot; in a smaller, troubling subset the drive remains inaccessible or returns with corrupted filesystem metadata.
Phison acknowledged it had “recently been made aware of the industry‑wide effects” associated with KB5063878 (and the related KB5062660) and said it was investigating controller families “that may have been affected” while coordinating with partners. At the same time, a document purporting to be an internal Phison advisory — titled, variously, “Phison SSD Controller Issues Summary” — began circulating in distribution lists and on enthusiast channels. Phison called that document falsified and said it would take appropriate legal action.

What the community testing shows​

Reproducible fingerprint​

Multiple independent test benches reproduced a consistent sequence (a minimal write‑stress sketch appears after this list):
  • Install KB5063878 (or the related preview KB5062660).
  • Fill target SSD to a moderate level (community tests often cited ~50–60% used).
  • Execute a sustained, large sequential write — examples include cloning a drive image, installing a large game, or copying tens of gigabytes in a single pass.
  • At or near ~50 GB of continuous writes, the target drive may stop responding, disappear from File Explorer / Device Manager, and become unreachable by vendor utilities and SMART readers. Some drives return after reboot; others do not.
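For test benches with disposable hardware, the following Python sketch approximates the sustained sequential write used in those reproductions. The scratch path, total size, and block size are illustrative assumptions rather than the testers' exact methodology, and it should never be pointed at a drive holding data you care about.

```python
# Write-stress sketch approximating the community repro profile: one long,
# sequential write of incompressible data to a scratch file on the drive under
# test. TARGET, TOTAL_BYTES and BLOCK are illustrative assumptions.
# Run only on a disposable test drive whose data is already backed up.
import os
import time

TARGET = r"D:\scratch\stress_test.bin"   # hypothetical scratch path on the drive under test
TOTAL_BYTES = 60 * 1024**3               # ~60 GiB, past the ~50 GB mark cited in reports
BLOCK = 16 * 1024**2                     # 16 MiB per write() call

def sustained_write(path: str, total: int, block: int) -> None:
    buf = os.urandom(block)              # incompressible payload, reused for each block
    written, start = 0, time.time()
    with open(path, "wb", buffering=0) as f:
        while written < total:
            f.write(buf)
            written += block
            if written % (1024**3) == 0:                      # progress every 1 GiB
                rate = written / (time.time() - start) / 1024**2
                print(f"{written / 1024**3:5.1f} GiB written ({rate:.0f} MiB/s)")
        f.flush()
        os.fsync(f.fileno())             # force everything to the device before exiting

if __name__ == "__main__":
    sustained_write(TARGET, TOTAL_BYTES, BLOCK)
```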
These reproducible conditions are not a universal failure mode across all SSDs. The reports show clusters of failures around certain controller families and especially DRAM‑less designs that rely on Host Memory Buffer (HMB). But other controllers and branded drives have also been implicated in scattered reports, making single‑vendor attribution uncertain at this stage.

Heuristics — the “50 GB / 60%” rule​

Community posts converged on practical heuristics — roughly 50 GB of continuous writes and ~50–60% drive fill — as commonly reproducing the problem in test rigs. Those numbers are useful operational heuristics for triage, not definitive thresholds. They reflect patterns observed in controlled benches rather than deterministic failure criteria for every device in the wild. Treat them as risk indicators to guide mitigation and testing, not as immutable facts.

Technical anatomy — why an OS update can expose SSD fragility​

NVMe SSDs are tightly coupled systems in which the operating system, NVMe driver, PCIe bus, storage controller firmware, and NAND behavior all interact. Small host-side changes — especially those that adjust memory allocation, command ordering, or timing — can surface latent controller firmware edge cases.
Key technical hypotheses under investigation:
  • Host Memory Buffer (HMB) timing/semantics: DRAM‑less drives use HMB to borrow system RAM for mapping tables. If the update changed how Windows allocates, initializes, or tears down HMB, it could expose race conditions or lifetime assumptions in firmware, causing a controller lockup during heavy mapping-table updates.
  • Sustained sequential‑write pressure: Long, continuous writes stress SLC caches, garbage collection, and metadata updates. A change in how the host issues flushes or orders NVMe commands could create a command cadence that some firmware revisions were not designed to handle. Reports of unreadable SMART telemetry after incidents support a controller-level hang or lockup.
  • Platform/BIOS/driver permutations: Reproducibility varies with firmware revision, motherboard UEFI, NVMe driver version, and drive fill state. The same model can behave differently on different motherboards or with different BIOS versions. This multiplicity of variables complicates straightforward attribution.
These mechanisms are consistent with prior host-release incidents where small changes in OS behavior revealed firmware timing assumptions. Conclusive root cause requires telemetry correlation between Microsoft and multiple SSD vendors. That telemetry exchange — not community speculation — is what will validate whether the OS, controller firmware, or a combination is the principal cause.

Vendor and platform responses​

Microsoft​

Microsoft’s official KB for KB5063878 confirms the update’s release on August 12, 2025. The KB initially listed “no known issues” with the update, but Microsoft has engaged partners and said it is investigating customer reports raised via Feedback Hub and vendor channels. Separately, Microsoft moved quickly to address a WSUS delivery error (0x80240069) that affected enterprise deployments of the August cumulative. That WSUS problem has been resolved by Microsoft.
Microsoft routinely uses a mix of telemetry analysis, Known Issue Rollback (KIR), and targeted mitigations when a platform change causes field-impacting behavior. In past cross‑stack incidents, Microsoft has issued KIR or micro‑patches to limit the blast radius while vendors prepare firmware updates. That remains a plausible path here if forensic data points to host behavior as a root contributor.

Phison and SSD vendors​

Phison publicly acknowledged it was investigating “industry‑wide effects” tied to KB5063878 and KB5062660 and said it had engaged industry stakeholders to review potentially affected controllers. The company emphasized partner coordination and promised to provide advisories and remediation through SSD vendors, which is the typical channel for controller firmware distribution.
SSD vendors must test controller firmware against branded SKUs before releasing updates because factory configurations, BOM variations, and vendor utilities can affect a firmware’s safety. Expect vendor‑specific firmware patches to roll out through support dashboards and utilities (e.g., Corsair iCUE, SanDisk Dashboard). Phison’s statement underscores that firmware fixes, when required, will be delivered via partners rather than direct end‑user downloads from Phison.

Independent outlets and corroboration​

Multiple reputable outlets — Tom’s Hardware, BleepingComputer, Windows Central, TechRadar, and others — independently reported the failure fingerprint and the ongoing vendor investigation. Their hands‑on reproductions and community‑collated data sets are the primary early evidentiary basis for the incident. The broad, independent reporting helps validate that this is an industry‑level signal rather than an isolated campaign of noise.

The falsified document and Phison’s legal posture​

A document widely circulated on partner lists and social channels purported to be an internal Phison advisory naming specific controller families and asserting a definitive conclusion: that Phison controllers had “specific and significant issues” and that users should expect permanent data loss. Phison disowned that document, calling it falsified and saying it was neither an official nor unofficial company communication. Phison stated it is taking “appropriate legal action” against those distributing the forged advisory.
Critical points about the falsified‑document episode:
  • Why it matters: A forged advisory that claims exclusive responsibility by a single vendor can cause premature and damaging market reactions — RMAs, mass returns, reputational loss, and misguided engineering responses. The document’s alarmist framing risked diverting partner triage resources and escalating panic.
  • Legal claims vs. verifiable filings: Phison’s public statement asserts legal action; however, independent confirmation of specific court filings, cease‑and‑desist recipients, or named defendants was not publicly available at the time of reporting. Treat the company’s legal posture as a stated intent unless and until formal filings or court notices are published.
  • Possible motives: While speculation has run from opportunistic competitor leaks to malicious disinformation actors, the provenance and motive behind the forged document remain unproven. Forensic tracing of the document’s origin would be required to substantiate claims about who authored or distributed it.
This episode highlights a secondary harm vector in technical incidents: misinformation. Accurate, timely vendor communication is essential not only to drive technical remediation but also to preempt rumor and sabotage that can materially worsen commercial and operational fallout.

Practical guidance for users and IT teams​

The balance of evidence supports a conservative, risk‑first approach while vendors and Microsoft conduct a coordinated investigation and release validated mitigations.
Immediate actions (prioritized):
  • Back up irreplaceable data now to independent media or cloud. Backups are the single most reliable defense against mid‑write corruption and potential permanent loss.
  • If you have not yet installed KB5063878 and your systems perform large sequential writes or use DRAM‑less or Phison‑equipped SSDs, delay deployment until vendors or Microsoft publish guidance. Enterprise update rings should hold the patch for a pilot group and perform targeted validation.
  • Avoid sustained large single‑shot writes (> ~50 GB) on systems that already installed the update; split large transfers into smaller chunks where feasible (see the chunked‑copy sketch after this list). This is a practical risk-minimizing step until firmware or host mitigations appear.
  • Inventory SSD models and firmware across your fleet. Prioritize drives that are heavily used and those with DRAM‑less designs for testing. Check vendor dashboards for firmware advisories before attempting updates.
  • If a drive disappears during a write: stop writing, do not initialize or reformat the volume, collect Event Viewer and vendor diagnostics, create a bit‑level forensic image if possible, and contact vendor support. Capturing logs and telemetry is critical for any potential RMA or forensic recovery.
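As a rough illustration of the "split large transfers" advice above, the sketch below copies a directory tree file by file and pauses after every few gibibytes of writes. The paths, batch size, and pause length are illustrative assumptions, and a single file larger than the batch size will still be written in one pass.

```python
# Chunked-copy sketch: copy a directory tree file by file, pausing after a few
# gibibytes of writes so no single burst approaches the ~50 GB range that
# community tests associate with the failure. Paths, the 8 GiB batch size and
# the pause length are illustrative assumptions, not validated thresholds.
import shutil
import time
from pathlib import Path

SRC = Path(r"D:\games\big_title")        # hypothetical source folder
DST = Path(r"E:\games\big_title")        # hypothetical destination on the patched system
BATCH_BYTES = 8 * 1024**3                # pause after roughly 8 GiB of writes
PAUSE_SECONDS = 60                       # give the drive time to flush and settle

def chunked_copy(src: Path, dst: Path) -> None:
    written_since_pause = 0
    for path in sorted(src.rglob("*")):
        if not path.is_file():
            continue
        target = dst / path.relative_to(src)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(path, target)       # note: a single huge file is still one pass
        written_since_pause += path.stat().st_size
        if written_since_pause >= BATCH_BYTES:
            print(f"Pausing {PAUSE_SECONDS}s after {written_since_pause / 1024**3:.1f} GiB")
            time.sleep(PAUSE_SECONDS)
            written_since_pause = 0

if __name__ == "__main__":
    chunked_copy(SRC, DST)
```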
Technical workarounds and caveats:
  • Some advanced community workarounds (e.g., registry tweaks to limit HMB) have been used historically but carry performance penalties and risk; a hedged sketch follows this list. These should only be considered by experienced administrators with full backups and rollback plans.
  • Do not apply vendor firmware blindly. Only flash firmware that is explicitly validated for your drive part number and vendor SKU, and always ensure backups first. Firmware updates can introduce other regressions if not properly matched to SKU and BOM.
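For context on the HMB workaround mentioned above, here is a minimal sketch assuming the registry location most often cited in community write‑ups of earlier Windows 11 24H2 HMB incidents (the stornvme HostMemoryBufferBytes value). The key name and its behavior are an assumption here, not vendor guidance; note the original value before changing anything, and apply it only with full backups and a rollback plan.

```python
# Sketch of the community HMB registry workaround, for experienced admins only.
# The key/value names (stornvme\Parameters\Device, HostMemoryBufferBytes) follow
# community documentation from earlier Windows 11 24H2 HMB incidents and are an
# ASSUMPTION here, not vendor guidance. Setting the value to 0 disables HMB and
# costs performance on DRAM-less drives. Requires an elevated prompt; a reboot
# is needed for changes to take effect.
import winreg

KEY_PATH = r"SYSTEM\CurrentControlSet\Services\stornvme\Parameters\Device"
VALUE_NAME = "HostMemoryBufferBytes"

def read_hmb_limit():
    """Return the current HMB byte limit, or None if the value/key is absent."""
    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, KEY_PATH) as key:
            value, _type = winreg.QueryValueEx(key, VALUE_NAME)
            return value
    except FileNotFoundError:
        return None

def set_hmb_limit(limit_bytes: int) -> None:
    """Set the HMB byte limit (0 disables HMB). Record the old value first."""
    with winreg.CreateKeyEx(winreg.HKEY_LOCAL_MACHINE, KEY_PATH, 0,
                            winreg.KEY_SET_VALUE) as key:
        winreg.SetValueEx(key, VALUE_NAME, 0, winreg.REG_DWORD, limit_bytes)

if __name__ == "__main__":
    print("Current HostMemoryBufferBytes:", read_hmb_limit())
    # set_hmb_limit(0)   # uncomment deliberately; 0 disables HMB until reverted
```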

Industry implications and lessons​

This incident is a practical reminder that modern storage reliability is a co‑engineered outcome. The OS, NVMe drivers, controller firmware, motherboard firmware, and NAND characteristics must be stress‑tested as a system.
Strategic improvements the ecosystem should consider:
  • Expanded pre‑release stress testing matrices that include sustained sequential writes, high fill factors, HMB allocation scenarios, and combinations of vendor firmware revisions and BIOS versions.
  • A standardized minimal telemetry set that allows Microsoft and SSD vendors to rapidly correlate failure signals without exposing customer data; this would accelerate root‑cause analysis and reduce time‑to‑remediation.
  • Faster vendor advisories that publish confirmed affected firmware IDs rather than relying on noisy community lists, reducing rumor-driven panic and enabling organized mitigation.
  • Clear incident communication channels between platform vendors, controller IC suppliers, and SSD integrators so that firmware patches can be validated and distributed with minimal delay.
The falsified document complication underscores the reputational risk when communications are delayed or ambiguous. Transparent, authoritative advisories mitigate both technical and misinformation harm.

Critical analysis — strengths, weaknesses, and risks​

Strengths in the response so far:
  • Measured vendor posture: Phison’s public acknowledgement avoided premature attribution and emphasized partner coordination and telemetry-driven forensic work. That measured stance reduces the risk of hasty, wrong‑headed remediation steps.
  • Rapid community reproducibility: The fact that multiple independent benches reproduced a consistent fingerprint is a diagnostic positive — it provides actionable test vectors for vendors and Microsoft.
Weaknesses and open risks:
  • Communication gaps: Early public messaging did not enumerate affected firmware IDs or a verified model list. That vacuum allowed noisy community lists to circulate and created the opportunity for fake advisories to be persuasive.
  • Misinformation and legal uncertainty: Phison’s claim of legal action is a serious step intended to deter bad actors, but independent verification of formal filings or outcomes is not yet public. Legal steps may deter circulation but will not fix technical root causes.
  • Operational exposure window: Firmware updates must be SKU‑specific and validated. That process can take time and leaves fleets exposed in the interim, particularly enterprises that prioritize immediate rollouts. Microsoft mitigations (KIR, targeted fixes) may help but require correct attribution.
Overall, the pragmatic posture for organizations is conservative: prioritize backups, stage updates in pilot rings, and wait for vendor‑validated firmware or Microsoft mitigations before broad deployment.

Conclusion​

The August 12 cumulative for Windows 11 (KB5063878) has produced a narrow but credible set of failure fingerprints under heavy, sustained write workloads. Independent test benches and specialist outlets have reproduced a repeatable scenario — the disappearing drive during large writes — and Phison has acknowledged industry‑wide effects while simultaneously denouncing a falsified advisory that misattributes exclusive blame. The technical evidence points toward a host‑to‑controller interaction that will likely require coordinated telemetry analysis, vendor firmware fixes, and possibly platform mitigations.
In the immediate term the defensible, non‑speculative posture is clear and simple: back up critical data, delay or stage the update for at‑risk systems, avoid sustained large writes on patched systems, and follow only vendor‑published firmware advisories. Misinformation — including forged advisories — compounds harm and must be treated with skepticism until validated by vendor statements or legal filings. The path out of this incident is the usual, sober mix of forensic telemetry, coordinated vendor engineering, and cautious operational discipline.

Source: GIGAZINE Fake documents purporting to be from storage manufacturer Phison are spreading about Windows 11 update issues that cause SSD problems, and Phison is taking legal action
 
Silicon Motion’s brief forum reply claiming that “none of our controllers are affected” landed as a hopeful note amid a widening industry investigation into Windows 11’s August 2025 cumulative update (KB5063878) — but the evidence available to date shows a complex, cross‑stack problem that cannot be resolved by a single forum statement and still requires coordinated telemetry, SKU‑level firmware checks, and conservative user action.

Background / Overview​

Microsoft shipped the combined Servicing Stack Update and Latest Cumulative Update for Windows 11 (version 24H2), identified as KB5063878 (OS Build 26100.4946), on August 12, 2025. The package was intended to roll up security fixes and quality improvements, but within days community researchers and specialist outlets began reproducing a repeatable storage regression: during sustained large writes (commonly cited near ~50 GB), some NVMe SSDs briefly disappear from Windows or permanently fail to remount, with a real risk of file corruption on in‑flight writes.
Independent testing and aggregated user reports show this is not a trivial UI glitch. The symptom profile — a target NVMe becoming inaccessible to Device Manager and SMART utilities mid‑write and sometimes remaining unreadable after reboot — strongly suggests a controller‑level hang or a host‑to‑controller interaction that pushes firmware into an unrecoverable state. Multiple specialized outlets and community test benches corroborated this fingerprint.

What users and labs are reporting​

  • Symptom set: While copying large files or installing big games, the target SSD becomes unresponsive and may “vanish” from Windows. SMART and controller identify data can become unreadable. Reboot sometimes restores the drive, but in a minority of cases the drive remains inaccessible or shows corrupted data.
  • Trigger profile: Community reproductions commonly point to sustained sequential writes on the order of ~50 GB or more, especially when the target drive is moderately to heavily filled (commonly noted ~50–60% used). This makes large game installs, archive extractions, and cloning operations high‑risk triggers.
  • Distribution: Early collations over‑represented Phison controller families and some DRAM‑less designs, which accelerated vendor attention; however, isolated reproductions also included drives using other controller vendors. That distribution suggests a host/OS change interacting with common firmware implementation patterns rather than a single vendor’s universal fault.
  • Anecdotes vs telemetry: Microsoft’s public KB for KB5063878 initially listed no known issues for storage, while Microsoft representatives later said the company was “aware of” reports and asked affected customers to file telemetry via Feedback Hub — a sign that Microsoft’s internal telemetry had not yet matched the concentrated reports from enthusiast labs.

Timeline (condensed)​

  • August 12, 2025 — Microsoft releases KB5063878 for Windows 11 24H2.
  • Within 48–72 hours — hobbyist testers and independent outlets reproduce a storage regression during sustained large writes; community lists of implicated models circulate.
  • Mid‑August — Phison confirms it is investigating possible “industry‑wide effects” and coordinates with partners; Microsoft requests telemetry and investigates. (bleepingcomputer.com, wccftech.com)
  • August 18–22, 2025 — specialist outlets aggregate findings, warn users to avoid large writes and back up data while firmware and KB mitigations are developed. (tomshardware.com, itpro.com)

Technical anatomy — how this likely happens​

Modern NVMe SSDs are tightly coupled systems composed of the host OS kernel and NVMe driver, the PCIe link, the NVMe controller firmware, NAND media, and optional DRAM or Host Memory Buffer (HMB). Small changes in host behavior — buffer allocation, command ordering, DMA timing, or HMB allocation semantics — can expose latent firmware edge cases under heavy sustained I/O.
Three leading, non‑exclusive hypotheses supported by community testing and vendor commentary:
  • Controller firmware hang under sustained stress: Large sequential writes stress internal mapping tables, SLC cache, and metadata updates. A subtle change in the host’s timing or buffer semantics can push certain controller firmware state machines into an unrecoverable lock, resulting in the controller becoming unreachable at the PCIe/NVMe level. SMART and NVMe identify commands then fail to return data, which registers as the device “vanishing” from Windows.
  • Host Memory Buffer (HMB) interactions with DRAM‑less designs: DRAM‑less controllers use HMB to cache mapping metadata in host RAM. If Windows changes how HMB is allocated or synchronized under the new update, DRAM‑less controllers may be exposed to timing or allocation races that cause corruption or hangs. Past Windows 11 24H2 rollouts had HMB‑related fragility on select models, making this a plausible vector.
  • OS storage‑stack buffer / driver regression: Kernel-level changes in buffered IO or NVMe driver semantics could cause host‑side timeouts or malformed command sequences that certain controller firmwares do not tolerate gracefully. This scenario would explain why multiple controller families can be implicated in isolated tests.
These hypotheses explain why reproductions depend on a precise combination of firmware revision, drive fill level, sustained‑write profile, motherboard/UEFI settings, and Windows build. Until vendors and Microsoft publish cross‑validated telemetry, the cause remains an interaction problem rather than a simple single‑vendor defect.

Vendor responses so far​

  • Phison: publicly acknowledged it is investigating “industry‑wide effects” related to KB5063878/KB5062660 and is cooperating with partners; Phison also took legal action after a falsified document began circulating claiming the problem was unique to their controllers. (wccftech.com, tomshardware.com)
  • Microsoft: asked customers to submit Feedback Hub reports and Support tickets, stating it was “aware of” reports and working with storage partners to reproduce the issue; Microsoft’s KB initially did not mark the update as having a storage-related known issue while investigations continued. (bleepingcomputer.com, support.microsoft.com)
  • Silicon Motion: a TechPowerUp forum thread relays a reply attributed to Silicon Motion stating that “none of our controllers are affected.” That reply appears to be a direct forum response and — as of early reporting — is not backed by a formal press release or support advisory on Silicon Motion’s official channels; readers should treat that single forum claim as provisional until Silicon Motion releases a formal advisory or vendor‑specific firmware confirmations are published.

The Silicon Motion claim — scrutiny and caveats​

The statement circulating in TechPowerUp and echoed in forum collations — that “none of our controllers are affected” — is meaningful to users who actually have Silicon Motion‑based SSDs. However, several important caveats apply:
  • Single reply vs audited telemetry: The quoted claim appears in a community forum post and has not (at the time of writing) been corroborated by a Silicon Motion press release or a support bulletin listing validated firmware IDs and SKU‑level status. A forum reply can be accurate, but it lacks the forensic backing of telemetry‑driven vendor advisories.
  • SKU and firmware nuance: NVMe controllers are integrated into branded SSDs with vendor firmware and PCB-level differences. A controller family may behave differently across SKUs and firmware revisions; vendor advisories typically reference specific firmware versions and branded module SKUs because a controller chip alone is not a warranty of behavior across all integrations. That nuance is why vendors are cautious when issuing public statements.
  • Ongoing investigation: The overall incident is a cross‑stack investigation involving Microsoft and multiple controller vendors. Even if Silicon Motion’s internal testing has found no problematic firmware in their current tested builds, that does not conclusively clear all Silicon Motion‑based modules in the field until manufacturers publish validated lists. Treat the claim as a positive indicator but not definitive proof of universal immunity.
In short: Silicon Motion’s reported forum reply is encouraging for owners of Silicon Motion‑based drives, but prudent users should verify firmware/part numbers with their SSD vendor and continue following conservative mitigations until formal advisories and tested firmware images are published.

Practical guidance for users (immediate actions)​

  • Back up now — verify backups. Prioritize irreplaceable data and maintain at least one offline copy. Data integrity is the single most important defense.
  • Avoid large continuous writes on systems that have installed KB5063878 (or KB5062660 if present). Postpone big game installs, cloning, archive extraction, and bulk file transfers until vendor guidance arrives.
  • If you must transfer large data, split transfers into smaller batches (under ~50 GB where possible) and monitor the drive. Several community reproductions used ~50 GB as a practical trigger threshold.
  • Identify your drive’s controller and firmware: use CrystalDiskInfo, HWInfo, or your vendor’s dashboard tool to capture model, serial, controller family, and firmware build (see the inventory sketch after this list). Keep a screenshot or text export for support cases.
  • Check your SSD vendor’s support page and vendor dashboard for firmware advisories; apply vendor‑recommended firmware updates only after backing up and following the vendor’s instructions (do not attempt firmware flashing without power stability and a tested procedure).
  • If you hit the failure: stop further writes, collect Event Viewer logs and Device Manager screenshots, and submit them to Microsoft via Feedback Hub and to your SSD vendor support. Preserve forensic artifacts where possible.
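A small inventory sketch along the following lines can capture the basics from PowerShell’s Get-PhysicalDisk output and save them for a support case. Note that it records model, serial, and firmware but not the controller family, which usually requires a vendor utility or a spec‑sheet lookup; the output file name is an arbitrary choice.

```python
# Inventory sketch: capture model, serial number and firmware version for every
# physical disk by calling PowerShell's Get-PhysicalDisk, then save the result
# as a JSON export for support cases. Controller family is NOT exposed here;
# use the vendor dashboard or the product spec sheet for that.
import json
import subprocess
from datetime import datetime

PS_COMMAND = (
    "Get-PhysicalDisk | "
    "Select-Object FriendlyName, SerialNumber, FirmwareVersion, MediaType, BusType, Size | "
    "ConvertTo-Json"
)

def collect_disk_inventory() -> list:
    result = subprocess.run(
        ["powershell", "-NoProfile", "-Command", PS_COMMAND],
        capture_output=True, text=True, check=True,
    )
    data = json.loads(result.stdout)
    return data if isinstance(data, list) else [data]   # single disk -> one dict

if __name__ == "__main__":
    inventory = collect_disk_inventory()
    out_file = f"disk_inventory_{datetime.now():%Y%m%d_%H%M%S}.json"
    with open(out_file, "w", encoding="utf-8") as f:
        json.dump(inventory, f, indent=2)
    for disk in inventory:
        print(disk.get("FriendlyName"), disk.get("FirmwareVersion"), disk.get("SerialNumber"))
    print("Saved", out_file)
```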

How to identify if your SSD uses a Silicon Motion controller​

  • Open HWInfo or CrystalDiskInfo and select the NVMe device entry. Look for a line labeled Controller or Controller ID; it can list Silicon Motion, Phison, InnoGrit, Marvell, or other vendors.
  • Use vendor utilities (e.g., Corsair iCUE, Samsung Magician, WD Dashboard) when available — those tools report firmware versions and often supply an option to export a diagnostic log.
  • If the controller line is ambiguous, note the SSD model and serial and check the manufacturer’s support page or product spec PDF — they frequently list the controller family used in the SKU.

Enterprise and IT admin guidance​

  • Hold KB5063878 deployments in broader rings until vendor and Microsoft advisories clarify the affected SKUs and remediation path. Use WSUS, Intune, or your endpoint management platform to control rollouts (a quick per‑endpoint check is sketched after this list).
  • For affected endpoints, prioritize forensic preservation: capture Event Viewer logs, Reliability Monitor snapshots, storage vendor logs, and disk images when possible. Avoid reformatting or initializing disks before vendor inspection.
  • Implement a conservative mitigation posture: pause large writes on endpoints that have the update and include storage‑intensive tasks in pre‑deployment test plans for future Windows updates.
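For pilot-ring spot checks, a minimal sketch like the one below can confirm whether the cumulative is present on an endpoint via Get-HotFix. It is a triage aid, not a substitute for WSUS/Intune compliance reporting, and some update types do not surface through Get-HotFix, so cross-check against Windows Update history.

```python
# Quick triage sketch: report whether the August cumulative (KB5063878) is
# installed on the local machine via PowerShell's Get-HotFix. Intended as a
# spot check for pilot-ring machines, not a replacement for WSUS/Intune
# reporting; cross-check Settings > Windows Update > Update history as well.
import subprocess

KB_ID = "KB5063878"

def kb_installed(kb_id: str) -> bool:
    result = subprocess.run(
        ["powershell", "-NoProfile", "-Command",
         f"Get-HotFix -Id {kb_id} -ErrorAction SilentlyContinue | "
         "Select-Object -ExpandProperty HotFixID"],
        capture_output=True, text=True,
    )
    return kb_id in result.stdout

if __name__ == "__main__":
    status = "installed" if kb_installed(KB_ID) else "not installed"
    print(f"{KB_ID} is {status} on this endpoint")
```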

Strengths in the industry response — and risks​

Notable strengths:
  • Rapid community triage and reproducible tests turned scattered reports into a credible investigative lead within days, prompting vendor and Microsoft engagement. That rapid feedback loop is a functional strength of the enthusiast and specialist ecosystem.
  • Vendors and Microsoft are coordinating: Phison publicly acknowledged investigations and Microsoft asked for telemetry, which are the right early steps for cross‑stack debugging and targeted mitigations.
Principal risks and shortcomings:
  • Premature attribution: community lists can over‑index on early reproductions and hardware ubiquity (Phison’s controllers are widely used), leading to false certainty about single‑vendor blame. That risk was magnified by a falsified internal document falsely naming affected controllers — a document Phison has since denounced and acted against legally. The circulation of fake documents increases noise and can delay focused remediation.
  • Incomplete vendor messaging: some vendor replies have been informal forum posts rather than SKU‑level advisories or firmware release notes. Informal statements — even when well‑intentioned — can be misread as formal validation if not followed by documented firmware checks. Silicon Motion’s reported forum reply fits this pattern: reassuring, but not yet a validated, SKU‑level advisory.
  • Risk of unsafe workarounds: registry hacks and HMB toggles can reduce exposure in some cases but introduce other stability and performance risks. Those stop‑gaps are for advanced users with full backups and should not be a default enterprise policy.

What to watch next (signals that matter)​

  • Vendor advisories listing confirmed affected firmware IDs and supported SKUs, or a statement confirming no affected SKUs after SKU‑level validation. This is the most reliable clearing evidence.
  • Firmware downloads and clear release notes distributed via vendor dashboards (Corsair, WD, Kioxia, SanDisk, etc.). A posted firmware tied to the regression is the operational fix path vendors will use.
  • Microsoft KB/Release Health updates or a Known Issue Rollback (KIR) entry indicating either an OS mitigation or confirmation that host behavior contributed. Microsoft adding an entry to the KB or issuing an out‑of‑band patch would be the strongest signal of a host‑side correction. (support.microsoft.com, bleepingcomputer.com)

Critical analysis — what this episode reveals about update practices​

This incident underscores a growing coordination challenge in modern PC ecosystems:
  • OS updates now interact with complex hardware subsystems (HMB, DRAM‑less caching, controller metadata management) where the line between “OS bug” and “firmware edge case” is blurred. The right fix path often requires both host and controller changes, and the evidence needed for a safe vendor advisory can be granular and slow to assemble.
  • Vendor fragmentation and SKU diversity increase remediation friction. SSD controllers are integrated across dozens of branded modules; validating a firmware fix across all branded SKUs is a time‑consuming but necessary step before public advisories can be issued. This is why vendors prefer SKU‑validated advisories rather than blanket claims.
  • Community labs remain an essential early‑warning system, but their findings require translation into vendor telemetry. That translation step — collecting consistent traces from affected endpoints — is the gating factor for official fixes.

Final takeaways​

  • The Windows 11 August 2025 cumulative update (KB5063878) is associated with a reproducible storage regression under sustained writes that, in community tests, can make NVMe SSDs disappear and occasionally cause corruption. Multiple independent outlets and test benches corroborate the basic phenomenon. (guru3d.com, tomshardware.com)
  • Phison acknowledged investigations and Microsoft is collecting telemetry; both sides are engaged in a cross‑stack triage.
  • Silicon Motion’s forum reply that “none of our controllers are affected” is a positive datapoint but remains provisional until Silicon Motion publishes a formal advisory or SKU‑level validation. Treat the forum statement as informative but not dispositive.
  • Users and IT administrators should prioritize verified backups, avoid large sequential writes on updated systems, identify controller/firmware versions, and await formal firmware or OS mitigations before resuming heavy write workloads.
The technical fingerprint and vendor engagement point toward a solvable interaction problem — but solving it will require more than isolated assurances. What matters for users now is conservative risk management: verified backups, measured postponement of risky workloads, and careful application of vendor‑validated firmware or Microsoft fixes when they arrive.

Source: TechPowerUp Silicon Motion: None of Our Controllers Affected by the Windows 11 Bug
 
Windows 11’s August cumulative update (KB5063878) has been linked to a reproducible storage regression that can make certain SSDs and HDDs disappear under sustained large writes, and Silicon Motion — according to community‑sourced forum responses — reports none of its controllers have shown the problem so far, a claim that is useful but provisional until confirmed by a formal vendor advisory or SKU‑level testing.

Background / Overview​

Microsoft shipped the Windows 11 24H2 cumulative update identified as KB5063878 (OS Build 26100.4946) as part of its August update cadence. Within days, community testers and specialist outlets reported that heavy sequential writes — commonly in the neighborhood of 50 GB or more — could trigger a storage failure where a target drive becomes unresponsive and may “vanish” from Windows, sometimes remaining inaccessible after a reboot. These early reproductions and collations elevated the issue beyond anecdote to an industry investigation.
Community labs and independent outlets observed a consistent fingerprint: during long, continuous writes the drive stops responding to NVMe admin commands; SMART and telemetry may become unreadable; Device Manager and Disk Management can show the drive as removed; and files being written at the time can be truncated or corrupted. These symptoms strongly suggest a controller-level hang or firmware state failure triggered by host‑side behavior rather than ordinary file‑system noise. (notebookcheck.net, guru3d.com)
Phison, a major SSD controller maker whose silicon appears in many consumer NVMe products, publicly acknowledged it was investigating potential industry‑wide effects linked to KB5063878 (and a related preview patch) and said it was working with partners to determine affected controllers and remediate where necessary. Microsoft confirmed it was “aware of these reports” and asked affected customers to send telemetry via the Feedback Hub while working with storage partners on reproductions. (guru3d.com, bleepingcomputer.com)
Amid the wider investigation, a forum post relayed as coming from Silicon Motion suggested none of SMI’s controllers had shown the problem in their testing or from customer reports. That statement, reported via community forums and collated outlets, has been treated as a positive datapoint for owners of Silicon Motion‑based drives — but it is not a substitute for a formal, SKU‑level advisory or a published firmware validation list.

What exactly is happening — the technical fingerprint​

Symptom set​

  • Target device becomes unresponsive during sustained sequential writes (commonly ~50 GB+).
  • The device may disappear from Device Manager, Disk Management, and vendor utilities.
  • SMART/controller telemetry becomes unreadable or returns errors.
  • In some cases, a reboot restores operation; in others the drive remains inaccessible and may require vendor intervention or reformatting to recover.
  • Files being written during the event can be partially written or corrupted. (notebookcheck.net, tomshardware.com)

Probable technical mechanisms​

Modern NVMe SSDs are embedded systems that rely on a complex interplay of the host OS, NVMe driver, PCIe link, controller firmware, flash translation layer (FTL), and optional on‑board DRAM or Host Memory Buffer (HMB). Two plausible interaction classes explain why an OS update could prompt this failure:
  • Controller firmware hang due to host timing changes — A change in how the OS allocates or synchronizes buffers, interacts with HMB, or orders DMA can push some controller state machines into an unrecoverable state under sustained stress. That state looks like the controller ceasing to respond at the PCIe/NVMe level.
  • HMB / DRAM‑less sensitivity — DRAM‑less controllers that depend on the Host Memory Buffer are particularly sensitive to changes in host allocation semantics. If the host changes timing or mapping behavior, the controller’s expectation of stable HMB semantics can be violated, exposing race conditions that lead to controller stalls.
These hypotheses match the operational fingerprint community labs reproduced, but a definitive root cause requires cross‑stack telemetry from Microsoft and vendors, plus SKU‑level testing across firmware revisions and OEM module integrations.

What vendors have said so far​

  • Phison: Publicly acknowledged the issue, stated it was investigating potential “industry‑wide effects” of the August update and coordinating with partners. That acknowledgement elevated the incident from isolated reports to an organized vendor triage.
  • Microsoft: Said it was “aware of” the reports and requested telemetry from affected customers via the Feedback Hub and support channels while working with storage vendors to reproduce and diagnose. At the time community reproductions surfaced, Microsoft’s KB page for KB5063878 did not immediately list a storage‑related Known Issue.
  • Silicon Motion: A forum relay reported a reply attributed to Silicon Motion stating that none of their controllers had experienced the problem. That forum reply is a useful early signal but, as collated analysis notes, it currently lacks the weight of an official vendor bulletin or SKU‑level listing on Silicon Motion’s support newsroom. Users should treat this as encouraging but provisional until Silicon Motion publishes formal guidance.
Multiple specialist outlets and labs have independently reproduced the failure fingerprint on dozens of drives, with some labs reporting that Phison‑based and DRAM‑less devices were overrepresented in initial reproductions — yet other controller families and branded SSDs have also surfaced in isolated tests. This mix suggests an interaction rather than a single‑vendor firmware defect in many cases. (tomshardware.com, notebookcheck.net)

Cross‑checking the claims: what’s verified and what remains provisional​

  • Verified: A reproducible storage regression exists in community tests run against Windows 11 builds that include KB5063878 under sustained large writes; multiple outlets have performed reproductions that match the symptoms described. (tomshardware.com, guru3d.com)
  • Verified: Phison publicly acknowledged investigations and Microsoft confirmed awareness and data collection efforts. (guru3d.com, bleepingcomputer.com)
  • Provisional: The claim that all Silicon Motion controllers are unaffected is currently based on a forum reply reported by community collations — a meaningful datapoint for Silicon Motion‑based drive owners, but not a formal, SKU‑level vendor advisory. Until Silicon Motion publishes a support bulletin or a list of validated firmware/part numbers, the “not affected” claim should be treated with caution.
  • Unverifiable at present: Any definitive, universal list of which SSD SKUs are safe or unsafe. Community compilations are helpful triage aids, but firmware differences between branded modules and factory SKUs mean lists can mislead unless vendors publish tested validation matrices.

Practical guidance — what users should do now​

Data integrity must be the immediate priority. The failure mode can result in partial or total file loss on the affected drive.
  • Back up immediately: Copy irreplaceable data to another physical drive or a reliable cloud provider. The industry consensus is that verified backups (and an offline copy where practical) are the single best defense.
  • Avoid large sequential writes: Postpone mass game installs, ISO extractions, disk cloning, VM image writes, media exports, or other sustained writes until vendors or Microsoft publish mitigations. Community reproductions commonly cited ~50 GB as a practical trigger threshold.
  • Identify your drive controller and firmware: Use HWInfo, CrystalDiskInfo, or the SSD vendor’s dashboard utility (Samsung Magician, WD Dashboard, Crucial Storage Executive, Corsair iCUE) to capture model, controller family, and firmware version. Save a screenshot or text export for any support cases.
  • Monitor official vendor channels: Firmware fixes — if required — will be distributed by SSD vendors and posted on their support utilities, usually with release notes tying the update to the regression. Only apply vendor firmware that explicitly addresses this issue and follow the vendor’s instructions carefully.
  • If you hit the bug:
      • Stop writing to the device immediately.
      • Capture Event Viewer entries, Device Manager screenshots, and any vendor utility errors (a log‑export sketch follows this list).
      • Reboot only if it’s safe (some devices reappear after reboot; others remain inaccessible).
      • Report the incident to Microsoft (Feedback Hub) and your SSD vendor with logs and diagnostic screenshots. Vendors and Microsoft have asked users to submit telemetry to aid reproduction.
  • Enterprise admins: Hold KB5063878 deployments in broad rings, run representative sustained‑write stress tests on sample hardware and firmware revisions, and stage rollouts only after vendor advisories and firmware validation steps. Use WSUS/Intune controls to pause or rollback the update where risk is unacceptable.
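When collecting those artifacts, a sketch along these lines can export the System event log with the built-in wevtutil tool so the files can be attached to a Feedback Hub or vendor support case. The output folder and event count are illustrative assumptions; run it from an elevated prompt and save the export to a drive other than the one that failed.

```python
# Forensic-artifact sketch: export System event-log entries with wevtutil so
# they can be attached to Feedback Hub / vendor support cases after a drive
# disappearance. Output folder and event count are illustrative assumptions.
# Run elevated, and write the exports to a drive other than the affected one.
import subprocess
from datetime import datetime
from pathlib import Path

def export_system_log(out_dir: Path, recent_events: int = 500) -> None:
    out_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")

    # Full binary export of the System log (preserves everything for vendors).
    evtx_path = out_dir / f"System_{stamp}.evtx"
    subprocess.run(["wevtutil", "epl", "System", str(evtx_path)], check=True)

    # Human-readable excerpt of the most recent events for a quick first look.
    txt_path = out_dir / f"System_recent_{stamp}.txt"
    result = subprocess.run(
        ["wevtutil", "qe", "System", f"/c:{recent_events}", "/rd:true", "/f:text"],
        capture_output=True, text=True, check=True,
    )
    txt_path.write_text(result.stdout, encoding="utf-8")
    print("Saved", evtx_path, "and", txt_path)

if __name__ == "__main__":
    export_system_log(Path(r"C:\incident_logs"))   # hypothetical collection folder
```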

What about Silicon Motion’s “not affected” claim?​

A TechPowerUp forum thread — and subsequent collations by specialist outlets — relayed a reply attributed to Silicon Motion indicating none of the company’s controllers had experienced the issue in their testing or in customer reports to date. For Silicon Motion‑based SSD owners this is good news, but there are important caveats:
  • The forum reply is not the same as a published support advisory. Vendor statements made in community threads can be accurate yet incomplete; they lack the SKU‑level firmware detail that enterprises and cautious consumers require.
  • Controller chips appear in many branded modules with manufacturer‑level firmware and PCB differences. A controller family may behave differently across SKUs and firmware revisions; vendors usually publish tested firmware IDs and supported SKUs when a problem is validated. Until Silicon Motion issues such a publishable matrix, the forum reply remains informative but provisional.
  • The absence of reported problems from Silicon Motion customers is a positive signal but not definitive proof of universal immunity. It’s possible the vendor’s tested population differs from the community test benches or that affected SKUs are rare or not present among their reporting partners. Treat the message as one input among many rather than a final clearing statement.

Why vendor statements matter — and why they’re cautious​

Firmware updates and vendor advisories must be SKU‑specific. A controller vendor can say a chip revision appears unaffected in internal tests, but branded SSD makers produce thousands of SKUs with varied firmware builds, NAND types, and power‑delivery designs. A safe remediation path generally follows these steps:
  • Reproduce the failure on representative hardware and firmware.
  • Isolate the firmware component or host interaction causing the hang.
  • Produce a firmware fix and test it per branded SKU (firmware is often adjusted to match module‑level characteristics).
  • Publish tested firmware via vendor tools and document the affected firmware/part numbers.
  • If the root cause is host‑side, Microsoft may issue a KB update or Known Issue Rollback (KIR) to restore previous behavior while firmware updates propagate.
This process is why vendors are measured in public statements: premature, blanket assertions risk missing rare SKU edge cases and can leave users exposed without an authoritative fix. The goal is not speed only, but correctness and safety in remediation.

Risk assessment — probability and impact​

  • Probability: Low to moderate across the broad installed base, but materially higher for certain controller/firmware/workload combinations (community tests over‑represented Phison-based and DRAM‑less devices).
  • Impact: High for affected users — potential for inaccessible partitions, corrupted files, or drives that require deep recovery or replacement. In short: low probability but high impact, a classic case for defensive posture (backups, testing, paced rollouts).
  • Detectability: Moderate. The core failure is obvious when it happens (drive disappears), but data corruption on unexamined files may not be noticed immediately. That makes immediate backups and careful testing essential.

Short‑term mitigations and advanced stop‑gaps (with caveats)​

  • Splitting large transfers into smaller batches can reduce exposure to the sustained‑write trigger many testers reproduced. This is a pragmatic, low‑risk interim step for users who cannot pause writes entirely.
  • Some advanced community workarounds touched HMB allocation settings via registry or NVMe tweaks to limit HMB usage. These are high‑risk, performance‑impacting, and unsupported for many users — only for experienced users with full backups and a willingness to reverse changes if needed. Do not apply these in production without vendor guidance.
  • Uninstalling the cumulative LCU is technically possible via DISM /Remove‑Package but can complicate future patching and is not recommended for non‑experts (a discovery sketch follows this list). Use Microsoft’s documented rollback procedures or consult enterprise patching teams before attempting such steps.
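For administrators weighing that removal path, the sketch below covers only the discovery step: listing installed packages so the LCU’s exact package identity can be found. Package names vary by build, the filter string here is a heuristic assumption, and the removal itself is deliberately left as a manual, elevated command run only after a full backup.

```python
# Sketch of the discovery step for the DISM /Remove-Package path: list installed
# packages and surface likely cumulative-update entries so an administrator can
# identify the exact package identity. Package names vary by build; the removal
# command is intentionally left manual (elevated prompt, after a full backup):
#   DISM /Online /Remove-Package /PackageName:<identity-from-the-list>
import subprocess

def list_packages() -> list:
    # Requires an elevated prompt; DISM refuses /online queries otherwise.
    result = subprocess.run(
        ["dism", "/online", "/get-packages", "/format:table"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()

if __name__ == "__main__":
    for line in list_packages():
        # Cumulative updates typically appear as "Package_for_RollupFix..." rows;
        # treat this filter as a starting point, not a guarantee.
        if "rollupfix" in line.lower():
            print(line.strip())
```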

How to prepare for vendor fixes and post‑fix validation​

  • Keep inventory: capture drive model, serial number, controller family, and firmware version now using HWInfo / CrystalDiskInfo and vendor utilities. This saves time if a vendor requests detailed telemetry.
  • Stage patches in a test ring: apply vendor firmware and Windows updates to a controlled subset of hardware and re-run representative sustained‑write tests before mass rollout. Use WSUS/Intune to stage updates gradually.
  • Preserve forensic artifacts: if you experience the fault, preserve Event Viewer logs, vendor utility reports, and screenshots before making changes. Vendors and Microsoft asked for telemetry to reproduce the failure.

Final analysis: strengths, weaknesses, and how the ecosystem can improve​

This incident underscores several enduring realities about the modern PC ecosystem:
  • Strength — Rapid community triage: Enthusiast test benches and specialist outlets reproducibly identified a clear failure fingerprint in days. That early warning catalyzed vendor and platform triage and prioritized the issue above routine noise. Rapid community reproduction is a strength of the PC enthusiast ecosystem. (tomshardware.com, guru3d.com)
  • Weakness — Coordination friction between OS and hardware vendors: OS updates change low‑level semantics that can expose firmware edge cases across a fragmented SSD marketplace. The need for SKU‑level validation slows public advisories and extends exposure windows for some users. Firmware fixes require careful per‑SKU testing, which is slower than issuing an OS rollback but safer in the long run.
  • Risk — Misinformation and forged documents: The incident saw circulation of falsified advisories in the ecosystem, which created confusion and forced vendors (notably Phison) to denounce forged materials and pursue legal action. Accurate, verified vendor communications are vital during incidents to prevent unnecessary panic and incorrect mitigation steps.
Better pre‑release coordination, shared industry test suites for host/storage interactions, and clearer telemetry pipelines between OS providers and controller vendors would reduce the chance that routine security rollups trigger high‑impact hardware regressions at scale. For now the defensible posture is conservative: back up, avoid heavy writes on systems with KB5063878 until you’ve validated your hardware in a test ring, and follow only vendor‑published firmware and Microsoft guidance.

Conclusion​

The Windows 11 KB5063878 incident is a cautionary example of how tightly coupled modern storage stacks are: a host update can expose latent firmware edge cases in SSD controllers under particular workloads. Multiple independent labs and specialist outlets reproduced a consistent failure fingerprint under sustained large writes; Phison acknowledged an investigation; Microsoft is collecting telemetry; and Silicon Motion’s community‑reported claim that its controllers are unaffected is encouraging but should remain provisional until the vendor publishes formal, SKU‑level confirmation. The immediate, non‑speculative advice is clear and simple: prioritize verified backups, avoid heavy sequential writes on patched systems, identify your SSD controller and firmware, and wait for vendor‑tested fixes or Microsoft mitigations before resuming high‑risk storage workloads. (guru3d.com, bleepingcomputer.com)


Source: OC3D Silicon Motion SSDs unaffected by Windows 11 SSD-crashing bug - OC3D