Hi ws1960, you explained it fine, and yes: this is a “mess” in the normal sense of “multiple things could be failing at once.” The safest way forward is to separate (A) “can the machine boot reliably?” from (B) “is the RAID/LSI healthy?”.
Below are direct answers to your two questions first, then a practical step-by-step plan.
1) “Can the keyboard be an issue? I only have a wireless all-in-one.”
Yes. Wireless keyboards/mice can behave badly (or only partially work) in BIOS, LSI/MegaRAID pre-boot utilities, and some bootable USB environments, especially on older boards.
Do this first:
- Use a wired USB keyboard (basic, cheap is fine).
- Plug it into a rear USB 2.0 port (not the front panel, not a hub).
- If your wireless has a USB dongle, remove it while testing (to avoid conflicts).
A bad/unsupported input device usually won’t “crash” the server, but it can make the MegaRAID screen seem “frozen” because clicks/keys aren’t being processed correctly.
2) “Why can’t I just install a new OS without the LSI being in place?”
Because your Windows Server was almost certainly installed onto a virtual disk (VD) presented by the LSI RAID controller.
When you remove the LSI card:
- The motherboard no longer “sees” the RAID virtual disk.
- So the existing OS drive effectively disappears, and you’ll get no boot device (or it boots to the wrong thing).
You can install a new OS without the LSI only if:
- You install to a disk connected to the onboard SATA controller (not behind the LSI), or
- You don’t care about accessing the RAID volumes while the LSI is absent.
If your goal is to keep/restore the RAID data volume, you generally need the LSI card working (or an identical replacement controller, depending on model/metadata).
What I think is happening (high probability)
From what you described:
- Rebuild stopped halfway
- MegaRAID utility freezes
- System behavior is inconsistent
That often points to one of these:
1) Another drive is also failing (RAID6 survives 2 failures, but “survives” doesn’t mean “rebuild will finish” if other disks are throwing errors/timeouts)
2) Backplane / SATA/SAS cable / power issue causing a drive to drop out during rebuild
3) LSI RAID controller itself is failing (or its cache module / battery/supercap issue, depending on model)
4) Less commonly: RAM/PSU instability causing lockups under I/O load
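To make the RAID6 point in item 1 concrete, here is a minimal Python sketch of how much redundancy is left as member drives become unavailable. The function name and the status wording are my own illustration, not something your controller reports, and I am assuming a drive that is still mid-rebuild counts as unavailable because it does not yet provide redundancy.

```python
def raid6_status(total_drives, unavailable):
    """Classify a RAID6 set by how many member drives are failed, missing,
    or still rebuilding (assumption: a rebuilding drive adds no redundancy yet)."""
    if not 0 <= unavailable <= total_drives:
        raise ValueError("unavailable must be between 0 and total_drives")
    if unavailable == 0:
        return "optimal"
    if unavailable == 1:
        return "degraded, one more failure tolerated"
    if unavailable == 2:
        return "degraded, no redundancy left"
    # RAID6 carries two parity blocks per stripe, so a third loss breaks the array.
    return "failed"


# Hypothetical 8-drive array: one drive mid-rebuild plus one that dropped out.
print(raid6_status(8, 2))
```

With no redundancy left, a read error or timeout on any remaining disk has nothing to fall back on, which is one way a rebuild can stall partway.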
Important warning: “Initialize” in RAID tools can wipe metadata / destroy the array. If the data matters, avoid any option that says Initialize / Clear / New configuration.
Step-by-step plan (safe, practical)
Step 0 — Tell me what matters most
Before you do anything risky:
Is the data on the RAID important and do you have a backup?
- If NO backup and data matters: we should prioritize data-safe diagnostics first.
- If data doesn’t matter: we can focus on reinstalling cleanly.
Reply with: “data critical” or “data not critical”.
Step 1 — Reduce variables (basic hardware sanity)
- Power down, unplug power.
- Reseat:
  - LSI card (pull and firmly reseat)
  - RAM (at least reseat; ideally test with 1 stick at a time if you have multiple)
- Check power:
  - Make sure each drive has solid power (no loose splitters)
- Use wired keyboard on rear USB.
Then boot and see if:
- POST completes reliably
- LSI controller banner appears normally (no long hangs)
Step 2 — Identify exactly what “MegaRAID window” you have
There are a couple different pre-boot tools (and the key combos differ by controller/firmware).
Please tell me:
- Exact model of the LSI card (example: “MegaRAID 9260-8i”, “9271-8i”, “9361-8i”, etc.)
- At boot, what does it say? (Even a phone photo of the screen is fine)
If you can’t type/click because of the freeze, take a picture of the frozen screen and upload it here.
Step 3 — Don’t rebuild yet; first check drive state
If you can get into the RAID configuration:
- Look for:
  - How many drives are Online
  - Any Foreign config warning
  - Any drives showing Unconfigured Bad, Offline, Failed, Rebuild, Predictive Failure
- If rebuild stopped, note:
  - Which slot/bay is the replacement drive?
  - Any error counters shown in the utility?
If the utility itself freezes consistently: that’s a strong hint the controller is hanging (hardware/firmware) or a device on the bus is locking it up.
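If it helps to keep your notes organized before posting back, the checklist above could be captured with a small Python sketch like the one below. It only summarizes states you copy down by hand from the RAID utility screen and does not talk to the controller. The function name, the dictionary layout, and the example values are all hypothetical, and your utility may label states slightly differently.

```python
# States the checklist above says to watch for (names as listed in this post).
WATCH_STATES = {"Unconfigured Bad", "Offline", "Failed", "Rebuild", "Predictive Failure"}


def summarize_drives(slot_states):
    """Summarize hand-copied drive states, keyed by slot/bay number."""
    online = [slot for slot, state in slot_states.items() if state == "Online"]
    flagged = {slot: state for slot, state in slot_states.items()
               if state in WATCH_STATES}
    # Anything else is kept separately so an unfamiliar label is not silently lost.
    unrecognized = {slot: state for slot, state in slot_states.items()
                    if state != "Online" and state not in WATCH_STATES}
    return {"online": len(online), "flagged": flagged, "unrecognized": unrecognized}


# Hypothetical example: values typed in from the RAID utility screen.
notes = {0: "Online", 1: "Online", 2: "Rebuild", 3: "Predictive Failure"}
print(summarize_drives(notes))
```

Posting the resulting summary, plus the slot of the replacement drive, covers most of what is asked for in this step.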
Step 4 — Quick isolation test (find the “poison” drive/cable/backplane)
Only do this if data is critical and you’re cautious, but it’s often the fastest way:
- Power off.
- Disconnect the data cable from the RAID backplane (or disconnect the SAS cable to the drive cage), leaving the LSI card installed.
- Boot.
Result interpretation:
- If the machine suddenly stops freezing and the RAID utility becomes responsive → a drive/backplane/cable is likely causing bus lockups.
- If it still freezes even with no drives attached → the LSI card itself (or the PCIe slot/motherboard) is suspect.
(We can refine this further after you tell me your exact cabling layout.)
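The result interpretation for the isolation test boils down to a single yes/no question, sketched here in Python. The function name and the wording of the returned strings are my own; the logic is just the two outcomes listed above.

```python
def interpret_isolation_test(still_freezes_without_drives):
    """Map the outcome of booting with the drive cage disconnected
    (LSI card still installed) to the most likely suspects."""
    if still_freezes_without_drives:
        # Nothing is attached to the bus, so the drives cannot be the cause.
        return "suspect the LSI card itself, the PCIe slot, or the motherboard"
    # The freeze went away once the drives were off the bus.
    return "suspect a drive, the backplane, or a cable"


print(interpret_isolation_test(still_freezes_without_drives=False))
```

Either answer narrows things down; neither one proves a single part is bad on its own.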
Step 5 — About reinstalling Windows (clean rebuild option)
If data is not critical, the cleanest approach is usually:
- Leave the LSI installed (if it’s stable enough to POST)
- Boot Windows Server 2019 installer USB
- During disk selection:
  - If no disks appear, click Load driver and provide the correct LSI driver
  - Or install to an onboard SATA SSD (not behind the LSI)
But if the machine is freezing before the installer even loads, we must fix the underlying hardware/LSI issue first.
Questions I need from you (short list)
Reply with as many as you can:
1) Is the RAID data critical? Do you have a backup?
2) Exact LSI RAID card model (and if you know: firmware version).
3) Are the SSD OS drives connected to the LSI or to the motherboard SATA?
4) When it “won’t boot from USB,” what exactly happens (hang, reboots, message like “no boot device,” etc.)?
5) How many total drives in the RAID6, and what brand/size is the replacement drive?
If you can upload photos of the boot screen and the frozen MegaRAID screen, that will speed this up a lot. (No attachments came through with your post, so if you tried to attach something, please re-upload.)