Windows 8 2 identical PCs, 1 is unstable (IRQL_NOT_LESS_OR_EQUAL)

Winterseed

Active Member
My brother and I built identical Windows 8 PCs earlier this year; we were both virgin system builders. His has been stable but mine has been crashing a lot, on the order of 50 BSoDs in 5 months. I've been trying to solve it myself but I'm running out of ideas. It has never been stable.

It seems more prone to crashing when it isn't under load. I've played many long sessions of games like FarCry 3 and Skyrim without issue, but it will crash while I'm reading a webpage (Chrome, maybe sometimes IE). The other day it completely locked up while I was playing Skyrim at 1080p Ultra and capturing at 1080p with Open Broadcaster Software. The screen went blue with some static at the top (no message) and a sound a little like a tiny jackhammer was emitted either from the speakers or the tower, I'm not sure which. It powered itself off after a few seconds in this state. Since then I've played several hours of Skyrim at the same settings, capturing at 720p, without a single hiccup. The only time the BSoD has occurred while gaming, I was playing and capturing Toribash at 1080p.

The BSoD nearly always suggests searching the web for IRQL_NOT_LESS_OR_EQUAL (notably lacking the DRIVER_ prefix). Once or twice it has been a different suggestion ending in WATCHDOG_VIOLATION. I've run some (15 or so) but not all of the crash dumps through WinDbg (x64) and it seems to blame memory_corruption in most cases, though a range of .sys files seemed to be to blame in a significant number of cases as well.
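One way to get an overview of many dumps at once is to save the WinDbg output for each dump as text and count the "Probably caused by" lines. The following is only a minimal sketch under that assumption; the function name `tally_causes` is made up for illustration, and the sample text is the two summaries quoted further down in this post.

```python
import re
from collections import Counter

# Matches the summary line WinDbg prints for each dump, e.g.
# "Probably caused by : memory_corruption ( nt!MiUnlinkFreeOrZeroedPage+1a4 )"
CAUSE_RE = re.compile(r"^Probably caused by\s*:\s*(\S+)", re.MULTILINE)


def tally_causes(log_text: str) -> Counter:
    """Count how often each module is named in 'Probably caused by' lines.

    `log_text` is assumed to be WinDbg output saved as plain text
    (hypothetical workflow, not something WinDbg does by itself).
    """
    return Counter(CAUSE_RE.findall(log_text))


if __name__ == "__main__":
    sample = """\
BugCheck A, {fffffae001f3adf8, 2, 1, fffff8015b1413b4}
Probably caused by : memory_corruption ( nt!MiUnlinkFreeOrZeroedPage+1a4 )
BugCheck A, {fffffae001c4fdf8, 2, 1, fffff8030bad73b4}
Probably caused by : win32k.sys ( win32k!SfnDWORD+15b )
"""
    # Print the most frequently blamed modules first.
    for module, count in tally_causes(sample).most_common():
        print(f"{count:3d}  {module}")
```

If one driver name dominates the tally it is worth a closer look; a spread across many unrelated modules is more in line with a hardware cause such as RAM.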

Memtest86 and Memtest86+ both return occasional errors. In around 20 passes I've had 2 or 3 errors, never more than one in a single session. The longest session I've run was 9 or 10 passes, the second longest was around 6 to 8. The errors I photographed (see attached .zip) were at 4806.3MB (Memtest86+ 5.01) and 4706.5MB (Memtest86 4.3.6) which seems odd because I only have 4 GB of ram. Maybe I'm misunderstanding something there. The Memtest86 error occurred immediately upon commencement of the test. The confidence value increased over time from a little over 100 to 163. Unfortunately I wasn't able to allow that test to run for very long for reasons of domestic diplomacy.
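To put those numbers in perspective, here is a rough back-of-the-envelope sketch. It assumes (and this is only an assumption) that each pass has the same independent chance of showing an error, estimated from the 2 or 3 errors seen in about 20 passes. The function name is made up for illustration.

```python
def clean_session_probability(errors_seen: float, passes_run: int,
                              session_passes: int) -> float:
    """Chance that a session of `session_passes` passes shows zero errors.

    Assumes every pass fails independently with the same probability,
    estimated as errors_seen / passes_run. Real faults may not behave
    this way (they can depend on temperature, test pattern, etc.).
    """
    per_pass_error_rate = errors_seen / passes_run
    return (1.0 - per_pass_error_rate) ** session_passes


if __name__ == "__main__":
    # 2 or 3 errors in ~20 passes; longest session was about 9 passes.
    for errors in (2, 3):
        p = clean_session_probability(errors, 20, 9)
        print(f"{errors} errors in 20 passes -> "
              f"P(clean 9-pass session) = {p:.2f}")
```

Under those assumptions, somewhere between roughly one in four and two in five 9-pass sessions would come back clean even with a faulty stick, so a single error-free run says little on its own.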

I tried to install all applicable driver updates from my motherboard's official download page when I first put the system together. I'm not sure if my bro did the same with his. I've tried to keep them up-to-date since. Somehow I've been unable to build any confidence in my ability to correctly identify, install or verify drivers. The numbers just don't make sense to me. Control Panel seems to report driver version numbers which don't correlate to what I've tried to install. I may be making a critical oversight there.

I've always kept up-to-date with Windows Update, AMD Catalyst drivers and AVG Free. I've added Malwarebytes more recently.

I moved the DIMM to a different slot a few days ago and have not experienced any notable improvements. The Memtest86 error occurred in the new configuration.

I performed a Refresh of Windows 8 this morning and experienced the BSoD (IRQL_NOT_LESS_OR_EQUAL) twice shortly thereafter. Since reinstalling Windows Updates I've had no crashes. It has only been around 6 hours so that doesn't mean much to me.


I'm not sure how to learn the most from the (50+) crash dumps so I'll wait for some advice on that before I spend more time with them. From this morning (see attached .zip for detail):

BSoD #1: BugCheck A, {fffffae001f3adf8, 2, 1, fffff8015b1413b4}
Probably caused by : memory_corruption ( nt!MiUnlinkFreeOrZeroedPage+1a4 )

BSoD #2: BugCheck A, {fffffae001c4fdf8, 2, 1, fffff8030bad73b4}
Probably caused by : win32k.sys ( win32k!SfnDWORD+15b )
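For reference, the four values in braces after "BugCheck A" have documented meanings for IRQL_NOT_LESS_OR_EQUAL: the memory address referenced, the IRQL at the time, a flag field (bit 0 set means write, bit 3 set means execute, otherwise read) and the address of the instruction that made the reference. A minimal sketch of decoding one summary line, with hypothetical names (`BugCheckA`, `decode_bugcheck_a`) chosen only for illustration:

```python
import re
from dataclasses import dataclass

# Matches e.g. "BugCheck A, {fffffae001f3adf8, 2, 1, fffff8015b1413b4}"
BUGCHECK_RE = re.compile(r"BugCheck\s+([0-9A-Fa-f]+),\s*\{([^}]*)\}")


@dataclass
class BugCheckA:
    referenced_address: int   # parameter 1: memory that was touched
    irql: int                 # parameter 2: IRQL at the time
    operation: str            # parameter 3, decoded: read/write/execute
    instruction_address: int  # parameter 4: code that touched the memory


def decode_bugcheck_a(line: str) -> BugCheckA:
    """Decode a WinDbg 'BugCheck A, {...}' summary line."""
    match = BUGCHECK_RE.search(line)
    if match is None:
        raise ValueError("no BugCheck summary found in line")
    if int(match.group(1), 16) != 0xA:
        raise ValueError("only bug check 0xA is handled in this sketch")
    args = [int(part.strip(), 16) for part in match.group(2).split(",")]
    if len(args) != 4:
        raise ValueError("expected four bug check parameters")
    flags = args[2]
    if flags & 0x8:
        operation = "execute"
    elif flags & 0x1:
        operation = "write"
    else:
        operation = "read"
    return BugCheckA(args[0], args[1], operation, args[3])


if __name__ == "__main__":
    info = decode_bugcheck_a(
        "BugCheck A, {fffffae001f3adf8, 2, 1, fffff8015b1413b4}")
    print(f"{info.operation} of {info.referenced_address:#x} "
          f"at IRQL {info.irql} from {info.instruction_address:#x}")
```

Read that way, both of this morning's dumps describe a write to an invalid address at IRQL 2 (DISPATCH_LEVEL).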

Older MiniDumps are available if anyone wants them. I don't know where to find other relevant files (if there are any).


Unfortunately there's now a lot of (physical) distance between my brother and me, so I don't have the option of cross-testing our hardware any time soon. Everything inside our cases is identical with the exception of placement (fan locations, SATA ports used, etc).
Everything outside our cases is different (peripherals, monitors, speakers, etc).

SpeedFan never reports GPU over 80C or CPU over mid 40s (that I've seen). IntelBurnTest and FurMark don't raise any red flags (as far as I can tell).

SPECS:

OS: Windows 8 64-bit OEM
CPU: i5-3570K @stock
MB: ASRock Z77 Extreme4
RAM: Patriot Signature 1600 1x 4GB
HDD: Western Digital Caviar Blue 1TB
GPU: PowerColor HD7870 LE MYST Edition 2GB
PSU: Thermaltake Toughpower XT 775
DVD: Samsung DVD burner
WiFi: TPLink TL-WN881ND wireless card

Cheap IR wireless keyboard and mouse combo (wired USB keyboard also plugged in at rear - possible conflict?)
Sony Bravia KDL-EX720 40" smart TV via HDMI (connected to 7870)


So, holy barnacles, wall of text. Sorry. Any advice on how to proceed? THANKS!

tl;dr Frequent BSoD at idle, IRQL_NOT_LESS_OR_EQUAL, occasional Memtest86 errors, WinDbg mostly reports memory_corruption, I don't know how drivers work
 

Attachments

  • W7F_11-12-2013.zip
    1.4 MB · Views: 890
Code:
*******************************************************************************
*                                                                            *
*                        Bugcheck Analysis                                    *
*                                                                            *
*******************************************************************************

Use !analyze -v to get detailed debugging information.

BugCheck A, {fffffae001f3adf8, 2, 1, fffff8015b1413b4}

Probably caused by : memory_corruption ( nt!MiUnlinkFreeOrZeroedPage+1a4 )

Followup: MachineOwner

Hi,
if you're seeing errors appear in memtest then chances are you have some bad RAM. It's entirely possible to purchase faulty RAM even when new. I also notice you have the stick in slot 3. Usually it's better to place it in the slot nearest the CPU (usually called slot 1). I very much doubt it will make a difference BSOD-wise, but do try it anyway.


BiosVersion = P2.80
BiosReleaseDate = 01/17/2013
A BIOS update is available:
http://www.asrock.com/mb/Intel/Z77 Extreme4/?cat=Download&os=BIOS

igdkmd64.sys Fri May 25 05:24:33 2012
Please update Intel Graphics:
https://downloadcenter.intel.com/Default.aspx

I can see that you haven't installed many drivers, probably due to your issue. If you can obtain some working RAM then you could use your motherboard's website as a guide and find the latest drivers via Google. Until you sort the RAM issue, however, it isn't worth bothering.
 
Switch to Linux: virus impervious, none of this, good customer support, developer friendly, good design, lots of free software (actually good software), millions of users, not hours of maintenance, backups take seconds, and better than Windows.

or you can just keep using Windows... whatever.
 
Thanks for the impartial advice, IHateWindowsEight. I'll think about it.

kemical, I suspected bad RAM. I guess this is a good excuse to move up to 8GB. Part of me still wonders if I just screwed up installing drivers/Intel INF somewhere along the way. You're right, I've tried to keep things simple since the Refresh to help isolate the problem, so I haven't touched any drivers other than Catalyst since. That DriverList still looks pretty big to me, though :s Any idea if drivers are removed by Windows Refresh or not?

Does the CPU-Z grab tell you everything you need to know about voltages? Any chance of an issue there? (I don't see a problem, I'm just not very knowledgeable with voltages)

I think I'll give it some time, it has been stable for a day and a half since Windows Update. If it crashes again I'll update the BIOS and drivers. If it crashes after that, new RAM.

Thanks a lot for the advice. There are some darn decent folk in the PC community :)
 
I didn't see anything out of the ordinary concerning voltages and a Refresh/Reset will remove drivers and programs.
 
I know you said my voltages look normal but I'm wondering if Vcore dipping below 0.85 V at idle is suspicious. Details are in post #5 here: http://www.pchelpforum.com/xf/threads/2-identical-comps-1-is-unstable-irql_not_less_or_equal.161097/ (the link at the start of that post just links back here).

The BIOS change log only states an irrelevant change, as I recall. I am chipping away at the drivers, but I'm terrible at it and it's exhausting. 2 more BSODs since updating to Windows 8.1, both memory_corruption. Thanks again.
 
Hi Winterseed,
what did you do regarding the RAM issue? Until you replace the failing part or parts it's going to be difficult diagnosing other issues.

If possible, run Memtest86 and take a snapshot of the error so I can see for myself. Ta. :)
 
In every case I have seen this it's been one of two things.

1. DDR memory with a setup/hold timing issue, combined with mainboard timing, such that the two together are marginal. The options are to change the DDR or change the mainboard.

2. A power supply circuit, or a power supply whose voltage is low and out of spec. In that case the mainboard power to the chipset and/or DDR is out of spec, and when a set of peripherals pulls more power the slight dip causes the DDR to trip out.

Both of these appear random because they are marginal. Both are highly frustrating to debug. I would start with more DDR RAM, and then you can reuse it if you see it's OK.

Dave
 
Well the culprit is already known really:

Memtest86 and Memtest86+ both return occasional errors. In around 20 passes I've had 2 or 3 errors, never more than one in a single session. The longest session I've run was 9 or 10 passes, the second longest was around 6 to 8. The errors I photographed (see attached .zip) were at 4806.3MB (Memtest86+ 5.01) and 4706.5MB (Memtest86 4.3.6) which seems odd because I only have 4 GB of ram. Maybe I'm misunderstanding something there. The Memtest86 error occurred immediately upon commencement of the test. The confidence value increased over time from a little over 100 to 163. Unfortunately I wasn't able to allow that test to run for very long for reasons of domestic diplomacy.

As I've stated several times, you need to sort your RAM out by either purchasing some more or removing all the sticks apart from one. Run the machine; if it runs fine then chances are that stick of RAM is OK. Take it out and try a different stick. Keep doing this until you know which sticks are faulty. If all prove to be faulty, try testing in different RAM slots in case you have a bad slot.
 
Thanks for the quick reply. I've only ever had one 4GB DIMM in this system (I split a kit with my brother). I've had BSODs in slots B1 and B2, the two slots furthest from the CPU. It has been stable for several days using the closest slot to the CPU, A1. I agree that faulty RAM is the most likely cause, and I'm currently shopping around for a replacement (I don't live near the store I bought it from, so I'll post this DIMM to my bro so he can get the kit replaced, and I'll buy myself a new kit from wherever).

Windows Memory Diagnostic is currently running on Extended with caching forced off. I'm wondering if my cache might be the culprit because the MemTest error addresses exceed the amount of RAM in my system. See the .zip in my first post for the memtest photos.
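One possible reading of those addresses, offered as an assumption rather than a diagnosis: platforms like this one generally reserve part of the address range below 4 GB for devices (graphics card, PCI) and remap the displaced RAM above the 4 GB mark, so a 4 GB system can legitimately have RAM at addresses above 4096 MB. A small sketch of the arithmetic, assuming exactly 4096 MB installed; the function name is made up for illustration.

```python
FOUR_GB_IN_MB = 4096.0


def minimum_remap_hole_mb(error_address_mb: float) -> float:
    """Smallest below-4GB device hole that would put RAM at this address.

    Assumes exactly 4096 MB installed and that RAM displaced by the hole
    is remapped to start at the 4 GB mark. Addresses below 4 GB need no
    remapping at all, so the result is 0 for them.
    """
    return max(0.0, error_address_mb - FOUR_GB_IN_MB)


if __name__ == "__main__":
    # Error addresses from the Memtest86+ 5.01 and Memtest86 4.3.6 photos.
    for address_mb in (4806.3, 4706.5):
        hole = minimum_remap_hole_mb(address_mb)
        print(f"{address_mb} MB is valid RAM if the device hole "
              f"is larger than about {hole:.1f} MB")
```

If the reserved region on this board is larger than the printed figures, the reported addresses are ordinary RAM and do not by themselves point at the cache.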
 
One other thing, I chose this RAM because of its relatively low rated voltage of 1.5V. In my UEFI DRAM Voltage is set to AUTO and it chose 1.585V. Should I manually set it to 1.5V?
 
One other thing, I chose this RAM because of its relatively low rated voltage of 1.5V. In my UEFI DRAM Voltage is set to AUTO and it chose 1.585V. Should I manually set it to 1.5V?
I would leave it on Auto as the difference isn't huge. Normally you should always put your first stick in the first slot, next to the chip.
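For scale, the JEDEC DDR3 specification puts the supply voltage at 1.5 V plus or minus 0.075 V. A quick sketch of how the AUTO value compares with that window; the function names are made up for illustration, and the 1.585 V figure is only what the UEFI reports, which may not match what the DIMM actually receives.

```python
DDR3_NOMINAL_V = 1.5       # JEDEC DDR3 nominal supply voltage
JEDEC_TOLERANCE_V = 0.075  # JEDEC allows +/- 0.075 V around nominal


def percent_over_nominal(voltage: float) -> float:
    """How far a voltage sits above the 1.5 V nominal, in percent."""
    return (voltage - DDR3_NOMINAL_V) / DDR3_NOMINAL_V * 100.0


def within_jedec_window(voltage: float) -> bool:
    """True if the voltage lies inside 1.5 V +/- 0.075 V."""
    return abs(voltage - DDR3_NOMINAL_V) <= JEDEC_TOLERANCE_V


if __name__ == "__main__":
    auto_voltage = 1.585  # value the UEFI chose on AUTO
    print(f"{auto_voltage} V is {percent_over_nominal(auto_voltage):.1f}% "
          f"over nominal; inside JEDEC window: "
          f"{within_jedec_window(auto_voltage)}")
```

So the AUTO value is only a few percent above nominal, but it does sit slightly outside the JEDEC operating window, which may make setting 1.5 V manually a cheap experiment.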
 
Dave, thanks for weighing in. Both of your suggestions are beyond my level of expertise. In the first case, when you say change DDR or change the main board, do you mean change settings or replace hardware? If the latter, would I need to source parts which are not identical to resolve this conflict? Or is it just an off-spec issue?

In the second case, I'm guessing I would need dedicated hardware to test, if it's possible at all. My local library has a home energy audit kit which I plan to borrow, though I somehow doubt that has what I need to test for the issue you described.

At any rate, acquiring some new RAM is both a simple and desirable course of action and I'll be doing that. Meanwhile I'll continue to test and speculate.

Just FYI, Windows Memory Diagnostic Tool is showing 7% completion of a pass after around 12 hours set to Extended with cache Off. This thing is not what I'd call quick.
 
To be honest, the Windows memory tool isn't up to grade, and Memtest86 is the industry standard. Get your results from Memtest86 and don't bother with the built-in Windows memory tool.
 
It seems possible that driver or Windows updates solved my issue. I was going to replace my RAM when I visited Adelaide (my capital city) but I didn't experience any instability for (nearly) 3 weeks prior to the visit. While I was there I swapped DIMMs with my brother. That was over a week ago and both of our systems have been stable since. If a few more weeks go by without any issues, I'll consider this solved.

For the record, Windows Memory Diagnostic took 8 days to complete a pass. It was agonizing.

Thanks again for your help, kemical!
 