Windows 10 Frequent, Semi-random BSODs (Thread stuck in device driver)

SpaceKitkat

Member
Joined
Sep 17, 2016
Messages
42
As the title says, almost every time I launch a video game (specifically, a Minecraft mod), I get a BSOD with the error "THREAD_STUCK_IN_DEVICE_DRIVER".
I'm also getting them more and more frequently just using Google Chrome, even just now while trying to get the minidump log.
Checking the log, there was something about atikmdag.sys, and my problem seems to be very similar to this thread:
However, that thread's main solution came from someone checking the logs and listing drivers that needed to be removed, and I don't think I have any of the same drivers.

After a bit of googling, a couple "easy" solutions were to update my BIOS and reinstall GPU drivers.
So I uninstalled all of my display drivers, deleted everything AMD, used DDU (Display Driver Uninstaller), restarted into my BIOS, updated my BIOS successfully, installed the new GPU drivers (Radeon Crimson edition software, for an R9 290), and restarted my computer.

I've also performed several Windows diagnostics, including a RAM check, disk checks on all of my hard drives, and a "sfc /scannow" via CMD. All have come up with no errors.
I tried to add the minidump W7F log, but it said the file size was too large.. lol (RIP).
I've uploaded it to
but if it needs to be uploaded elsewhere, just let me know.
If any other logs are needed, I'll supply them ASAP. Any help is very much appreciated, as I work from home and this is currently my only available computer!
 


Solution
I had a look through your thread and signed up to chime in.

I have an R9 290 and the exact same problem; it has plagued me for the majority of the year. You may stop butchering your PC now.

It is categorically a result of AMD's drivers.

I am not exactly sure at what point the issue arose, but the same very frequent yet erratic BSOD/crashes on wake up have been present for the last dozen or more driver versions.

A poster on another forum found a BIOS-altering solution that I didn't want to replicate, but this is what he discovered. It seems related to a new driver-based power-saving system that lowers the clock speeds of your card during idle situations. This was "improved" at some point over the 16.xxxx series of...
Should I just go ahead with the Furmark test, with the new PSU?
Did you also clean the GPU heatsink and fan? If so, try Furmark again, making sure you monitor temps (oh, and if you send another HWiNFO64 result, please keep it in the original format).

What is airflow like in the case? Ideally you want plenty of air moving around but please let us know.
 


Yes, cleaned the GPU's heatsink and fans. It blew dust everywhere :P I clearly need to clean them more often, whoops.
I think that was the original format.. .CSV? I don't know what the default format is or if/why it changed. Sorry :o
Airflow is great, I have 4 total fans in the case pushing air outward in each direction. One side, top, back, and front. (not including the PSU's fan on the bottom, or the CPU & GPU's heatsinks and fans). I'll try another Furmark test this morning when I get a chance. Thanks for all of the help so far :)
 


Sounds good... Looking forward to seeing those results.
 


Two more BSODs, I'm not sure what else to do. :(

Edit: I just realized my previous post never actually posted, and I completely forgot what was in it. whoops. I'll just go off the last post; trying another furmark test.
 



Code:
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

Use !analyze -v to get detailed debugging information.

BugCheck EA, {ffffaf01fd58f800, 0, 0, 0}

*** WARNING: Unable to verify timestamp for atikmdag.sys
*** ERROR: Module load completed but symbols could not be loaded for atikmdag.sys
Probably caused by : dxgkrnl.sys ( dxgkrnl!TdrTimedOperationBugcheckOnTimeout+3f )

Followup:     MachineOwner
Hi,
as you can see, it's the same again. If you could run the Furmark test asked about above, that would be great.
Code:
DEFAULT_BUCKET_ID:  GRAPHICS_DRIVER_FAULT

Please use the DDU again in safe mode and then install this driver:
Link Removed

Post any new dumps
 


I finished reinstalling the drivers with DDU in safemode again, and ran a short Furmark test. Once again, the temps were hovering around 92-94 at max load, so I ended it around 7 minutes in. I've uploaded the HWiNFO temperature log to:
Link Removed


Edit: I also gamed a bit the last couple days, and monitoring the temps on my second monitor, they never even reached 80C. :/

Edit2: quick update, another BSOD :( rip. They seem to happen when I'm alt-tabbing between gaming and my second monitor, using google chrome or whatever.
 


And another one bites the dust.


Starting to think my only option now is to reinstall windows.
or maybe bite the bullet and get a new graphics card. idk.
 



Code:
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

Use !analyze -v to get detailed debugging information.

BugCheck EA, {ffffc200219c1080, 0, 0, 0}

*** WARNING: Unable to verify timestamp for atikmdag.sys
*** ERROR: Module load completed but symbols could not be loaded for atikmdag.sys
Probably caused by : dxgkrnl.sys ( dxgkrnl!TdrTimedOperationBugcheckOnTimeout+3f )

Followup:     MachineOwner
Hi Kitkat,
apologies for not posting sooner.

I can see the AMD driver in the call stack, so the card or its driver is definitely part of the issue.

Can you try a couple of things please.
First, try running on a single monitor. If I remember correctly, one monitor is connected via DVI and the other via HDMI. Try removing the HDMI-connected monitor.

Secondly, if the BSOD continues, try removing the graphics card and using the on-chip graphics that's part of your i5-4670K.

Regarding temperature.

The temps you give for normal gaming sound about right. Furmark will stress the card, so temps will be higher, although yours sound a little too high.

I also gamed a bit the last couple days, and monitoring the temps on my second monitor, they never even reached 80C. :/

So from this you mean when only using the second monitor your card runs cooler?
 


No worries, I've been posting a bit slowly as well.

I can try running a single monitor over the weekend, but I use two for work. Using just one is incredibly inefficient :( so that test may have to wait a bit. However, my main monitor is using DVI and my (relatively new 144Hz) second monitor is using DisplayPort (which I'd never used until now). Would it be worth it to swap out the DVI and/or DisplayPort cables for HDMI/another DVI/etc? I have extras of each, except only one DisplayPort cable.

Hmmm.. removing the graphics card may also have to wait until the weekend.


Sorry, poor wording on my part, lol. I just meant that, while gaming on my main/first monitor, I have HWiNFO up on my second monitor and every 5-10 minutes I would glance over at it and see the temps at a steady 65-70C, and the max temp it had reached was 77 during those gaming sessions.
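Rather than glancing at the sensor window every few minutes, the logged CSV can be summarized after a session. This is only a rough sketch in Python: the column name `GPU Temperature [C]` and the 90C limit are assumptions made for illustration, and the real header in a HWiNFO log depends on the card and the HWiNFO version.

```python
import csv
import io

# Hypothetical column name; check the header row of the actual log.
TEMP_COLUMN = "GPU Temperature [C]"


def summarize_temps(csv_text, column=TEMP_COLUMN, limit=90.0):
    """Return (min, max, samples_over_limit) for one temperature column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    temps = []
    for row in reader:
        value = (row.get(column) or "").strip()
        try:
            temps.append(float(value))
        except ValueError:
            continue  # skip blank cells or repeated header rows
    if not temps:
        raise ValueError(f"no numeric values found in column {column!r}")
    over = sum(1 for t in temps if t > limit)
    return min(temps), max(temps), over


# Made-up sample rows in the 65-77C range described above.
sample = """Time,GPU Temperature [C]
20:00:01,65
20:00:03,70
20:00:05,77
"""

if __name__ == "__main__":
    print(summarize_temps(sample))
```

With a real log, the file contents would be read from disk and passed in place of `sample`.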

I googled the error "Module load completed but symbols could not be loaded for atikmdag.sys" out of curiosity, and a few forums talked about how it was a part of CCC (Radeon's control center for the GPU). A few people who installed drivers only, and not the CCC, stopped getting BSODs with that error. Sadly, every time I've gone to the AMD website for drivers, I haven't seen any GPU drivers only. It's just the Radeon Software Crimson Edition (the new name for the CCC)
 


atikmdag.sys
This actually relates more to the driver than the CCC and the type of crash you're getting. This is called a TDR (Timeout Detection and Recovery) and can be caused by many different things. Basically the system detects an issue with the card and tries to reset the driver. If this doesn't happen within 2 seconds then a BSOD ensues.
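As a toy illustration of that "reset within the time limit or crash" idea, here is a small Python sketch. It is not how Windows implements it; `reset_with_watchdog` and the two fake reset functions are made up for the example, and the timings are scaled down from the 2 seconds mentioned above so the demo finishes quickly.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError


def reset_with_watchdog(reset_driver, timeout):
    """Run reset_driver(); 'recovered' if it returns in time, else 'bugcheck'."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(reset_driver)
    try:
        future.result(timeout=timeout)
        return "recovered"
    except TimeoutError:
        return "bugcheck"
    finally:
        # Do not wait for a stuck reset; the watchdog has already decided.
        pool.shutdown(wait=False)


def quick_reset():
    time.sleep(0.05)  # stand-in for a driver that recovers promptly


def stuck_reset():
    time.sleep(0.5)  # stand-in for a driver that hangs past the limit


if __name__ == "__main__":
    print(reset_with_watchdog(quick_reset, timeout=0.2))
    print(reset_with_watchdog(stuck_reset, timeout=0.2))
```

The point of the sketch is only that the crash is triggered by the timeout itself, which is why so many unrelated causes (heat, power, drivers, hardware) can end in the same blue screen.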

This help page from AMD will explain a little more about the type of crash you're getting (TDR) and what its possible causes may be:
Link Removed
You'll notice we have covered a lot of it but it's good info anyway.

Would it be worth it to swap out the DVI and/or Display port cables for HDMI/another DVI/etc?
Try using all DVI if you can.

Post back with any updates
 


Hello! Sorry about the late reply, I've been really busy lately. Sadly, I haven't had the time to test using all DVI, or removing my graphics card or even just one monitor. However, as I think I previously said (I don't remember tbh) I'm usually using google chrome on my second monitor, while I either work or play video games on my first/main monitor. So I figured I'd try using Mozilla Firefox, because why not? I was a little desperate at that point lol. Ever since I switched to Mozilla, I haven't had a single BSOD since the last time I uploaded the minidump files. That said, I don't like Mozilla, and the reason I switched from Mozilla to Chrome several years ago was because it was so slow compared to Chrome. :( I suppose I'll just have to deal with it for now, until I can find out the exact root of the problem of my BSODs.
 


Well, at least you're not blue screening, which is something. Let us know how things go and please post any new dump files.
 


Can confirm, it is 100% triggered, at least partially, by Chrome being open. I can't confirm whether it's *just* chrome, or chrome being on the second monitor, or Chrome up while gaming.. but after using Firefox for nearly 2 weeks, I haven't had a single BSOD without Chrome up. (had Chrome up one day just to test it again, and I got one BSOD.)
I wish I knew exactly what the cause is, because I really prefer Chrome to Firefox, but for now.. I don't really have (or want to spare, for lack of a better reason) the time to test more.
Thanks for all of your help so far, and if you have any ideas or suggestions please let me know! :)
 


I feel like I've gone in circles now. I got 2 BSODs today, using only Firefox on my second monitor while gaming/working (also got an update while working which restarted my PC, because I went afk for a few minutes. :mad: Windows 10!)
I had forgotten that, when I previously used firefox, I disabled hardware acceleration (forgot why), so I enabled it again about a week ago. Didn't have any problems, except when I accidentally opened chrome while gaming & got a BSOD. Thought it was just chrome causing it, and now I realize I'm not even "safe" with firefox. :(
No, I only have (I think) 2 chrome extensions at the moment; ad block and lights out.
I was getting error messages, but it was with a particular website, and (I believe) they've fixed their website's problem (other people were getting the same issue)
.....is it possible hardware acceleration is causing it, just from using web browsers? Or is it probably something deeper, that hardware acceleration is just triggering, or is it not related to hardware acceleration at all and it's just purely coincidence since I enabled it? lol

I uploaded the two minidumps from today, as well as the one from about a week ago.
 



...is it possible hardware acceleration is causing it, just from using web browsers? Or is it probably something deeper, that hardware acceleration is just triggering, or is it not related to hardware acceleration at all and it's just purely coincidence since I enabled it? lol
Very possible actually and I'd disable it asap..

I'll debug those dumps for you, back shortly.
 


Code:
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

Use !analyze -v to get detailed debugging information.

BugCheck EA, {ffffb80b6d08a800, 0, 0, 0}

*** WARNING: Unable to verify timestamp for atikmdag.sys
*** ERROR: Module load completed but symbols could not be loaded for atikmdag.sys
Probably caused by : dxgkrnl.sys ( dxgkrnl!TdrTimedOperationBugcheckOnTimeout+3f )

Followup:     MachineOwner
Hi,
all three dump files were the same and it appears the graphics issue is back. I must admit I did wonder why the issue suddenly changed from GPU to browser; it could indicate that a third party is involved, or possibly a hardware issue. I've tried to trace the fault in the dump but it just loops back to the call stack.
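For anyone comparing several dump summaries by hand, a check like "are these all the same crash?" can be sketched in a few lines of Python. This only parses the short summary text shown in the Code blocks in this thread; the function names are made up for the example, and it ignores the first bugcheck argument, which is an address that differs between crashes.

```python
import re

BUGCHECK_RE = re.compile(r"BugCheck\s+([0-9A-Fa-f]+),\s*\{([^}]*)\}")
CAUSE_RE = re.compile(r"Probably caused by\s*:\s*(\S+)\s*\(\s*([^)]+?)\s*\)")


def parse_summary(text):
    """Pull the bugcheck code, arguments, module and symbol from a summary."""
    bug = BUGCHECK_RE.search(text)
    cause = CAUSE_RE.search(text)
    if not bug or not cause:
        raise ValueError("not a recognizable bugcheck summary")
    return {
        "code": bug.group(1).upper(),
        "args": [a.strip() for a in bug.group(2).split(",")],
        "module": cause.group(1),
        "symbol": cause.group(2),
    }


def same_failure(summaries):
    """True if every summary shares bugcheck code, module and symbol."""
    keys = {
        (s["code"], s["module"], s["symbol"])
        for s in map(parse_summary, summaries)
    }
    return len(keys) == 1


# Two of the summaries posted earlier in this thread.
dump_a = """BugCheck EA, {ffffaf01fd58f800, 0, 0, 0}
Probably caused by : dxgkrnl.sys ( dxgkrnl!TdrTimedOperationBugcheckOnTimeout+3f )"""
dump_b = """BugCheck EA, {ffffc200219c1080, 0, 0, 0}
Probably caused by : dxgkrnl.sys ( dxgkrnl!TdrTimedOperationBugcheckOnTimeout+3f )"""

if __name__ == "__main__":
    print(parse_summary(dump_a))
    print(same_failure([dump_a, dump_b]))
```

Bugcheck EA is the THREAD_STUCK_IN_DEVICE_DRIVER error from the thread title, so matching summaries like these point at the same graphics timeout each time rather than at separate problems.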

Let's try and determine if it's either the gpu or software.

You could try running the driver verifier. First, however, you'll need to create a Link Removed. This is in case the verifier causes a BSOD boot loop. If it does, you boot from the recovery media, enter safe mode and turn the verifier off. The offending driver should be in the dump files.
Link Removed

If the verifier fails to blue screen after 24hrs, turn it off.

Enable the on-chip GPU carried by the processor; you may have to do this in the BIOS, so do check if it's enabled or not. Turn the machine off and remove the GPU from its slot. Connect just one monitor to your on-board connection and boot up. See if a blue screen occurs. If it doesn't, then chances are it's the GPU that's at fault. If you do get a BSOD please post asap.. :) Good luck with the testing.
 


I don't know if this means anything, but hardware acceleration was never on in Chrome, apparently.

Update: I've got everything on my SSD with Windows on it copied to externals, for a possible future fresh-install of Windows (should it come to it). I'm waiting on my brother to return the USB flash drive I lent him, so I can create a windows installer/ISO on it, and a backup of windows.
For now, I found an HDMI long enough to replace the DisplayPort cable for my second monitor, and my extra DVI cable should be arriving soon to test it with that.

I don't think I ever mentioned this, and if I didn't, I meant to! I had upgraded my second monitor (previously a standard ASUS monitor) to a 144hz ASUS "gaming" monitor. I didn't know I needed a DisplayPort cable to get the full usage out of the 144hz monitor, so I'd ordered one and plugged it in, and that was only a week or so before I started getting BSODs (late August/early Sept).

This might just be my mind rambling, but could there be a conflict with the DisplayPort cable, the new 144hz monitor, or the fact that it's my second monitor and I'm using an extended windows display setting? Should I be testing everything with my previous second monitor? with DVI, HDMI, or Display Port connectors?
I don't think I've actually spent more than a few hours without my second monitor on, because I *absolutely* need it for work, but when I'm gaming I could try just using only one monitor and attempting to trigger the BSODs using Chrome, which might confirm a conflict with the second monitor?.. idk.

If you think there could be a conflict with any of the above, what should I be testing for first/second? (I'm planning on testing the driver verifier today, when I get my USB flash drive back and some time to test it)
 


-sigh-
Immediately after I posted that, I turned off my second monitor and opened a couple more tabs in Chrome, one being youtube. A video was playing, and I wasn't hearing any sound (wasn't sure if the video even had sound or not) so I was going to test my sound by opening up Windows' Groove player and play some music really quick (fastest thing I could think of).... aaaaaand I got a BSOD as it was launching Groove. No second monitor, 3 total tabs open in Chrome, and nothing else open in Windows (had just restarted the computer).

Could there be a conflict between my graphics driver, and my separate sound card (ASUS Xonar DGX)?... Over the last couple years, I don't think I've ever had any serious problems between the two, and I'm not sure why I would be getting problems now. :(
 



Code:
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

Use !analyze -v to get detailed debugging information.

BugCheck EA, {ffffd507b9966080, 0, 0, 0}

*** WARNING: Unable to verify timestamp for atikmdag.sys
*** ERROR: Module load completed but symbols could not be loaded for atikmdag.sys
Probably caused by : dxgkrnl.sys ( dxgkrnl!TdrTimedOperationBugcheckOnTimeout+3f )

Followup:     MachineOwner
Hi,
I don't know if this means anything, but hardware acceleration was never on in Chrome, apparently.
Ok, thank you. I think Chrome crashing was more of a red herring than anything else and the true issue lies elsewhere.
This might just be my mind rambling, but could there be a conflict with the DisplayPort cable, the new 144hz monitor, or the fact that it's my second monitor and I'm using an extended windows display setting? Should I be testing everything with my previous second monitor? with DVI, HDMI, or Display Port connectors?
Yup it could be down to a number of things which is why we test in a methodical manner. Let's first try and determine if it's the card or software.

Please try running the verifier as outlined above.

If you do not get a bsod within 24hrs turn it off.

Please enable the on-board GPU (actually I think it's enabled already, but do check via the BIOS).

Now please remove the graphics card.

Plug your first monitor into the on board connectors and boot up. See if the machine blue screens.

I was checking your RAM via the dump file and things didn't look right. The dump file is showing that you have 4 sticks of RAM, although two of the sticks show zero speed and hardly any details.
Can you post how many sticks you have and how much RAM that amounts to (I see 33GB approx).
Also please post which manufacturer created the RAM.

Good luck with the verifier and fingers crossed it catches something!
 

