About Windows crashes
Operating system crashes are quite different from application crashes, system hangs or other problems. In most cases, operating systems crash as a protective measure. When the OS discovers
that critical devices are failing or that an internal operating system state has become inconsistent because of
possible viruses, bad device drivers or even RAM failures, it is generally safer to stop immediately. Otherwise, continuing
operations could allow far more serious damage, such as application data corruption or loss.
Two out of three system crashes are caused by third-party drivers taking inappropriate actions (such as writing to non-existent
memory) in kernel mode, where they have direct access to the OS kernel and to the hardware.
In contrast, drivers operating in user mode, with only indirect access to the OS kernel, cannot directly cause a crash. A
small percentage of crashes are caused by hardware issues such as bad memory, and even fewer by faults in the OS itself. And some
causes are simply unknown.
Thanks for the memory dump
A memory dump is the ugliest best friend you'll ever have. It is a
snapshot of the state of the computer system at the point
in time that the operating system stopped. And, of the vast amount of
not-very-friendly looking data that a dump file contains,
you will usually only need a few items that are easy to grasp and
use. With the introduction of Windows 8, the OS now creates
four different kinds of memory dump: Complete, Kernel and Minidump, plus the
new Automatic memory dump.
1. Automatic memory dump
Location: %SystemRoot%\Memory.dmp
Size: ≈size of OS kernel
The Automatic memory dump is the default option selected when you
install Windows 8. It was created to support the "System
Managed" page file configuration which has been updated to reduce the
page file size on disk. The Automatic memory dump option
produces a Kernel memory dump; the difference is that, when you select
Automatic, it allows the SMSS process to reduce the page
file to less than the size of RAM.
2. Complete memory dump
Location: %SystemRoot%\Memory.dmp
Size: ≈size of installed RAM plus 1MB
A complete (or full) memory dump is about equal to the amount of
installed RAM. With many systems having multiple GBs, this
can quickly become a storage issue, especially if you are having more
than the occasional crash. Normally I do not advise
saving a full memory dump because it takes so much space and is
generally unneeded. However, there are cases, such as working
with Microsoft (or another vendor) to find the cause of a very
complex problem, in which a full memory dump would be very helpful.
Therefore, stick to the automatic dump, but be prepared to switch the
setting to generate a full dump on rare occasions.
3. Kernel memory dump
Location: %SystemRoot%\Memory.dmp
Size: ≈size of physical memory "owned" by kernel-mode components
Kernel dumps are roughly equal in size to the RAM occupied by the
Windows 8 kernel. On my test system with 4GB RAM running
Windows 8 on a 64-bit processor the kernel dump was about 336MB.
Since, on occasion, dump files have to be transported, I
compressed it, which brought it down to 80MB. One advantage to a
kernel dump is that it contains the binaries which are needed
for analysis. The Automatic dump setting creates a kernel dump file
by default, saving only the most recent, as well as a
minidump for each event.
4. Small or minidump
Location: %SystemRoot%\Minidump
Size: At least 64KB on x86 and 128KB on x64 (279KB on my W8 test PC)
Minidumps include the memory pages pointed to by the registers (given their values at the point of the fault), as well as the stack
of the faulting thread. What makes them small is that they do not contain any of the binary or executable files that were
in memory at the time of the failure.
However, those files are critically important for subsequent analysis
by the debugger. As long as you are debugging on the
machine that created the dump file, WinDbg can find them in the
System Root folders (unless the binaries were changed by a
system update after the dump file was created). Alternatively, the
debugger should be able to locate them automatically through
SymServ, Microsoft's online store of symbol files. Windows 8 creates
and saves a minidump for every crash event, essentially
providing a historical record of all events for the life of the
system.
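To see which of these files a machine has actually produced, the two locations can be listed with a short script. This is a minimal sketch, assuming Python 3 is installed and that the default locations given above are in use; the function name find_dump_files is made up for this illustration, and reading these folders normally requires administrator rights.

```python
import os
from pathlib import Path


def find_dump_files(system_root):
    """Return (path, size_in_bytes) for Memory.dmp and every minidump found.

    system_root is the Windows folder (the value of %SystemRoot%).
    Locations that do not exist are skipped rather than treated as errors.
    """
    root = Path(system_root)
    found = []

    # Automatic, Kernel and Complete dumps all share this one file.
    main_dump = root / "Memory.dmp"
    if main_dump.is_file():
        found.append((main_dump, main_dump.stat().st_size))

    # One minidump is kept per crash event.
    minidump_dir = root / "Minidump"
    if minidump_dir.is_dir():
        for dump in sorted(minidump_dir.glob("*.dmp")):
            found.append((dump, dump.stat().st_size))

    return found


if __name__ == "__main__":
    # Falls back to the usual default if %SystemRoot% is not set.
    for path, size in find_dump_files(os.environ.get("SystemRoot", r"C:\Windows")):
        print(f"{size / 1024:>12.0f} KB  {path}")
```

An empty result on a machine that has crashed suggests that the dump settings, or the permissions of the account running the script, are worth a second look.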
Configure W8 to get the right memory dumps
While the default configuration for W8 sets the OS to generate the memory dump format you will most likely need, take a quick
look to be sure. From the W8 Style Menu, simply type "control panel" (or only the first few letters in many cases), which will
auto-magically take you to the Apps page, where you should see a white box surrounding "Control Panel"; hitting Enter will
take you to that familiar interface.
Make your way to Control Panel in W8.
The path to check Windows 8 Memory Dump Settings, beginning at Control Panel, follows:
Control Panel | System and Security | System | Advanced system settings | Startup and Recovery | Settings
Once at the Startup and Recovery dialog box, ensure that "Automatic memory dump" is selected. You will probably also want
to ensure that both "Write an event to the system log" and "Automatically restart" (which should also be on by default) are
checked.
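The same setting can also be checked without clicking through the dialog box. The sketch below assumes that the choice is stored in the registry value CrashDumpEnabled under HKLM\SYSTEM\CurrentControlSet\Control\CrashControl, with 7 meaning Automatic. Neither the key nor the numeric codes come from the steps above; they are assumptions drawn from Microsoft's documentation and should be verified on your own system. The function names are invented for the illustration.

```python
# Assumed meaning of the CrashDumpEnabled registry value (verify before relying on it).
DUMP_TYPES = {
    0: "None",
    1: "Complete memory dump",
    2: "Kernel memory dump",
    3: "Small memory dump (minidump)",
    7: "Automatic memory dump",
}

# Assumed registry location of the crash dump settings, relative to HKEY_LOCAL_MACHINE.
CRASH_CONTROL_KEY = r"SYSTEM\CurrentControlSet\Control\CrashControl"


def describe_dump_setting(value):
    """Translate a CrashDumpEnabled number into the name used in the dialog box."""
    return DUMP_TYPES.get(value, f"Unrecognized setting ({value})")


def read_dump_setting():
    """Read CrashDumpEnabled from the registry. Works on Windows only."""
    import winreg  # imported here so the rest of the sketch loads on any OS

    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, CRASH_CONTROL_KEY) as key:
        value, _value_type = winreg.QueryValueEx(key, "CrashDumpEnabled")
    return value
```

On a Windows machine, print(describe_dump_setting(read_dump_setting())) would be expected to name the option that is selected in the Startup and Recovery dialog box.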
Install WinDbg
System Requirements
To set your PC up for WinDbg-based crash analysis, you will need the following:
• 32-bit or 64-bit Windows 8/R2/Server 2012/Windows 7/Server 2008
Depending on the processor you are running the debugger on, you can use either the 32-bit or the 64-bit debugging tools. Note
that it is not important whether the dump file was made on an x86-based or an x64-based platform.
• The Debugging Tools for Windows portion of the Windows SDK for Windows 8, which you can download for free from Microsoft.
• Approximately 103MB of hard disk space (not including storage space for dump files or for symbol files)
• Live Internet connection
Download WinDbg
First download sdksetup.exe, a small file (969KB) that launches the Web setup, from which you select what components to install.
• Standard download.
• Automated download (the download will start on its own):
Space required
Ignore the disk space required of 1.2GB; you will only be installing a
small portion of the kit. On my test machine the installation
process predicted 256.2MB but only needed 103MB according to File
Explorer following installation.
Run sdksetup.exe
Install the Software Development Kit (SDK) to the machine that you will use to view memory dump files.
A. Launch sdksetup.exe.
B. Specify location:
The suggested installation path follows:
C:\Program Files (x86)\Windows Kits\8.0\
If you are downloading to install on a separate computer, choose the second option and set the appropriate path.
C. Accept the License Agreement
D. Remove the check marks for all but Debugging Tools for Windows
What are symbols and why do I need them?
Now that the debugger is installed and before calling up a dump file you have to make sure it has access to the symbol files.
Symbol tables are a byproduct of compilation. When a program is compiled, the source code is translated from a high-level
language into machine code. At the same time, the compiler creates a symbol file with a list of identifiers, their locations
in the program, and their attributes. Since programs don't need this information to execute, it can be taken out and stored
in another file. This reduces the size of the final executable so it takes up less disk space and loads faster into memory.
But, when a program causes a problem, the OS only knows the hex address at which the problem occurred, not who was there and
what the person was doing. Symbol tables, available through the use of SymServ, provide that information.
SymServ (SymSrv)
SymServ (also spelled SymSrv) is a critically important utility provided by Microsoft that manages the identification of the
correct symbol tables to be retrieved for use by WinDbg. There is no charge for its use and it functions automatically in
the background as long as the debugger is properly configured, and has unfettered access to the symbol store at Microsoft.
Running WinDbg
From the W8 UI, right-click on the version of WinDbg you will use (x64 or x86), then select "Run as administrator" from the
bar that pops up from the bottom of the screen. You will then see a singularly unexciting application interface: a block of
gray. Before filling it in with data, you must tell it where to find the symbol files.
Setting the symbol file path
There is a massive number of symbol table files for Windows because
every build of the operating system, even one-off variants,
results in a new file. Using the wrong symbol tables would be like
finding your way through San Francisco with a map of Boston.
To be sure you are using the correct symbols, at WinDbg's menu bar,
select the following:
File | Symbol file path
In the Symbol search path window enter the following address:
srv*c:\cache*http://msdl.microsoft.com/download/symbols
Note that the path between the asterisks is the local folder where you want the symbols stored for future reference. For example, I store
the symbols in a folder called symbols at the root of my c: drive, thus:
srv*c:\symbols*http://msdl.microsoft.com/download/symbols
Make sure that your firewall allows access to msdl.microsoft.com.
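Because a single stray character in this string is enough to break symbol loading, it can help to assemble it programmatically. The following is a minimal sketch of the srv*folder*address form shown above; build_symbol_path is a hypothetical helper written for this illustration, not part of WinDbg.

```python
MICROSOFT_SYMBOL_STORE = "http://msdl.microsoft.com/download/symbols"


def build_symbol_path(cache_folder, store_url=MICROSOFT_SYMBOL_STORE):
    """Assemble a symbol path in the srv*<local cache>*<symbol store> form."""
    cache_folder = cache_folder.strip()  # stray spaces are an easy typo to make
    store_url = store_url.strip()
    if not cache_folder:
        raise ValueError("a local cache folder is required")
    if "*" in cache_folder or "*" in store_url:
        raise ValueError("asterisks are separators and cannot appear inside a field")
    return f"srv*{cache_folder}*{store_url}"


if __name__ == "__main__":
    print(build_symbol_path(r"c:\symbols"))
```

The printed string can be pasted into the Symbol search path window as it stands.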
How WinDbg handles symbol files
When opening a memory dump, WinDbg will look at the executable files (.exe, .dll, etc.) and extract version information. It
then sends a request containing this version information to SymServ at Microsoft, which locates the precise symbol tables
to draw information from. It won't download all symbols for the specific operating system you are troubleshooting; it will
download only what it needs.
Space for symbol files
The space needed to store symbols varies. In my W8 test machine, after running numerous crash tests, the folder was about
35MB. On another system, running W7, and on which I opened dump files from several other systems the folder was still under
100MB. Just remember that if you open files from additional machines (with variants of the operating system) your folder can
continue to grow in size.
Alternatively, you can opt to download and store the complete symbol packages from Microsoft. Before you do, note that, for each
symbol package, you should have at least 1GB of disk space free. That's because, in addition to space needed to store the
files, you also need space for the required temporary files. Even with the low cost of hard drives these days, the space used
is worth noting.
• Each x86 symbol package may require 750 MB or more of hard disk space.
• Each x64 symbol package may require 640 MB or more.
Symbol packages are non-cumulative unless otherwise noted, so if you are using an SP2 Windows release, you will need to install
the symbols for the original RTM version and for SP1 before you install the symbols for SP2.
Create a dump file
What if you don't have a memory dump to look at? No worries. You can generate one yourself. There are different ways to do
it, but the best way is to use a tool called NotMyFault created by Mark Russinovich.
Download NotMyFault
To get NotMyFault, go to the Windows Internals Book page at
SysInternals and scroll down to the Book Tools section where you
will see a download link. The tool includes a selection of options
that load a misbehaving driver (which requires administrative
privileges). After downloading, I created a shortcut from the desktop
to simplify access.
Keep in mind that using NotMyFault WILL CREATE A SYSTEM CRASH and, while I've never seen a problem using the tool, there are
no guarantees in life, especially in computers. So, prepare your system and have anyone who needs access to it log off for
a few minutes. Save any files that contain information that you might otherwise lose and close all applications. Properly
prepared, the machine should go down and reboot, and both a minidump and a kernel dump should be created.
Running NotMyFault
Launch NotMyFault and select the High IRQL fault (Kernel-mode) then . . . hit the Crash button. Your Frown-of-Frustration
will appear in a second; both a minidump and a kernel dump file will be saved and, if properly configured, your system will
restart.
When Windows 8 crashes, you see (1) the Frown-of-Frustration in the new BSOD. After restart you see (2) the offer to send
crash files to Microsoft. The final screen (3) lists the files that would be sent, displays the privacy statement and asks
you for permission to send them.
Over the W8 UI will be a band of blue with the message that "Your PC ran into a problem . . . ". If you click the "Send details"
button, Microsoft will use WinDbg and the command "!analyze" as part of an automated service to identify the root cause of
the problem. The output is combined with a database of known driver bug fixes to help identify the failure.
Launch WinDbg and (often) see the cause of the crash
Launch WinDbg by right-clicking on it from the W8 UI and selecting "Run as administrator" from the bar that pops up at the bottom
of the screen. Once the debugger is running, select the menu option
File | Open Crash Dump
and point it to the dump file you want to analyze. Note that WinDbg will open a dump file of any size: a minidump, kernel
dump or complete dump file. When offered to Save Workspace Information, say Yes; it will remember where the dump file is.
A command window will open. If this is the first time you are using
WinDbg on this system or looking at a dump file from another
system you have not loaded files for before, it may take a moment to
fill with information. This is because the debugger has
to identify the precise release of Windows then go to SymServ at
Microsoft and locate the corresponding symbol files and download
the ones it needs. In subsequent sessions this step is unneeded
because the symbols are saved on the hard drive. Once WinDbg
has the symbols it needs it will run an analysis and fill the window
with the results. This will include basic information
such as the version of WinDbg, the location and name of the dump file
opened, the symbol search path being used and even a
brief analysis offering, in this case,
Probably caused by : myfault.sys
which, of course, we know to be true (myfault.sys is the name of the driver for NotMyFault).
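This first-pass analysis can also be scripted. The Debugging Tools include a console debugger, kd.exe, and the sketch below builds a command line for it using its -z (dump file), -y (symbol path) and -c (initial command) options. None of this is part of the walkthrough above: the kd.exe location is an assumption based on the suggested installation path and may differ on your machine, and the function names are invented for the illustration.

```python
import subprocess

# Assumed location, derived from the suggested installation path; adjust as needed.
DEFAULT_KD = r"C:\Program Files (x86)\Windows Kits\8.0\Debuggers\x64\kd.exe"
DEFAULT_SYMBOLS = r"srv*c:\symbols*http://msdl.microsoft.com/download/symbols"


def build_analysis_command(dump_file, symbol_path=DEFAULT_SYMBOLS, kd_path=DEFAULT_KD):
    """Return the argument list that opens a dump, runs !analyze -v, then quits."""
    return [
        kd_path,
        "-z", dump_file,          # dump file to open
        "-y", symbol_path,        # symbol search path
        "-c", "!analyze -v; q",   # commands to run once the dump is loaded
    ]


def analyze_dump(dump_file):
    """Run the console debugger and return everything it printed (Windows only)."""
    completed = subprocess.run(
        build_analysis_command(dump_file),
        capture_output=True,
        text=True,
        check=False,
    )
    return completed.stdout
```

Calling analyze_dump(r"C:\Windows\Memory.dmp") from an elevated prompt should return the same kind of text that WinDbg shows in its command window, which can then be saved or searched.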
WinDbg Error Messages
If WinDbg reports a *** WARNING or an *** ERROR, the solution is usually simple. The following lists the common messages,
what they mean and how to resolve them.
*** WARNING: Unable to verify timestamp for ntoskrnl.exe
*** ERROR: Module load completed but symbols could not be loaded for ntoskrnl.exe
This is important. When you see these two messages near the beginning of the output from WinDbg, it means that you will not
get the analysis that you need. This is confirmed after the "Bugcheck Analysis" is automatically run, and the message
***** Kernel symbols are WRONG. Please fix symbols to do analysis
is displayed.
Likely causes follow:
• No path/wrong path; a path to the symbol files has not been set or the path is incorrect (look for typos such as a blank
white space). Check the Symbol Path.
• Failed connection; check your Internet connection to make sure it is working properly.
• Access blocked; a firewall blocked access to the symbol files or the files were damaged during retrieval. See that no firewall
is blocking access to msdl.microsoft.com (it may only be allowing access to www.microsoft.com).
Note that if a firewall initially blocks WinDbg from downloading a
symbol table, it can result in a corrupted file. If unblocking
the firewall and attempting to download the symbol file again does
not work, the file remains damaged. The quickest fix is
to close WinDbg, delete the symbols folder (which you most likely set
at c:\symbols), and unblock the firewall. Next, reopen
WinDbg and a dump file. The debugger will recreate the folder and
re-download the symbols.
Do not go further with your analysis until this is corrected.
If you see the following error, no worries:
*** WARNING: Unable to verify timestamp for myfault.sys
*** ERROR: Module load completed but symbols could not be loaded for myfault.sys
This means that the debugger was looking for information on
myfault.sys. However, since it is a third-party driver, there
are no symbols for it; Microsoft does not store symbols for all
third-party drivers. The point is that you can ignore this
error message. Vendors do not typically ship drivers with symbol
files and they aren't necessary to your work; you can pinpoint
the problem driver without them.
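The distinction between the two cases can be automated when debugger output has been saved to a text file. The sketch below is only an illustration: it searches for the *** ERROR line quoted above and treats ntoskrnl.exe as the one module whose symbols must load. The function name and the contents of the must-fix set are assumptions, not anything WinDbg provides.

```python
import re

ERROR_LINE = re.compile(
    r"\*\*\* ERROR: Module load completed but symbols could not be loaded for (\S+)"
)

# Windows modules whose symbols must load before the analysis can be trusted.
# Only ntoskrnl.exe is listed here; extend the set as needed.
WINDOWS_CORE_MODULES = {"ntoskrnl.exe"}


def check_symbol_errors(debugger_output):
    """Split modules with missing symbols into 'must fix' and 'safe to ignore'."""
    must_fix, can_ignore = [], []
    for module in ERROR_LINE.findall(debugger_output):
        target = must_fix if module.lower() in WINDOWS_CORE_MODULES else can_ignore
        if module not in target:  # report each module once
            target.append(module)
    return must_fix, can_ignore
```

If the first list comes back non-empty, fix the symbol path, connection or firewall before reading anything else in the output.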
So, what caused the crash?
As mentioned above, when you open a dump file with WinDbg it
automatically runs a basic analysis that will often nail the
culprit without your giving the debugger any direct commands, as
in the line that reads "Probably caused by : myfault.sys".
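When several dump files need triage, that one line is usually the first thing worth extracting. Here is a minimal sketch that pulls the module name out of saved debugger output; probable_cause is a hypothetical helper, and it assumes the line keeps the wording quoted above.

```python
import re

PROBABLE_CAUSE = re.compile(r"^Probably caused by\s*:\s*(\S+)", re.MULTILINE)


def probable_cause(debugger_output):
    """Return the module named on the 'Probably caused by' line, or None."""
    match = PROBABLE_CAUSE.search(debugger_output)
    return match.group(1) if match else None
```

A result of None means the line was not found, which is a reason to open the dump by hand rather than proof that nothing is wrong.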
More information
Getting a little more information about the crash event and the suspect module is easy. Often, all you need are two commands
among the hundreds that the rather powerful debugger offers:
!analyze -v
and
lmvm
A new way to command WinDbg
Normally, you would type in the commands and parameters you need.
Things have changed, however, and Windows has too. If you take
a good look at the WinDbg interface, just below the "Bugcheck
Analysis" box, it says "Use !analyze -v to get detailed debugging
information", and the command is underlined and in blue. Yes,
it's a link. Just touch it and the command will be run for
you. But, in case you don't have a touch screen, a mouse will work
fine, or you can resort to the traditional method of typing the
command into the window at the bottom of the interface where you see
the prompt "kd>" (which stands for "kernel debugger").
Be sure to do it precisely; this is a case where syntax is key. For
instance, note the space between the command and the "-v".
The "v" or verbose switch tells WinDbg that you want all the details.
You can do the same where you see the link for myfault,
which will display metadata for the suspect driver.
Output from !analyze -v
The analysis provided by !analyze -v is a combination of English and programmer-speak, but it is nonetheless a great start.
In fact, in many cases you will not need to go any further. If you recognize the cause of the crash, you're probably done.
The !analyze -v command provides more detail about the system crash. In this
case it accurately describes what the test driver (myfault.sys)
was instructed to do: access an address at an interrupt level that
was too high.
Analysis
DRIVER_IRQL_NOT_LESS_OR_EQUAL (d1)
An attempt was made to access a pageable (or completely invalid) address at an interrupt request level (IRQL) that is too
high. This is usually caused by drivers using improper addresses.
Under Debugging Details the report suggests that the problem was a "WIN_8_DRIVER_FAULT" and that NotMyFault.exe was active.
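The bug check name and its hexadecimal code, as in the DRIVER_IRQL_NOT_LESS_OR_EQUAL (d1) line above, are the two items most worth recording for each crash. The sketch below extracts them from saved output. It assumes the header always has the form of an upper-case name followed by a hex code in parentheses on a line of its own, and parse_bugcheck is a name invented for this illustration.

```python
import re

# A bug check header is assumed to look like: SYMBOLIC_NAME (hex code)
BUGCHECK_HEADER = re.compile(
    r"^([A-Z][A-Z0-9_]+) \(([0-9a-fA-F]+)\)\s*$", re.MULTILINE
)


def parse_bugcheck(analysis_text):
    """Return (name, numeric code) for the first bug check header, or None."""
    match = BUGCHECK_HEADER.search(analysis_text)
    if match is None:
        return None
    name, hex_code = match.groups()
    return name, int(hex_code, 16)
```

Keeping the numeric code alongside the name makes it easy to notice when the same stop code recurs across several dumps.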
Stack dump
An important feature of the debugger's output using !analyze -v is
the stack text. Whenever looking at a dump file always
look at the far right end of the stack for any third-party drivers.
In this case we would see myfault. Note that the chronological
sequence of events goes from the bottom to the top; as each new task
is performed by the system it shows up at the top. In
this rather short stack you can see that myfault was active, then a
page fault occurred, and the system declared a BugCheck,
which is when the system stopped (Blue Screened).
One way to look at this is that when you see a third-party driver active on the stack when the system crashed, it is like
walking into a room and finding a body on the floor and someone standing over it with a smoking gun in his hand; it doesn't
mean that he is guilty but makes him suspect No.1.
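Scanning the right-hand column for third-party modules is mechanical enough to sketch in code. The example below assumes each stack line ends with a call site written as module!function+offset or module+offset, and it treats a small, hand-picked set of module names as belonging to Windows. Both the set and the function names are assumptions made for the illustration, and the stack lines in the usage note are invented rather than copied from real debugger output.

```python
# Module names treated as part of Windows for this sketch (an assumption; extend as needed).
WINDOWS_MODULES = {"nt", "hal", "win32k"}


def module_of(call_site):
    """Return the module part of a call site such as 'nt!KiPageFault+0x260'."""
    name = call_site.split("!", 1)[0]  # drop the function name, if any
    name = name.split("+", 1)[0]       # drop the offset, if any
    return name.lower()


def third_party_modules(stack_lines):
    """List non-Windows modules in the order they appear, top of stack first."""
    suspects = []
    for line in stack_lines:
        fields = line.split()
        # Skip blank lines and anything whose last column is not a call site.
        if not fields or not any(mark in fields[-1] for mark in "!+"):
            continue
        module = module_of(fields[-1])  # the call site is the right-most column
        if module not in WINDOWS_MODULES and module not in suspects:
            suspects.append(module)
    return suspects
```

Given the invented lines "00 nt!KeBugCheckEx", "01 nt!KiPageFault+0x260" and "02 myfault+0x1172", the function returns a list containing only myfault, the one module that the set above does not claim for Windows.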
Output from lmvm (or by selecting myfault)
Knowing the name of a suspect is not enough; you need to know where
he lives and what he does. That's where lmvm comes in.
It provides a range of data, from the image path (not all drivers
live in %systemroot%\system32\drivers), time stamp, image
size and file type (in this case a driver) to the company that made
it, the product it belongs to, version number and description.
Some companies even include contact information for technical
support. What the debugger reports, though, is solely dependent
upon what the developer included, which, in some cases, is very
little.
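If lmvm output has been copied into a text file, its labeled lines can be gathered into a simple lookup table. This is a rough sketch that assumes each detail is printed as a label, a colon and a value on one line; the exact labels depend on the debugger version and on what the driver's developer included, so the ones used in the usage note are illustrative rather than guaranteed. parse_lmvm is a name invented for this example.

```python
def parse_lmvm(output):
    """Collect the 'Label: value' lines of lmvm output into a dictionary."""
    details = {}
    for line in output.splitlines():
        # Split at the first colon only, so values such as C:\... stay intact.
        label, separator, value = line.partition(":")
        if separator and label.strip() and value.strip():
            details[label.strip()] = value.strip()
    return details
```

With the result in hand, something like details.get("CompanyName", "unknown") gives a starting point for finding the vendor, provided the driver carries that field at all.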
After you find the vendor's name, go to its Web site and check for
updates, knowledge base articles, and other supporting
information. If such items do not exist or do not resolve the
problem, contact them. They may ask you to send along the debugging
information (it is easy to copy the output from the debugger into an
e-mail or Word document) or they may ask you to send
them the memory dump (zip it up first, both to compress it and
protect data integrity).
If you have any questions regarding the use of WinDbg, check out the
WinDbg help file. It is excellent. And, when reading
about a command be sure to look at the information provided about the
many parameters such as "-v" which returns more (verbose)
information.
The other third
While it's true that, by following the instructions above, you'll likely know the cause of two out of three crashes immediately,
that does leave that annoying other third. What do you do then? Well, the list of what could have caused the system failure
is not short; it can range from a case fan failing, allowing the system to overheat, to bad memory.
Sometimes it's the hardware
If you have recurring crashes but no clear or consistent reason, it may be a memory problem. Two good ways to check memory
are the Windows Memory Diagnostic tool and Memtest86. To run the former, go to Control Panel, enter "memory" into its search box, then select
"Diagnose your computer's memory problems".
This simple diagnostic tool is quick and works great. Many people discount the possibility of a memory problem, because such problems
account for such a small percentage of system crashes. However, they are often the cause that keeps you guessing the longest.
Is Windows the culprit?
In all probability: no. For all the naysayers who are quick to blame
Redmond for such events, the fact is that Windows is
very seldom the cause of a system failure. But, if ntoskrnl.exe
(Windows core) or win32k.sys (the driver that is most responsible
for the "GUI" layer on Windows) is named as the culprit (and they
often are), don't be too quick to accept it. It is far
more likely that some errant third-party device driver called upon a
Windows component to perform an operation and passed
a bad instruction, such as telling it to write to non-existent
memory. So, while the operating system certainly can err, exhaust
all other possibilities before you blame Microsoft.
What about my antivirus driver?
Often you may see an antivirus driver named as the culprit but there
is a good chance it is not guilty. Here's why: for antivirus
code to work it must watch all file openings and closings. To
accomplish this, the code sits at a low layer in the OS and
is constantly working, so it will often be on the stack of
function calls that was active when the crash occurred.
Missing vendor information?
Some driver vendors don't take the time to include sufficient information with their modules. So if lmvm doesn't help, try
looking at the subdirectories on the image path (if there is one). Often one of them will be the vendor name or a contraction
of it. Another option is to search Google. Type in the driver name and/or folder name. You'll probably find the vendor as
well as others who have posted information regarding the driver.