Debugging Applications for Microsoft® .NET and Microsoft Windows® (Pro-Developer)

Here's the first section of a sample Dr. Watson log:

    Application exception occurred:
        App: (pid=1796)
        When: 1/2/2003 @ 13:42:56.208
        Exception number: c0000005 (access violation)

The header information tells you what caused the crash—in this case, an application exception. The exception numbers for some crashes might not get translated into a human-readable description, such as "access violation" for exception number 0xC0000005. You can see all the possible exception number values by searching for STATUS_ in WINNT.H. The crash values are documented as EXCEPTION_ values returned by the GetExceptionCode function, but the real values are in all the STATUS_ #defines. Once you translate back into the EXCEPTION_ value, you can look up the description for the crash in the GetExceptionCode documentation.
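Here's a minimal sketch of that translation step. The table holds only a handful of well-known STATUS_ values from WINNT.H, not the full list, and the helper name describe_exception is my own invention for illustration.

```python
# A small, hand-picked subset of the STATUS_ values defined in WINNT.H.
STATUS_NAMES = {
    0x80000002: "STATUS_DATATYPE_MISALIGNMENT",
    0x80000003: "STATUS_BREAKPOINT",
    0x80000004: "STATUS_SINGLE_STEP",
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC0000006: "STATUS_IN_PAGE_ERROR",
    0xC000001D: "STATUS_ILLEGAL_INSTRUCTION",
    0xC000008C: "STATUS_ARRAY_BOUNDS_EXCEEDED",
    0xC0000094: "STATUS_INTEGER_DIVIDE_BY_ZERO",
    0xC0000096: "STATUS_PRIVILEGED_INSTRUCTION",
    0xC00000FD: "STATUS_STACK_OVERFLOW",
}


def describe_exception(number_text):
    """Translate the hex exception number from a log header into a STATUS_ name."""
    code = int(number_text, 16)  # accepts "c0000005" as well as "0xC0000005"
    return STATUS_NAMES.get(code, "unknown (search WINNT.H for 0x%08X)" % code)


print(describe_exception("c0000005"))
```

Anything the table doesn't know falls through to a hint to search WINNT.H, which is where the authoritative list lives.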

The System Information section should be self-explanatory:

    *----> System Information <----*
        Computer Name: HUME
        User Name: john
        Number of Processors: 2
        Processor Type: x86 Family 15 Model 0 Stepping 10
        Windows 2000 Version: 5.0
        Current Build: 2195
        Service Pack: 3
        Current Type: Multiprocessor Free
        Registered Organization: Wintellect
        Registered Owner: John Robbins

The Task List section looks like this:

    *----> Task List <----*
       0 Idle.exe
       8 System.exe
     132 smss.exe
     160 csrss.exe
     156 winlogon.exe
     208 services.exe
     220 lsass.exe
     364 svchost.exe
     424 svchost.exe
     472 spoolsv.exe
     504 MWMDMSVC.exe
     528 MWSSW32.exe
     576 regsvc.exe
     592 MSTask.exe
     836 Explorer.exe
     904 tp4mon.exe
     912 tphkmgr.exe
     920 4nt.exe
     940 taskmgr.exe
     956 tponscr.exe
     268 msdev.exe
     252 WDBG.exe
     828 NOTEPAD.exe
     416 drwtsn32.exe
       0 _Total.exe

The Task List section shows the processes that were running at the time of the crash. Unfortunately, the list doesn't show the version information, so you'll have to ask the user for the file versions of all the processes in this section. The numbers down the left-hand side are the decimal process IDs (PIDs) at the time of the crash. The numbers are worthless after the fact.

The Module List section contains all the modules loaded into the address space at the time of the crash. The numbers are in the format (load address – maximum address) for each module. This is the first place differences appear between Windows 2000, Windows XP, and Windows Server 2003 Dr. Watson logs. If you're looking at a Windows 2000 log, all you'll see are the address ranges of the modules and nothing else. If you have a Microsoft Visual Studio 6–compiled application and the symbols are accessible, you'll see the name of the loaded symbols next to the address range. Because the DBGHELP.DLL that comes with Windows 2000 knows nothing about Microsoft Visual Studio .NET symbols, you'll never see loaded symbols for binaries compiled with it. Windows 2000 has issues with the modules list, but Dr. Watson on Windows XP and Windows Server 2003 is smart enough to put the name of each DLL next to the address range.

    Windows 2000 Modules List
    (00400000 - 00460000)
    (77F80000 - 77FFB000)
    (63000000 - 6301B000)
    (77E10000 - 77E6F000)
    (77E80000 - 77F31000)

    Windows XP Modules List
    (0000000000400000 - 0000000000460000: d:\Dev\BookTwo\Disk\Output\WDBG.exe
    (0000000071c20000 - 0000000071c6e000: E:\WINDOWS\System32\NETAPI32.dll
    (0000000075a70000 - 0000000075b15000: E:\WINDOWS\system32\USERENV.dll
    (0000000075f40000 - 0000000075f5f000: E:\WINDOWS\system32\appHelp.dll
    (00000000763b0000 - 00000000763f5000: E:\WINDOWS\system32\comdlg32.dll

On Windows 2000, to figure out what modules were loaded, you have to guess. As I've mentioned several times in this book, it's vital for you to know where your dynamic-link libraries (DLLs) load into the process address space. You can probably recognize your DLLs just from the load addresses. To find the information for the other DLLs on the user's machine, you could write a small utility that would run through the DLLs on the user's system and report their names, load addresses, and sizes.
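A utility like that could start from something as small as the sketch below, which reads the preferred load address and image size out of a PE file's headers. The function names read_pe_layout and report are hypothetical. Keep in mind that the header records only the preferred load address, so a DLL that the loader had to relocate will show up somewhere else in the log.

```python
import struct


def read_pe_layout(data):
    """Return (preferred_load_address, size_of_image) from raw PE file bytes."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # The DWORD at offset 0x3C (e_lfanew) points at the "PE\0\0" signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # The optional header follows the 4-byte signature and 20-byte COFF header.
    opt = pe_offset + 24
    (magic,) = struct.unpack_from("<H", data, opt)
    if magic == 0x10B:    # PE32: 4-byte ImageBase at offset 28
        (image_base,) = struct.unpack_from("<I", data, opt + 28)
    elif magic == 0x20B:  # PE32+: 8-byte ImageBase at offset 24
        (image_base,) = struct.unpack_from("<Q", data, opt + 24)
    else:
        raise ValueError("unknown optional header magic 0x%X" % magic)
    # SizeOfImage sits at offset 56 in both header flavors.
    (size_of_image,) = struct.unpack_from("<I", data, opt + 56)
    return image_base, size_of_image


def report(path):
    """Print one module's name and its preferred address range."""
    with open(path, "rb") as f:
        base, size = read_pe_layout(f.read())
    print("%-40s (%08X - %08X)" % (path, base, base + size))
```

Calling report on every DLL in the user's system directory produces a list you can compare against the bare address ranges in a Windows 2000 log.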

The following output is the beginning of the three-part thread state section. (You'll notice that I removed the code bytes from beside the disassembly addresses and wrapped the register display so that it would fit on the page.)

    *----> State Dump for Thread Id 0xe14 <----*

    eax=00000000 ebx=00000000 ecx=011305d8 edx=00000a30 esi=00154b40 edi=0012fae4
    eip=00410144 esp=0012faa8 ebp=0012faf0 iopl=0 nv up ei pl nz na pe nc
    cs=001b ss=0023 ds=0023 es=0023 fs=0038 gs=0000 efl=00000202

    function: WDBG!CWDBGProjDoc__HandleBreakpoint
            0041012b push esi
            0041012c push edi
            0041012d push ecx
            0041012e lea edi,[ebp-0x40]
            00410131 mov ecx,0xd
            00410136 mov eax,0xcccccccc
            0041013b rep stosd
            0041013d pop ecx
            0041013e mov [ebp-0x10],ecx
            00410141 mov eax,[ebp+0xc]
    FAULT ->00410144 mov ecx,[eax+0x4] ds:0023:00000004=????????
            00410147 cmp dword ptr [ecx],0x80000003
            0041014d jz WDBG!CWDBGProjDoc__HandleBreakpoint+0x90 (004101a0)
            0041014f mov [ebp-0x14],esp
            00410152 mov [ebp-0x18],ebp
            00410155 mov esi,esp
            00410157 push 0x456070
            0041015c push 0x45606c
            00410161 mov edx,[ebp-0x18]
            00410164 xor eax,eax
            00410166 push eax

Dr. Watson displays the state information for each thread running in a process at the time it crashed. The thread states contain all the information you need to determine how and why the system crashed.

The register portion shows what all the registers contained at the time of the crash. The important register to look at is EIP, the instruction pointer. The example I show from Windows XP has symbols, so you can see which function this thread was executing when the crash occurred—most Dr. Watson logs won't have symbol information. Of course, it's not a problem if Dr. Watson doesn't give you the function name. Using CrashFinder from Chapter 12, simply load up the CrashFinder project for your application, enter the address into the Hexadecimal Address(es) edit control, and click the Find button.

This thread happened to be the thread that crashed. The only indicator is the FAULT-> pointer in the middle of the disassembly. I've seen a Dr. Watson log or two that didn't display the FAULT-> pointer. If you don't see this pointer in a log, run through each thread state and enter each EIP address into CrashFinder to figure out where the thread was sitting at the time of the crash.
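If you have a long log with many threads, a quick scan like the following sketch can list each thread's EIP and flag the one carrying the FAULT-> pointer. The regular expressions assume the header and register layout shown in the example log above, and thread_summaries is a name I made up for illustration.

```python
import re

THREAD_RE = re.compile(r"State Dump for Thread Id (0x[0-9a-fA-F]+)")
EIP_RE = re.compile(r"\beip=([0-9a-fA-F]{8})")
FAULT_RE = re.compile(r"FAULT\s*->\s*([0-9a-fA-F]{8})")


def thread_summaries(log_text):
    """Return a list of (thread_id, eip, fault_address_or_None) tuples."""
    results = []
    # Splitting on a pattern with a group yields
    # [preamble, id1, body1, id2, body2, ...].
    parts = THREAD_RE.split(log_text)
    for thread_id, body in zip(parts[1::2], parts[2::2]):
        eip = EIP_RE.search(body)
        fault = FAULT_RE.search(body)
        results.append((thread_id,
                        eip.group(1) if eip else None,
                        fault.group(1) if fault else None))
    return results


sample = """*----> State Dump for Thread Id 0xe14 <----*
eax=00000000 ebx=00000000 ecx=011305d8 edx=00000a30 esi=00154b40 edi=0012fae4
eip=00410144 esp=0012faa8 ebp=0012faf0 iopl=0 nv up ei pl nz na pe nc
00410141 mov eax,[ebp+0xc]
FAULT ->00410144 mov ecx,[eax+0x4] ds:0023:00000004=????????
"""

for thread_id, eip, fault in thread_summaries(sample):
    note = "crashed here" if fault else "enter this EIP into CrashFinder"
    print(thread_id, eip, note)
```

When no thread carries a FAULT-> marker, every thread gets the CrashFinder note, which matches the manual fallback just described.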

The disassembly should look familiar to you if you remember what you read in Chapter 7. The only new elements are the values after the instructions. The Dr. Watson disassembler attempts to look up the effective address of the memory reference for the instruction so that you can see what value the instruction was manipulating. The addresses that start with ss indicate that the memory was a stack segment reference; ds indicates a data segment reference. On Windows XP and Windows Server 2003, the only effective address display will be next to the line that contains the EIP value.

On Windows 2000, the Dr. Watson logs will have an effective address display next to each assembly language line. The only effective address in the disassembly that's guaranteed to be correct is the one at the instruction pointer. The others might be incorrect because the value the instruction refers to could have been changed. For example, let's say that the first instruction disassembled in the thread state had a memory reference at EBX. If the crash instruction occurred 10 instructions later, one of the intervening instructions could easily have changed EBX. When Dr. Watson on Windows 2000 does its disassembly, however, it uses the value currently in EBX—that is, the one at the time of the crash—to do the effective address translation. For this reason, the effective address shown in the disassembly could be incorrect. Carefully check whether any instructions could change the register values before you believe what you see in the effective address display.

Using your newfound assembly-language skills, you should be able to figure out why this thread crashed. The big mistake that everyone makes when reading the assembly language listing from Dr. Watson (or the debugger, for that matter) is starting from the top and reading down. The real trick is to start at the location of the crash and read up the listing, looking for which instructions set the registers of the crashing instruction.

The preceding thread state crashed at 00410144 MOV ECX, [EAX+0x4], where the value in EAX was 0. Since any address less than 64 KB on a Microsoft Windows operating system is marked as no access, attempting to read memory from 0x00000004 isn't a good plan. As you read up the assembly language listing, you'll want to look for the instruction that puts a value of 0 into EAX. Moving up one instruction, you'll see MOV EAX, [EBP+0xC]. Remember that the latter operand, the source, goes into the first operand, the destination. (In other words, remember "source to destination.") That means the value in [EBP+0xC] was copied into EAX. Therefore [EBP+0xC] was 0.

At this point another little trick I taught you in Chapter 7 should be screaming in your head: "parameters are positive"! Parameters are positive offsets from the EBP register with the first one at [EBP+0x8] and each additional parameter above the first by four bytes. Since 0xC is four bytes above 0x8, my hypothesis as to why this function crashed is that the second parameter is NULL. (I hope these last two paragraphs have convinced you how important it is to know enough assembly language to read a Dr. Watson log!)
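The arithmetic behind "parameters are positive" is simple enough to write down. This sketch assumes a standard 32-bit frame in which every parameter occupies one four-byte slot, which isn't true for things like doubles or structures passed by value. The helper name parameter_index is hypothetical.

```python
def parameter_index(ebp_offset, slot_size=4):
    """Map a positive EBP offset to a 1-based stack parameter number."""
    # [EBP+0] holds the saved EBP and [EBP+4] the return address,
    # so the first stack parameter sits at [EBP+8].
    if ebp_offset < 8 or (ebp_offset - 8) % slot_size != 0:
        raise ValueError("offset 0x%X is not a parameter slot" % ebp_offset)
    return (ebp_offset - 8) // slot_size + 1


print(parameter_index(0xC))  # [EBP+0xC] is the second parameter
```

For a C++ member function such as this one, the this pointer travels in ECX rather than on the stack, so the count covers only the declared parameters.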

Here's the second part of the thread state: the Stack Back Trace section. (Notice that I wrapped the function names so that everything would fit on the page. Also, the two underscores (__) in the symbol names are Dr. Watson's way of displaying the scope resolution operator (::).)

    *----> Stack Back Trace <----*
    ChildEBP RetAddr  Args to Child
    0012faf0 004100cd 00000a30 00000000 80000003
                WDBG!CWDBGProjDoc__HandleBreakpoint+0x34
    0012fb0c 004075f1 00000a30 0164f8fc 01130b68
                WDBG!CWDBGProjDoc__HandleExceptionEvent+0x6d
    0012fb20 7c3422b2 00000a30 0164f8fc 0000000d
                WDBG!CDocNotifyWnd__HandleExceptionEvent+0x21
    0012fc28 7c341b2e 00000502 00000a30 0164f8fc
                MFC71UD!CWnd__OnWndMsg+0x752
    0012fc48 7c33f2f0 00000502 00000a30 0164f8fc
                MFC71UD!CWnd__WindowProc+0x2e
    0012fcc0 7c33f7ce 01130b68 002502ca 00000502
                MFC71UD!AfxCallWndProc+0xe0
    0012fce0 7c3b072a 002502ca 00000502 00000a30
                MFC71UD!AfxWndProc+0x9e
    0012fd10 77d67ad7 002502ca 00000502 00000a30
                MFC71UD!AfxWndProcBase+0x4a
    0012fd3c 77d6ccd4 7c3b06e0 002502ca 00000502
                USER32!SetWindowPlacement+0x57
    0012fda4 77d445bd 00000000 7c3b06e0 002502ca
                USER32!DefRawInputProc+0x284
    0012fdf8 77d447d4 00593330 00000502 00000a30
                USER32!TranslateMessageEx+0x78d
    0012fe20 77fb4da6 0012fe30 00000018 00593330
                USER32!DefWindowProcA+0x209
    0012fe64 7c34e8e1 00154af8 00000000 00000000
                ntdll!KiUserCallbackDispatcher+0x13
    0012fe90 7c34fb4c 00455e30 0012feb8 7c34f407
                MFC71UD!AfxInternalPumpMessage+0x21
    0012fe9c 7c34f407 00000001 00455e30 00154ac8
                MFC71UD!CWinThread__PumpMessage+0xc
    0012feb8 7c34fe87 00455e30 00434ffa 0040a66b
                MFC71UD!CWinThread__Run+0x87
    0012fecc 7c34865a 1020c034 102682d0 ffffffff
                MFC71UD!CWinApp__Run+0x57
    0012fef0 00430008 00400000 00000000 00020c22
                MFC71UD!AfxWinMain+0xda
    0012ff08 004284b8 00400000 00000000 00020c22
                WDBG!wWinMain+0x18
    0012ffc0 77e814c7 00140000 01f88550 7ffdf000
                WDBG!wWinMainCRTStartup+0x1f8
    0012fff0 00000000 004282c0 00000000 78746341
                kernel32!GetCurrentDirectoryW+0x44

Although my example Windows XP Dr. Watson log has symbols, your user's log probably won't. The RetAddr column lists the return addresses of functions on the call stack. If your user's log doesn't have symbols, all you need to do is load each address in the RetAddr column into CrashFinder to find out the sequence of function calls leading up to the crash.
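Before feeding addresses to CrashFinder, it helps to know which module each return address lands in. The sketch below parses module list lines in either the Windows 2000 or the Windows XP layout shown earlier and reports the owning module plus the offset from its load address. The names parse_modules and owning_module are hypothetical.

```python
import re

# Matches "(start - end)" and "(start - end: path" module list lines.
MODULE_RE = re.compile(r"\(([0-9a-fA-F]+) - ([0-9a-fA-F]+):?\s*(.*)")


def parse_modules(lines):
    """Parse module list lines into (start, end, name) tuples."""
    modules = []
    for line in lines:
        m = MODULE_RE.match(line.strip())
        if m:
            # Windows 2000 lines carry no name, only a closing parenthesis.
            name = m.group(3).rstrip(")").strip() or "<unnamed>"
            modules.append((int(m.group(1), 16), int(m.group(2), 16), name))
    return modules


def owning_module(address, modules):
    """Return (module_name, offset_from_load_address) or None."""
    for start, end, name in modules:
        if start <= address < end:
            return name, address - start
    return None


modules = parse_modules([
    r"(0000000000400000 - 0000000000460000: d:\Dev\BookTwo\Disk\Output\WDBG.exe",
    r"(0000000071c20000 - 0000000071c6e000: E:\WINDOWS\System32\NETAPI32.dll",
])
print(owning_module(0x004100CD, modules))
```

An address that falls outside every listed range comes back as None, which usually means the module list you pasted in was incomplete.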

The Args To Child columns show a function's first three possible parameters on the stack. With highly optimized release builds and no symbols, the values are probably incorrect. However, you can still use them as a starting point for hand-walking your code.

Dr. Watson won't handle symbol values in Windows 2000, so your logs will look something like the following:

    *----> Stack Back Trace <----*
    FramePtr ReturnAd Param#1  Param#2  Param#3  Param#4  Function Name
    0012FBA0 004100CD 00000714 00000000 80000003 CCCCCCCC !<nosymbols>
    0012FBBC 004075F1 00000714 018EF8FC 016705F8 0012FCD8 !<nosymbols>
    0012FBD0 7C3422B2 00000714 018EF8FC 0000000D 016705F8 !<nosymbols>
    0012FCD8 7C341B2E 00000502 00000714 018EF8FC 0012FCF4 !Ordinal6841
    0012FCF8 7C33F2F0 00000502 00000714 018EF8FC 0013BD01 !Ordinal8666
    0012FD70 7C33F7CE 016705F8 00090500 00000502 00000714 !Ordinal1363
    0012FD90 7C3B072A 00090500 00000502 00000714 018EF8FC !Ordinal1580
    0012FDC0 77E3A244 00090500 00000502 00000714 018EF8FC !Ordinal1581
    0012FDE0 77E14730 7C3B06E0 00090500 00000502 00000714 user32!SetWindowPlacement
    0012FDFC 77E1558A 00517E40 00000502 00000714 018EF8FC user32!TranslateMessageEx
    0012FE24 77FA02FF 0012FE34 00000018 00517E40 00000502 user32!DefWindowProcA
    0012FE64 7C34E8E1 00136BC0 00000000 00000000 00000000 ntdll!KiUserCallbackDispatcher
    0012FE90 7C34FB4C 00455E30 0012FEB8 7C34F407 00000001 !Ordinal1462
    0012FE9C 7C34F407 00000001 00455E30 00136B90 00000002 !Ordinal7046
    0012FEB8 7C34FE87 00455E30 00434FFA 0040A66B 0012FEF0 !Ordinal7554
    0012FECC 7C34865A 1020C034 102682D0 FFFFFFFF 0012FEF0 !Ordinal7553
    0012FEF0 00430008 00400000 00000000 000209D8 00000005 !Ordinal1578
    0012FF08 004284B8 00400000 00000000 000209D8 00000005 !<nosymbols>
    0012FFC0 77EA847C 0013BD01 0013BD01 7FFDF000 C0000005 !<nosymbols>
    0012FFF0 00000000 004282C0 00000000 000000C8 00000100 kernel32!ProcessIdToSessionId

The function names are shown in <module>!<function> format. Those functions shown as Ordinal# are ordinal exports. If you don't have the source code for the DLL that does the exports by ordinal, you're mostly out of luck. However, because you do have the source code for the Microsoft Foundation Class (MFC) library, you can look up MFC ordinal values. In the WDBG program, I know that MFC71UD.DLL is loaded at 0x7C250000, so I can look up those ordinals because all MFC functions are exported by ordinal value through a linker definition (DEF) file.

The one prerequisite for converting MFC ordinal values to functions is that you must be certain of the version of the MFC DLL on the machine that crashed. On my machine, named \\HUME, I had MFC71UD.DLL from Visual Studio .NET 2003. If you're unsure of the version of MFC on the user's machine, you'll either have to ask for it or beg for it—I'll let you choose the method you're most comfortable with.

Follow these simple steps to turn ordinal values into functions:

  1. Open the <Visual Studio .NET installation directory>\VC7\ATLMFC\SRC\MFC\Intel directory.

  2. Select the appropriate DEF file for the MFC file you want to look up. For example, MFC71UD.DLL's DEF file is MFC71UD.DEF.

  3. Search for the ordinal number. To find Ordinal6841 from the preceding stack, I'd search the MFC71UD.DEF file for 6841. The line with 6841 is "?OnWndMsg@CWnd@@MAEHIIJPAJ@Z @ 6841 NONAME."

  4. The name to the left of "@ 6841 NONAME" is the decorated function name exported at that ordinal value. To undecorate the name, use the UNDNAME.EXE program that comes with Visual Studio .NET. For 6841, the function is CWnd::OnWndMsg.
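If you have more than a couple of ordinals to look up, the search in step 3 is easy to script. This sketch assumes export lines laid out like the one quoted in step 3 (decorated name, an @ sign, the ordinal, then NONAME). The function name find_export is hypothetical.

```python
import re

# Matches export lines such as: ?OnWndMsg@CWnd@@MAEHIIJPAJ@Z @ 6841 NONAME
EXPORT_RE = re.compile(r"^\s*(\S+)\s+@\s*(\d+)\s+NONAME\b")


def find_export(def_lines, ordinal):
    """Return the decorated name exported at the given ordinal, or None."""
    for line in def_lines:
        m = EXPORT_RE.match(line)
        if m and int(m.group(2)) == ordinal:
            return m.group(1)
    return None


sample = ["EXPORTS", "\t?OnWndMsg@CWnd@@MAEHIIJPAJ@Z @ 6841 NONAME"]
print(find_export(sample, 6841))
```

The decorated name it returns is what you hand to UNDNAME.EXE in step 4.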

The third and final part of the thread state is the Raw Stack Dump section:

    *----> Raw Stack Dump <----*
    0012faa8 a8 fc 12 00 40 4b 15 00 - cc cc cc cc cc cc cc cc ....@K..........
    0012fab8 cc cc cc cc cc cc cc cc - cc cc cc cc cc cc cc cc ................
    0012fac8 cc cc cc cc cc cc cc cc - cc cc cc cc cc cc cc cc ................
    0012fad8 cc cc cc cc cc cc cc cc - d8 05 13 01 1c fc 12 00 ................
    0012fae8 af 55 43 00 ff ff ff ff - 0c fb 12 00 cd 00 41 00 .UC...........A.
    0012faf8 30 0a 00 00 00 00 00 00 - 03 00 00 80 cc cc cc cc 0...............
    0012fb08 d8 05 13 01 20 fb 12 00 - f1 75 40 00 30 0a 00 00 .... ....u@.0...
    0012fb18 fc f8 64 01 68 0b 13 01 - 28 fc 12 00 b2 22 34 7c ..d.h...(...."4|
    0012fb28 30 0a 00 00 fc f8 64 01 - 0d 00 00 00 68 0b 13 01 0.....d.....h...
    0012fb38 00 00 00 00 00 00 00 00 - 00 00 00 00 00 00 00 00 ................
    0012fb48 00 00 00 00 00 00 00 00 - 00 00 00 00 00 00 00 00 ................
    0012fb58 b2 98 d4 77 85 70 9b c2 - 00 00 00 00 cc a5 30 7c ...w.p........0|
    0012fb68 8c fb 12 00 c7 a7 28 7c - 01 00 00 00 4c f5 25 7c ......(|....L.%|
    0012fb78 5c 01 00 00 03 00 00 00 - 04 00 00 00 e0 f4 25 7c \.............%|
    0012fb88 cc fb 12 00 b4 fb 12 00 - 91 a7 28 7c 4c f2 43 7c ..........(|L.C|
    0012fb98 4c f5 25 7c 5c 01 00 00 - 03 00 00 00 04 00 00 00 L.%|\...........
    0012fba8 e0 f4 25 7c cc fb 12 00 - 00 00 00 00 4c fc 12 00 ..%|........L...
    0012fbb8 b7 b4 39 7c e8 fb 12 00 - 03 00 00 00 04 00 00 00 ..9|............
    0012fbc8 e0 f4 25 7c 68 4f 25 7c - ca 02 25 00 f4 fb 12 00 ..%|hO%|..%.....
    0012fbd8 30 0a 00 00 32 b5 39 7c - 00 00 00 00 00 00 00 00 0...2.9|........

I rarely look at this information. If I'm really stuck on a crash, however, I might start looking at the information to see whether I can guess at local variable values. The three return addresses I can correlate with the preceding stack walk are 004100cd, 004075f1, and 7c3422b2; each appears in the dump in little-endian byte order (for example, 004100cd is stored as cd 00 41 00).
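Picking return addresses out of the raw bytes by eye is tedious because x86 stores each DWORD low byte first. The sketch below decodes one dump line into (stack address, value) pairs so that you can compare the values against the RetAddr column. It assumes the line layout shown in the dump above: an address, sixteen hex bytes with a dash after the eighth, then the ASCII column. The name dwords_from_dump_line is hypothetical.

```python
import struct


def dwords_from_dump_line(line):
    """Yield (stack_address, value) pairs for one Raw Stack Dump line."""
    fields = line.split()
    base = int(fields[0], 16)
    # The 17 fields after the address are 16 byte values plus the "-" separator.
    byte_fields = [f for f in fields[1:18] if f != "-"]
    raw = bytes(int(b, 16) for b in byte_fields)
    # "<I" decodes each group of four bytes as a little-endian DWORD.
    for i, (value,) in enumerate(struct.iter_unpack("<I", raw)):
        yield base + 4 * i, value


return_addresses = {0x004100CD, 0x004075F1, 0x7C3422B2}
line = "0012fae8 af 55 43 00 ff ff ff ff - 0c fb 12 00 cd 00 41 00 .UC...........A."
for address, value in dwords_from_dump_line(line):
    if value in return_addresses:
        print("%08x holds return address %08x" % (address, value))
```

In this line the DWORD just below the return address, at 0012faf0, is 0012fb0c, which is the ChildEBP of the next frame in the stack back trace.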
