Lock Picking Timer

I recently built a timed lock picking competition, based on a similar design created by @dossman33. It consists of four deadbolts (plus two spares), all pinned identically, that, when opened, trigger a switch wired to a Raspberry Pi. The Pi has a UI written in Python that shows the current elapsed time, plus the time that each station opened. I’ve shared the code at https://github.com/mburrough/locktimer.


(BSides Lockpick Village images from @deviantollam)

Continue Reading“Lock Picking Timer”

Master Key Collusion Attack Simulator

As both a locksport enthusiast and a professional red teamer, I spend a good deal of time thinking about locks. One fascinating subset of locks is master keyed systems. These are primarily used by large businesses to create a hierarchy of locks and keys: individuals can have a key that works for the building front door and their own offices, managers can have keys that work on any of their subordinates’ offices, janitors can have keys to their assigned floors’ offices, while security guards and maintenance can have keys to any room.

Depending on how these systems are implemented, they may suffer from an inherent weakness. Often, the bitting (key cuts) used for the top master key (e.g. the one used by security) cannot be used in any individual user key. For example, if the top master key code is 6-3-4-2-2, a valid user key may be 4-1-6-4-5, but could not be 4-1-6-2-5. This means that given the codes for enough user keys, an attacker could eliminate most/all possibilities except for the master key bitting, thereby decoding the master key.

This may sound like it would take a large number of keys to find the master (especially for keys with 6 or 7 cuts and 7+ possible cut depths per position), there is another feature of many key systems that helps narrow the space: MACS. MACS are the Maximum Adjacent Cut Specification for a given brand/model. These are standards that say a key cannot go from a very small cut in one position to a very large cut in the next position, as the key could be too weak and might break off in a lock (or pocket, or purse).

Similar to MACS, some systems also restrict user keys from being within a certain offset (e.g. +/- 1) of the master. In such a system, 4-1-6-1-5 might also be invalid with the above example master key.

For this kind of attack, the attacker would either need to covertly view users’ keys and eliminate cuts, or work with insider co-conspirators to view their keys. But would the number of keys needed to be viewed be feasible? To answer this, I created a computer model that simulates the attack and displays the number of keys needed for that attack. Simply input the specifications for the key system in question, plus the number of runs desired for the model, and it will display the average, minimum, and maximum number of keys needed across the test executions.

You can get the code from https://github.com/mburrough/MasterKeySim.

PenTesting Azure Applications

I’m excited to announce my first book, PenTesting Azure Applications, is available for preorder from Amazon and No Starch’s site. This title aims to provide a reference guide for security professionals looking to assess cloud deployments. It offers practical advice for how to properly lock down subscriptions and their services, and make use of Azure’s monitoring and security features.

Don’t Believe Everything You Read

This is an archive post of content I wrote for the NTDebugging Blog on MSDN. Since the MSDN blogs are being retired, I’m transferring my posts here so they aren’t lost. The post has been back-dated to its original publication date.

Recently, I was contacted by a customer who was advised by an ISV to set a registry value under one of the sub keys in HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager. Let’s call it UseQuantumComputing = 1 (value name has been changed to protect the ISV). The customer wanted to know what this value actually did and no one could find any documentation explaining it. These issues often come to our team because we have access to the Windows source code. I did a bit of code review to find out what this value does. As it turns out, nowhere in Windows source code between Windows 2000 and Windows Server 2012 do we ever check for or set UseQuantumComputing.

I can think of a few reasons the ISV would suggest setting this value. Perhaps they were under the impression this did something but got confused about the value name. It’s possible they hoped making a registry change would have a placebo effect. Or, perhaps their software actually checks this value, not Windows.

The latter of these possibilities is probably the worst case scenario. An ISV should not create a registry value inside of keys used for Windows’ own internal use. Why? The first reason is that there’s no guarantee that Microsoft won’t end up coincidentally using that same value name later. This would cause a conflict between the two users of the value. Second, we have to consider the scenario where two different ISVs both decide to use the same value. That would be bad too. Lastly, there’s no guarantee that the key in use will still exist in later versions, or that it will be writable or readable by the ISV due to permission changes. In addition to all these reasons, there is the common sense issue that it is just confusing. Now the ISV’s software and uninstaller needs to look all over the registry, not just in their own keys.

On a similar note, I also recently had a case where a “Windows Tips” blog (not created, endorsed, or run by Microsoft) suggested using a registry value that was implemented in Windows but was not documented by Microsoft. It turns out this value wasn’t thoroughly tested (because it was undocumented and wasn’t intended to be used in production), and using it would cause server hangs under certain conditions. These hangs were only discovered after a large customer decided to implement the undocumented value across their enterprise.

Here are a few tips for IT Pros, developers, and users alike:

  • Don’t implement random registry settings if you can’t find documentation for that setting on an official Microsoft site, like MSDN, TechNet, or support.microsoft.com(information on forums or answer boards (e.g. social.*.microsoft.com or answers.*.microsoft.com) is not official documentation). At best these unknown registry settings they will do nothing, at worst they will cause you headaches later.
  • If a key/value isn’t documented, changes to it likely are not tested, and could put your machine in a state that makes it difficult or impossible to support.
  • If you are a developer, keep any of your registry settings in your own key. Don’t pollute in others’ keys.
  • If an ISV or Microsoft suggests you implement a setting, make sure you understand the implications of that setting.

I’ll leave you with the warning displayed in many of our KBs – it’s there for a reason!

WARNING: If you use Registry Editor incorrectly, you may cause serious problems that may require you to reinstall your operating system. Microsoft cannot guarantee that you can solve problems that result from using Registry Editor incorrectly. Use Registry Editor at your own risk.”

Microsoft KB Registry Warning Message


Determining the source of Bug Check 0x133 (DPC_WATCHDOG_VIOLATION) errors on Windows Server 2012

This is an archive post of content I wrote for the NTDebugging Blog on MSDN. Since the MSDN blogs are being retired, I’m transferring my posts here so they aren’t lost. The post has been back-dated to its original publication date.

What is a bug check 0x133?

Starting in Windows Server 2012, a DPC watchdog timer is enabled which will bug check a system if too much time is spent in DPC routines. This bug check was added to help identify drivers that are deadlocked or misbehaving.  The bug check is of type “DPC_WATCHDOG_VIOLATION” and has a code of 0x133.  (Windows 7 also included a DPC watchdog but by default, it only took action when a kernel debugger was attached to the system.)  A description of DPC routines can be found at http://msdn.microsoft.com/en-us/library/windows/hardware/ff544084(v=vs.85).aspx.

The DPC_WATCHDOG_VIOLATION bug check can be triggered in two ways. First, if a single DPC exceeds a specified number of ticks, the system will stop with 0x133 with parameter 1 of the bug check set to 0. In this case, the system’s time limit for single DPC will be in parameter 3, with the number of ticks taken by this DPC in parameter 2. Alternatively, if the system exceeds a larger timeout of time spent cumulatively in all DPCs since the IRQL was raised to DPC level, the system will stop with a 0x133 with parameter 1 set to 1. Microsoft recommends that DPCs should not run longer than 100 microseconds and ISRs should not run longer than 25 microseconds, however the actual timeout values on the system are set much higher.

How to debug a 0x133 (0, …

In the case of a stop 0x133 with the first parameter set to 0, the call stack should contain the offending driver. For example, here is a debug of a 0x133 (0,…) kernel dump:

0: kd> .bugcheck
Bugcheck code 00000133
Arguments 00000000`00000000 00000000`00000283 00000000`00000282 00000000`00000000

Per MSDN, we know that this DPC has run for 0x283 ticks, when the limit was 0x282.

0: kd> k

Child-SP          RetAddr           Call Site

fffff803`08c18428 fffff803`098525df nt!KeBugCheckEx
fffff803`08c18430 fffff803`09723f11 nt! ??::FNODOBFM::`string'+0x13ba4
fffff803`08c184b0 fffff803`09724d98 nt!KeUpdateRunTime+0x51
fffff803`08c184e0 fffff803`09634eba nt!KeUpdateTime+0x3f9
fffff803`08c186d0 fffff803`096f24ae hal!HalpTimerClockInterrupt+0x86
fffff803`08c18700 fffff803`0963dba2 nt!KiInterruptDispatchLBControl+0x1ce
fffff803`08c18898 fffff803`096300d0 hal!HalpTscQueryCounter+0x2
fffff803`08c188a0 fffff880`04be3409 hal!HalpTimerStallExecutionProcessor+0x131
fffff803`08c18930 fffff880`011202ee ECHO!EchoEvtTimerFunc+0x7d  //Here is our driver, and we can see it calls into StallExecutionProcessor
fffff803`08c18960 fffff803`097258b4 Wdf01000!FxTimer::TimerHandler+0x92
fffff803`08c189a0 fffff803`09725ed5 nt!KiProcessExpiredTimerList+0x214
fffff803`08c18ae0 fffff803`09725d88 nt!KiExpireTimerTable+0xa9
fffff803`08c18b80 fffff803`0971fe76 nt!KiTimerExpiration+0xc8
fffff803`08c18c30 fffff803`0972457a nt!KiRetireDpcList+0x1f6
fffff803`08c18da0 00000000`00000000 nt!KiIdleLoop+0x5a

Let’s view the driver’s unassembled DPC routine and see what it is doing:

0: kd> ub fffff880`04be3409
fffff880`04be33e0 448b4320        mov     r8d,dword ptr[rbx+20h]
fffff880`04be33e4 488b0d6d2a0000  mov     rcx,qword ptr [ECHO!WdfDriverGlobals (fffff880`04be5e58)]
fffff880`04be33eb 4883631800      and     qword ptr [rbx+18h],0
fffff880`04be33f0 488bd7          mov     rdx,rdi
fffff880`04be33f3 ff150f260000    call    qword ptr [ECHO!WdfFunctions+0x838(fffff880`04be5a08)]
fffff880`04be33f9 bbc0d40100      mov     ebx,1D4C0h
fffff880`04be33fe b964000000      mov     ecx,64h
fffff880`04be3403 ff15f70b0000    call    qword ptr[ECHO!_imp_KeStallExecutionProcessor (fffff880`04be4000)]   //It's calling KeStallExecutionProcessor with 0x64 (decimal 100) as a parameter

0: kd> u fffff880`04be3409
fffff880`04be3409 4883eb01        sub     rbx,1
fffff880`04be340d 75ef            jne     ECHO!EchoEvtTimerFunc+0x72 (fffff880`04be33fe)     //Here we can see it is jumping back to call KeStallExecutionProcessor in a loop
fffff880`04be340f 488b5c2430      mov     rbx,qword ptr[rsp+30h]
fffff880`04be3414 4883c420        add     rsp,20h
fffff880`04be3418 5f              pop     rdi
fffff880`04be3419 c3              ret
fffff880`04be341a cc              int     3
fffff880`04be341b cc              int     3

0: kd> !pcr
KPCR for Processor 0 at fffff80309974000:
    Major 1 Minor 1
      NtTib.ExceptionList: fffff80308c11000
          NtTib.StackBase: fffff80308c12080
         NtTib.StackLimit: 000000d70c7bf988
       NtTib.SubSystemTib: fffff80309974000
            NtTib.Version: 0000000009974180
        NtTib.UserPointer: fffff803099747f0
            NtTib.SelfTib: 000007f7ab80c000

                  SelfPcr: 0000000000000000
                     Prcb: fffff80309974180
                     Irql: 0000000000000000
                      IRR: 0000000000000000
                      IDR: 0000000000000000
            InterruptMode: 0000000000000000
                      IDT: 0000000000000000
                      GDT: 0000000000000000
                      TSS: 0000000000000000

            CurrentThread: fffff803099ce880
               NextThread: fffffa800261cb00
               IdleThread: fffff803099ce880

                DpcQueue:  0xfffffa80020ce790 0xfffff880012e4e9c [Normal] NDIS!NdisReturnNetBufferLists
                           0xfffffa800185f118 0xfffff88000c0ca00 [Normal] ataport!AtaPortInitialize
                           0xfffff8030994fda0 0xfffff8030972bc30 [Normal] nt!KiBalanceSetManagerDeferredRoutine
                           0xfffffa8001dbc118 0xfffff88000c0ca00 [Normal] ataport!AtaPortInitialize
                           0xfffffa8002082300 0xfffff88001701df0 [Normal] USBPORT

The !pcr output shows us queued DPCs for this processor. If you want to see more information about DPCs and the DPC Watchdog, you could dump the PRCB listed in the !pcr output like this:

dt nt!_KPRCB fffff80309974180 Dpc* 

Often the driver will be calling into a function like KeStallExecutionProcessor in a loop, as in our example debug.  To resolve this problem, contact the driver vendor to request an updated driver version that spends less time in its DPC Routine.

How to troubleshoot a 0x133 (1, …

Determining the cause of a stop 0x133 with a first parameter of 1 is a bit more difficult because the problem is a result of DPCs running from multiple drivers, so the call stack is insufficient to determine the culprit.  To troubleshoot this stop, first make sure that the NT Kernel Logger or Circular Kernel Context Logger ETW traces are enabled on the system.  (For directions on setting this up, see http://blogs.msdn.com/b/ntdebugging/archive/2009/12/11/test.aspx.)

Once the logging is enabled and the system bug checks, dump out the list of ETW loggers using !wmitrace.strdump. Find the ID of the NT Kernel logger or the Circular logger. You can then use !wmitrace.logsave (ID) (path to ETL) to write out the ETL log to a file. Load it up with Windows Performance Analyzer and add the DPC or DPC/ISR Duration by Module, Function view (located in the Computation group) to your current analysis window:


Next, make sure the table is also shown by clicking the box in the upper right of the view:

Enable Graph and Table view in WPA
Enable Graph and Table view in WPA

Ensure that the Address column is added on the left of the gold bar, then expand each address entry to see individual DPC enters/exits for each function. Using this data, you can determine which DPC routines took the longest by looking at the inclusive duration column, which should be added to the right of the gold bar:

DPC Duration
DPC Duration

In this case, these DPCs took 1 second, which is well over the recommended maximum of 100 us. The module column (and possible the function column, if you have symbols) will show which driver is responsible for that DPC routine. Since our ECHO driver was based on WDF, that is the module named here.

For an example of doing this type of analysis in xperf, see http://blogs.msdn.com/b/ntdebugging/archive/2008/04/03/windows-performance-toolkit-xperf.aspx.

More Information

For additional information about Stop 0x133 errors, see this page on MSDN: http://msdn.microsoft.com/en-us/library/windows/hardware/jj154556(v=vs.85).aspx.

For DPC timing recommendations and for advice on capturing DPC timing information using tracelog, see http://msdn.microsoft.com/en-us/library/windows/hardware/ff545764(v=vs.85).aspx.

Guidelines for writing DPC routines can be found at http://msdn.microsoft.com/en-us/library/windows/hardware/ff546551(v=vs.85).aspx.

-Matt Burrough

How the Clipboard Works, Part 2

This is an archive post of content I wrote for the NTDebugging Blog on MSDN. Since the MSDN blogs are being retired, I’m transferring my posts here so they aren’t lost. The post has been back-dated to its original publication date.

Last time, we discussed how applications place data on the clipboard, and how to access that data using the debugger.  Today, we’ll take a look at how an application can monitor the clipboard for changes.  Understanding this is important, because it is a place where Windows allows 3rd-party code to “hook” into the system – so if you experience unexpected behavior with copying and pasting, a program using these hooks may be misbehaving.  We’ll start by covering the hooking mechanisms for clipboard, and then review how to identify what applications, if any, are using these hooks in the debugger.

There are three ways to monitor the clipboard for changes – clipboard viewers, clipboard format listeners, and querying the clipboard sequence number.  We will focus on the first two as these allow an application to register for notifications whenever the clipboard is updated.  The third method simply allows an application to check and see if a change has occurred, and should not be used in a polling loop.

The Clipboard Viewer functionality has been around since Windows 2000, if not earlier.  The way it works is pretty simple – an application interested in receiving clipboard change notifications calls SetClipboardViewer and passes a handle to its window.  Windows then stores that handle in a per-session win32k global, and anytime the clipboard changed, windows sends a WM_DRAWCLIPBOARD message to the registered window.

Of course, multiple applications are allowed to register their windows as clipboard viewers – so how does Windows handle that?  Well, if an application calls SetClipboardViewer and another clipboard viewer was already registered, Windows returns the handle value of the previous viewer’s window to the new viewer.  It is then the responsibility of the new viewer to call SendMessage every time it receives a WM_DRAWCLIPBOARD to notify the next viewer in the chain.  Each clipboard viewer also needs to handle the WM_CHANGECBCHAIN message, which notifies the viewers when one of the viewers in the chain was removed and specifies what the next viewer in the chain is.  This allows the chain to be maintained.

An obvious problem with this design is it relies on each clipboard viewer application to behave correctly, not to terminate unexpectedly, and to generally be a good citizen.  If any viewer decided not to be friendly, it could simply skip notifying the new viewer in line about an update, rendering that and all subsequent viewers impotent.

To address these problems, the Clipboard Format Listener mechanism was added in Windows Vista.  This works in much the same way as the clipboard viewer functionality except in this case, Windows maintains the list of listeners, instead of depending on each application to preserve a chain.

If an application wishes to become a clipboard format listener, it calls the AddClipboardFormatListener function and passes in a handle to its window.  After that, its window message handler will receive WM_CLIPBOARDUPDATE messages.  When the application is ready to exit or no longer wishes to receive notifications, it can call RemoveClipboardFormatListener.

Now that we’ve covered the ways to register a viewer/listener, let’s take a look at how to identify them using the debugger.  First, you’ll need to identify a process in the session you are interested in checking for clipboard monitors.  It can be any win32 process in that session – we just need to use it to locate a pointer to the Window Station.  In this case, I’ll use the Notepad window I used in part 1:

kd> !process 0 0 notepad.exe
PROCESS fffff980366ecb30
    SessionId: 1  Cid: 0374    Peb: 7fffffd8000  ParentCid: 0814
    DirBase: 1867e000  ObjectTable: fffff9803d28ef90  HandleCount:  52.
    Image: notepad.exe

If you are doing this in a live kernel debug, you’ll need to change context into the process interactively (using .process /I <address> then hit g and wait for the debugger to break back in).  Now DT the process address as an _EPROCESS and look for the Win32Process field:

kd> dt _EPROCESS fffff980366ecb30 Win32Process
   +0x258 Win32Process : 0xfffff900`c18c0ce0 Void

Now DT the Win32Process address as a win32k!tagPROCESSINFO and identify the rpwinsta value:

kd> dt win32k!tagPROCESSINFO  0xfffff900`c18c0ce0 rpwinsta
   +0x258 rpwinsta : 0xfffff980`0be2af60 tagWINDOWSTATION

This is our Window Station.  Dump it using dt:

kd> dt 0xfffff980`0be2af60 tagWINDOWSTATION
   +0x000 dwSessionId      : 1
   +0x008 rpwinstaNext     : (null) 
   +0x010 rpdeskList       : 0xfffff980`0c5e2f20 tagDESKTOP
   +0x018 pTerm            : 0xfffff960`002f5560 tagTERMINAL
   +0x020 dwWSF_Flags      : 0
   +0x028 spklList         : 0xfffff900`c192cf80 tagKL
   +0x030 ptiClipLock      : (null) 
   +0x038 ptiDrawingClipboard : (null) 
   +0x040 spwndClipOpen    : (null) 
   +0x048 spwndClipViewer  : 0xfffff900`c1a4ca70 tagWND
   +0x050 spwndClipOwner   : 0xfffff900`c1a3ef70 tagWND
   +0x058 pClipBase        : 0xfffff900`c5512fa0 tagCLIP
   +0x060 cNumClipFormats  : 4
   +0x064 iClipSerialNumber : 0x16
   +0x068 iClipSequenceNumber : 0xc1
   +0x070 spwndClipboardListener : 0xfffff900`c1a53440 tagWND
   +0x078 pGlobalAtomTable : 0xfffff980`0bd56c70 Void
   +0x080 luidEndSession   : _LUID
   +0x088 luidUser         : _LUID
   +0x090 psidUser         : 0xfffff900`c402afe0 Void

Note the spwndClipViewer, spwndClipboardListener, and spwndClipOwner fields.  spwndClipViewer is the most-recently-registered window in the clipboard viewer chain.  Similarly, spwndClipboardListener is the most recent listener in our Clipboard Format Listener list.  spwndClipOwner is the window that set the content in the clipboard.

Given the window, it is just a few steps to determine the process.  This would work for spwndClipViewer, spwndClipboardListener, and spwndClipOwner.  First, dt the value as a tagWND.  We’ll use the spwndClipViewer for this demonstration:

kd> dt 0xfffff900`c1a4ca70 tagWND
   +0x000 head             : _THRDESKHEAD
   +0x028 state            : 0x40020008
   +0x028 bHasMeun         : 0y0
   +0x028 bHasVerticalScrollbar : 0y0

We only care about the head – so since it is at offset 0, dt the same address as a _THRDESKHEAD:

kd> dt 0xfffff900`c1a4ca70 _THRDESKHEAD
   +0x000 h                : 0x00000000`000102ae Void
   +0x008 cLockObj         : 6
   +0x010 pti              : 0xfffff900`c4f26c20 tagTHREADINFO
   +0x018 rpdesk           : 0xfffff980`0c5e2f20 tagDESKTOP
   +0x020 pSelf            : 0xfffff900`c1a4ca70  "???"

Now, dt the address in pti as a tagTHREADINFO:

kd> dt 0xfffff900`c4f26c20 tagTHREADINFO
   +0x000 pEThread         : 0xfffff980`0ef6cb10 _ETHREAD
   +0x008 RefCount         : 1
   +0x010 ptlW32           : (null) 
   +0x018 pgdiDcattr       : 0x00000000`000f0d00 Void

Here, we only care about the value of pEThread, which we can pass to !thread:

kd> !thread 0xfffff980`0ef6cb10 
THREAD fffff9800ef6cb10  Cid 087c.07ec  Teb: 000007fffffde000 Win32Thread: fffff900c4f26c20 WAIT: (WrUserRequest) UserMode Non-Alertable
    fffff9801c01efe0  SynchronizationEvent
Not impersonating
DeviceMap                 fffff980278a0fc0
Owning Process            fffff98032e18b30       Image:         viewer02.exe
Attached Process          N/A            Image:         N/A
Wait Start TickCount      5435847        Ticks: 33 (0:00:00:00.515)
Context Switch Count      809            IdealProcessor: 0                 LargeStack
UserTime                  00:00:00.000
KernelTime                00:00:00.062
Win32 Start Address 0x000000013f203044
Stack Init fffff880050acdb0 Current fffff880050ac6f0
Base fffff880050ad000 Limit fffff880050a3000 Call 0
Priority 11 BasePriority 8 UnusualBoost 0 ForegroundBoost 2 IoPriority 2 PagePriority 5
Child-SP          RetAddr           : Args to Child                                                           : Call Site
fffff880`050ac730 fffff800`01488f32 : fffff980`0ef6cb10 fffff980`0ef6cb10 fffff980`0ef6cbd0 00000000`0000000b : nt!KiSwapContext+0x7a
fffff880`050ac870 fffff800`0148b74f : 00000000`00000021 fffff800`015b1975 00000000`00000000 00000000`6d717355 : nt!KiCommitThreadWait+0x1d2
fffff880`050ac900 fffff960`000dc8e7 : fffff900`c4f26c00 fffff880`0000000d fffff900`c40a6f01 fffff960`000dcc00 : nt!KeWaitForSingleObject+0x19f
fffff880`050ac9a0 fffff960`000dc989 : 00000000`00000000 00000000`00000000 00000000`00000001 00000000`00000000 : win32k!xxxRealSleepThread+0x257
fffff880`050aca40 fffff960`000dafc0 : fffff900`c4f26c20 fffff880`050acca0 00000000`00000001 00000000`00000000 : win32k!xxxSleepThread+0x59
fffff880`050aca70 fffff960`000db0c5 : 00000000`00000000 fffff800`000025ff 00000000`00000000 fffff980`ffffffff : win32k!xxxRealInternalGetMessage+0x7dc
fffff880`050acb50 fffff960`000dcab5 : 00000000`021503bd 00000000`021503bd fffff880`050acbc8 fffff800`01482ed3 : win32k!xxxInternalGetMessage+0x35
fffff880`050acb90 fffff800`01482ed3 : fffff980`0ef6cb10 00000000`00000000 00000000`00000020 fffff980`1c394e60 : win32k!NtUserGetMessage+0x75
fffff880`050acc20 00000000`77929e6a : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KiSystemServiceCopyEnd+0x13 (TrapFrame @ fffff880`050acc20)
00000000`002ffb18 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : 0x77929e6a

As you can see, we have a clipboard viewer registered from process viewer02.exe.  Because of viewer’s process-maintained chain architecture, it isn’t easy to see the next process in the chain.  However, we can do this for clipboard listeners.  Let’s look back at our window station:

kd> dt 0xfffff980`0be2af60 tagWINDOWSTATION
   +0x000 dwSessionId      : 1
   +0x008 rpwinstaNext     : (null) 
   +0x010 rpdeskList       : 0xfffff980`0c5e2f20 tagDESKTOP
   +0x018 pTerm            : 0xfffff960`002f5560 tagTERMINAL
   +0x020 dwWSF_Flags      : 0
   +0x028 spklList         : 0xfffff900`c192cf80 tagKL
   +0x030 ptiClipLock      : (null) 
   +0x038 ptiDrawingClipboard : (null) 
   +0x040 spwndClipOpen    : (null) 
   +0x048 spwndClipViewer  : 0xfffff900`c1a4ca70 tagWND
   +0x050 spwndClipOwner   : 0xfffff900`c1a3ef70 tagWND
   +0x058 pClipBase        : 0xfffff900`c5512fa0 tagCLIP
   +0x060 cNumClipFormats  : 4
   +0x064 iClipSerialNumber : 0x16
   +0x068 iClipSequenceNumber : 0xc1
   +0x070 spwndClipboardListener : 0xfffff900`c1a53440 tagWND
   +0x078 pGlobalAtomTable : 0xfffff980`0bd56c70 Void
   +0x080 luidEndSession   : _LUID
   +0x088 luidUser         : _LUID
   +0x090 psidUser         : 0xfffff900`c402afe0 Void

If we dt the spwndClipboardListener, there is a field that shows the next listener named spwndClipboardListenerNext:

kd> dt 0xfffff900`c1a53440 tagWND spwndClipboardListenerNext
   +0x118 spwndClipboardListenerNext : 0xfffff900`c1a50080 tagWND

When you reach the last clipboard format listener’s tagWND, its spwndClipboardListenerNext value will be null:

kd> dt 0xfffff900`c1a50080 tagWND spwndClipboardListenerNext
   +0x118 spwndClipboardListenerNext : (null)

Using this window address, we can go through the same steps as above to identify this listener’s process name.  As mentioned earlier, since tagWND is a kernel structure, the OS is maintaining these spwndClipboardListener/spwndClipboardListenerNext pointers, so they aren’t susceptible to the chain problems of clipboard viewers.

That wraps up our clipboard coverage. I hope you found it informative.  Want to learn more about monitoring the clipboard?  This MSDN article is a good resource.

-Matt Burrough

How the Clipboard Works, Part 1

This is an archive post of content I wrote for the NTDebugging Blog on MSDN. Since the MSDN blogs are being retired, I’m transferring my posts here so they aren’t lost. The post has been back-dated to its original publication date.

Recently, I had the opportunity to debug the clipboard in Windows, and I thought I’d share some of the things I learned.  The clipboard is one of those parts of Windows that many of us use dozens (hundreds?) of times a day and don’t really think about.  Before working on this case, I had never even considered how it works under the hood.  It turns out that there’s more to it than you might think.  In this first article, I’ll describe how applications store different types of data to the clipboard and how it is retrieved.  My next post will describe how applications can hook the clipboard to monitor it for changes.  In both, I’ll include debug notes to show you how to access the data from the debugger.

Let’s start by discussing clipboard formats.  A clipboard format is used to describe what type of data is placed on the clipboard.  There are a number of predefined standard formats that an application can use, such as bitmap, ANSI text, Unicode text, and TIFF.  Windows also allows an application to specify its own formats. For example, a word processor may want to register a format that includes text, formatting, and images.  Of course – this leads to one problem – what happens if you want to copy from your word processor and paste into Notepad, which doesn’t understand all of the formatting and pictures?

The answer is to allow multiple formats to be stored in the clipboard at one time.  When I used to think of the clipboard, I thought of it as a single object (“my text” or “my image”) but in reality, the clipboard usually has my data in several different forms so when I paste, the destination program gets a format it can use.

So how does this data end up on the clipboard?  Simple – an application first takes ownership of the clipboard via the OpenClipboard function.   Once it has done that, it can empty the clipboard with EmptyClipboard.  Finally, it is ready to place data on the clipboard using SetClipboardData.  SetClipboardData takes two parameters – the first is the identifier of one of the clipboard formats we discussed above.  The second is a handle to the memory containing the data in that format.  The application can continue to call SetClipboardData for each of the various formats it wishes to provide going from best to worst (since the destination application will select the first format in the list it recognizes).  To make things easier for the developer, Windows will automatically provide converted formats for some clipboard format types.  Once the program is done, it calls CloseClipboard.

When a user hits paste, the destination application will call OpenClipboard and one of these functions to determine what data format(s) are available: IsClipboardFormatAvailable, GetPriorityClipboardFormat, or EnumClipboardFormats.  Assuming the application finds a format it can use, it will then call GetClipboardData with the desired format identifier as a parameter to get a handle to the data.  Finally, it will call CloseClipboard.

Now let’s take a look at how you can find what data being written to the clipboard using the debugger.  (Note that all of my notes are taken from a Win7/2008 R2 system – so things might vary slightly in different versions of the OS.)   Since the clipboard is part of Win32k.sys, you’ll need to use a kernel debugger.  I like to use win32k!InternalSetClipboardData+0xe4 as a breakpoint.  The nice thing about this offset is that it is right after we’ve populated the RDI register with data from SetClipboardData in a structure known as tagCLIP.

kd> u win32k!InternalSetClipboardData+0xe4-c L5
fffff960`0011e278 894360          mov     dword ptr [rbx+60h],eax
fffff960`0011e27b 8937            mov     dword ptr [rdi],esi
fffff960`0011e27d 4c896708        mov     qword ptr [rdi+8],r12
fffff960`0011e281 896f10          mov     dword ptr [rdi+10h],ebp
fffff960`0011e284 ff15667e1900    call    qword ptr [win32k!_imp_PsGetCurrentProcessWin32Process (fffff960`002b60f0)]

kd> dt win32k!tagCLIP
   +0x000 fmt              : Uint4B
   +0x008 hData            : Ptr64 Void
   +0x010 fGlobalHandle    : Int4B

Here’s what a call to SetClipboardData from Notepad looks like:

kd> k
Child-SP          RetAddr           Call Site
fffff880`0513a940 fffff960`0011e14f win32k!InternalSetClipboardData+0xe4
fffff880`0513ab90 fffff960`000e9312 win32k!SetClipboardData+0x57
fffff880`0513abd0 fffff800`01482ed3 win32k!NtUserSetClipboardData+0x9e
fffff880`0513ac20 00000000`7792e30a nt!KiSystemServiceCopyEnd+0x13
00000000`001dfad8 00000000`7792e494 USER32!ZwUserSetClipboardData+0xa
00000000`001dfae0 000007fe`fc5b892b USER32!SetClipboardData+0xdf
00000000`001dfb20 000007fe`fc5ba625 COMCTL32!Edit_Copy+0xdf
00000000`001dfb60 00000000`77929bd1 COMCTL32!Edit_WndProc+0xec9
00000000`001dfc00 00000000`779298da USER32!UserCallWinProcCheckWow+0x1ad
00000000`001dfcc0 00000000`ff5110bc USER32!DispatchMessageWorker+0x3b5
00000000`001dfd40 00000000`ff51133c notepad!WinMain+0x16f
00000000`001dfdc0 00000000`77a2652d notepad!DisplayNonGenuineDlgWorker+0x2da
00000000`001dfe80 00000000`77b5c521 kernel32!BaseThreadInitThunk+0xd
00000000`001dfeb0 00000000`00000000 ntdll!RtlUserThreadStart+0x1d

So here, we can dt RDI as a tagCLIP to see what was written:

kd> dt win32k!tagCLIP @rdi
   +0x000 fmt              : 0xd
   +0x008 hData            : 0x00000000`00270235 Void
   +0x010 fGlobalHandle    : 0n1

Fmt is the clipboard format.  0xd is 13, which indicates this data is Unicode text.  We can’t just ‘du’ the value in hData, however, because this is a handle, not a direct pointer to the data.  So now we need to look up the handle.  To do that, we need to look at a win32k global structure – gSharedInfo:

kd> ?win32k!gSharedInfo
Evaluate expression: -7284261440224 = fffff960`002f3520 
kd> dt win32k!tagSHAREDINFO fffff960`002f3520
   +0x000 psi              : 0xfffff900`c0980a70 tagSERVERINFO
   +0x008 aheList          : 0xfffff900`c0800000 _HANDLEENTRY
   +0x010 HeEntrySize      : 0x18
   +0x018 pDispInfo        : 0xfffff900`c0981e50 tagDISPLAYINFO
   +0x020 ulSharedDelta    : 0
   +0x028 awmControl       : [31] _WNDMSG
   +0x218 DefWindowMsgs    : _WNDMSG
   +0x228 DefWindowSpecMsgs : _WNDMSG

aheList in gSharedInfo contains an array of handle entries, and the last 2 bytes of hData multiplied by the size of a handle entry the address of our handle entry:

kd> ?0x00000000`00270235 & FFFF
Evaluate expression: 565 = 00000000`00000235
kd> ??sizeof(win32k!_HANDLEENTRY)
unsigned int64 0x18
kd> ? 0xfffff900`c0800000 + (0x235*0x18)
Evaluate expression: -7693351766792 = fffff900`c08034f8

kd> dt win32k!_HANDLEENTRY fffff900`c08034f8
   +0x000 phead            : 0xfffff900`c0de0fb0 _HEAD
   +0x008 pOwner           : (null) 
   +0x010 bType            : 0x6 ''
   +0x011 bFlags           : 0 ''
   +0x012 wUniq            : 0x27

If we look in phead at offset 14, we’ll get our data:

kd> du fffff900`c0de0fb0 + 0x14
fffff900`c0de0fc4  "Hi NTDebugging readers!"

Let’s consider one other scenario.  I copied some text out of Wordpad, and a number of SetClipboardData calls were made to accommodate different formats. The Unicode format entry looks like this:

Breakpoint 0 hit
fffff960`0011e284 ff15667e1900    call    qword ptr [win32k!_imp_PsGetCurrentProcessWin32Process (fffff960`002b60f0)]
kd> dt win32k!tagCLIP @rdi
   +0x000 fmt              : 0xd
   +0x008 hData            : (null) 
   +0x010 fGlobalHandle    : 0n0

hData is null!  Why is that?  It turns out that the clipboard allows an application to pass in null to SetClipboardData for a given format.  This indicates that the application can provide the data in that format, but is deferring doing so until it is actually needed.  Sure enough, if I paste the text into Notepad, which needs the text in Unicode, Windows sends a WM_RENDERFORMAT message to the WordPad window, and WordPad provides the data in the new format.  Of course, if the application exits before populating all of its formats, Windows needs all of the formats rendered.  In this case, Windows will send the WM_RENDERALLFORMATS message so other applications can use the clipboard data after the source application has exited.

That’s all for now.  Next time we’ll look at how applications can monitor the clipboard for changes using two hooks.  If you want to know more about using the clipboard in your code, this is a great reference.

-Matt Burrough

Part 2 of this article can be found here: How the Clipboard Works, Part 2

Debugging Backwards: Proving root cause

This is an archive post of content I wrote for the NTDebugging Blog on MSDN. Since the MSDN blogs are being retired, I’m transferring my posts here so they aren’t lost. The post has been back-dated to its original publication date.

Matt Burrough here again.  On rare occasions when debugging, we’ll actually know (or strongly suspect) what the root cause of a problem is at the beginning of our analysis – but we still need to investigate to confirm our assertion.  The following is a case study for an issue I worked on recently where the print spooler was hanging.

This customer had recently upgraded their print servers from Windows 2003 to Windows 2008 R2.  After the upgrade, the spooler would frequently go unresponsive, and no jobs could be printed.  Rebooting the server provided some relief, but the problem would reoccur as the jobs started coming in again.

One of my peers had completed an initial analysis of a user mode memory dump of the spooler process, and found that spooler seemed to be blocked waiting on PrintIsolationHost.exe.  For those not familiar with recent changes to the print spooler’s design – Print Isolation Host was added in Windows 7/2008 R2 as a way to isolate print drivers from each other and from the spooler process so a crash in one driver doesn’t take down the entire printing environment.  Given the large number of print drivers found on some print servers, this can be a great help for stability and availability of the spooler.  See http://blogs.technet.com/b/askperf/archive/2009/10/08/windows-7-windows-server-2008-r2-print-driver-isolation.aspx if you would like more details on Print Isolation Host.

Unfortunately for my team mate, the data collected did not include dumps of Print Isolation Host, so he requested that the next time the problem happened that both spooler and PrintIsolationHost dumps would be gathered.  The customer had configured his server for the “Isolated” Print Isolation Host option during troubleshooting, which places each driver in its own process.  (The default is shared which places all drivers in one PrintIsolationHost.exe instance.  Driver isolation is configured using the Print Management Console.)

The next morning, the newly requested data came in, and since the problem was urgent, I began looking at the new dumps immediately.  This dataset included both a spoolsv.exe dump, as well as nearly two dozen PrintIsolationHost dumps!  I knew from the past analysis that I should start with the PrintIsolationHost data – but where to begin?  In order to triage the dumps, I wrote a small batch file to open each dump (luckily, they were sequentially numbered), write the call stacks in each thread to a file, and close the log.  On every iteration, the script created a new cmd.txt file, which contained a set of commands that were passed to the debugger.  This allowed me to name the debugger output files sequentially with names that matched their dump (e.g. PrintIsolationHost1.txt contained output from PrintIsolationHost1.dmp).

set x=1
echo .reload > c:\data\cmd.txt
echo .logopen c:\data\PrintIsolationHost%x%.txt  >> c:\data\cmd.txt
echo ~*kc >> c:\data\cmd.txt
echo .logclose >> c:\data\cmd.txt
echo qq >> c:\data\cmd.txt
c:\debuggers\kd.exe -cf c:\data\cmd.txt -z "c:\data\PrintIsolationHost%x%.DMP"
set /A x+=1
IF %x% LEQ 23 GOTO Top

Now that I had a directory full of text files, I used my favorite differencing tool to compare the stacks in each text file.  I used the first PrintIsolationHost file as a reference.  It only had four stacks, and these were common to all of the other files:

.  0  Id: 40c.1c28 Suspend: 0 Teb: 000007ff`fffde000 Unfrozen
Call Site

   1  Id: 40c.23c0 Suspend: 0 Teb: 000007ff`fffdb000 Unfrozen
Call Site

   2  Id: 40c.2598 Suspend: 0 Teb: 000007ff`fffd3000 Unfrozen
Call Site

   3  Id: 40c.820 Suspend: 0 Teb: 000007ff`fffd9000 Unfrozen
Call Site

I was able to rule out a number of other PrintIsolationHost instances that were either identical to this one (aside from the process/thread IDs and Tebs), or which just had one or two additional idle worker threads (like thread 3 above).

Things got interesting when I looked at two of the PrintIsolationHost dumps.  Both had these two stacks not found in any other the other dumps (note that I do not have symbols for the 3rd-party print processor ProseWarePrintProcessor):

   2  Id: 20a4.1328 Suspend: 0 Teb: 000007ff`fffda000 Unfrozen
Call Site
*** ERROR: Symbol file could not be found.  Defaulted to export symbols for ProseWarePrintProcessor.dll -

   6  Id: 20a4.1668 Suspend: 0 Teb: 000007ff`fffac000 Unfrozen
Call Site

Interesting.  Thread 6 is running the DllMain code for the ProseWarePrintProcessor DLL which holds the loader lock. It is waiting on a critical section.  Meanwhile, thread 2 is waiting on the loader lock.  So who holds the critical section that thread 6 wants?  First, let’s find what lock thread 6 wants:

0:006> kn
 # Child-SP          RetAddr           Call Site
00 00000000`0104eb18 00000000`777fe4e8 ntdll!ZwWaitForSingleObject+0xa
01 00000000`0104eb20 00000000`777fe3db ntdll!RtlpWaitOnCriticalSection+0xe8
02 00000000`0104ebd0 00000000`750c5d6b ntdll!RtlEnterCriticalSection+0xd1
03 00000000`0104ec00 00000000`750c6256 ProseWarePrintProcessor+0xabf
04 00000000`0104ec30 00000000`750c7015 ProseWarePrintProcessor+0xfaa
05 00000000`0104f090 00000000`777dc76c ProseWarePrintProcessor+0x1d69
06 00000000`0104f1f0 00000000`777dc42f ntdll!LdrpInitializeThread+0x17c
07 00000000`0104f2f0 00000000`777dc32e ntdll!LdrpInitialize+0x9f
08 00000000`0104f360 00000000`00000000 ntdll!LdrInitializeThunk+0xe
0:006> .frame /c /r 2
02 00000000`0104ebd0 00000000`750c5d6b ntdll!RtlEnterCriticalSection+0xd1
rax=0000000300d1001a rbx=00000000750ca330 rcx=00000000001d0000
rdx=0000000000000040 rsi=0000000000000001 rdi=0000000000000004
rip=00000000777fe3db rsp=000000000104ebd0 rbp=00000000ff000000
 r8=00000000002c6a00  r9=00000000002c6a10 r10=0000000000000073
r11=0000000000000001 r12=000007fffffd6000 r13=00000000778e2660
r14=0000000000000001 r15=000000007780a280
iopl=0         nv up ei pl zr na po nc
cs=0033  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00000246
00000000`777fe3db 83f801          cmp     eax,1
0:006> ub 00000000`750c5d6b      <----------- Let's see what gets passed to RtlEnterCriticalSection
00000000`750c5d4e f9              stc
00000000`750c5d4f ff743f33        push    qword ptr [rdi+rdi+33h]
00000000`750c5d53 d2ff            sar     bh,cl
00000000`750c5d55 15aeb4ffff      adc     eax,0FFFFB4AEh
00000000`750c5d5a 85c0            test    eax,eax
00000000`750c5d5c 7533            jne     ProseWarePrintProcessor+0xae5 (00000000`750c5d91)
00000000`750c5d5e 488d0dcb450000  lea     rcx,[ProseWarePrintProcessor+0x5084 (00000000`750ca330)] <----- Critical section is in rcx
00000000`750c5d65 ff15bdb3ffff    call    qword ptr [ProseWarePrintProcessor+0x1128 (00000000`750c1128)]
0:006> u ntdll!RtlEnterCriticalSection
00000000`77802fc0 fff3            push    rbx
00000000`77802fc2 4883ec20        sub     rsp,20h
00000000`77802fc6 f00fba710800    lock btr dword ptr [rcx+8],0
00000000`77802fcc 488bd9          mov     rbx,rcx   <------------------ Critical section is saved in RBX.  RBX isn't modified between here and our current position
00000000`77802fcf 0f83e9b1ffff    jae     ntdll!RtlEnterCriticalSection+0x31 (00000000`777fe1be)
00000000`77802fd5 65488b042530000000 mov   rax,qword ptr gs:[30h]
00000000`77802fde 488b4848        mov     rcx,qword ptr [rax+48h]
00000000`77802fe2 c7430c01000000  mov     dword ptr [rbx+0Ch],1
0:006> r rbx
Last set context:
rbx=00000000750ca330  <----- This is the address of the critical section thread 6 is waiting for

Now let’s look at the threads in this process, and the held locks:

0:002> ~
#  0  Id: 20a4.180c Suspend: 0 Teb: 000007ff`fffde000 Unfrozen
   1  Id: 20a4.2294 Suspend: 0 Teb: 000007ff`fffdc000 Unfrozen
.  2  Id: 20a4.1328 Suspend: 0 Teb: 000007ff`fffda000 Unfrozen
   3  Id: 20a4.1c34 Suspend: 0 Teb: 000007ff`fffd8000 Unfrozen
   4  Id: 20a4.5bc Suspend: 0 Teb: 000007ff`fffd4000 Unfrozen
   5  Id: 20a4.3c4 Suspend: 0 Teb: 000007ff`fffae000 Unfrozen
   6  Id: 20a4.1668 Suspend: 0 Teb: 000007ff`fffac000 Unfrozen
0:002> !locks

CritSec ntdll!LdrpLoaderLock+0 at 00000000778e7490
WaiterWoken        No
LockCount          2
RecursionCount     1
OwningThread       1668
EntryCount         0
ContentionCount    2
*** Locked

CritSec ProseWarePrintProcessor+5084 at 00000000750ca330
WaiterWoken        No
LockCount          1
RecursionCount     2
OwningThread       1328
EntryCount         0
ContentionCount    1
*** Locked

Scanned 201 critical sections

That’s not good!  We can see that thread 6 indeed owns the loader lock, which thread 2 is waiting for.  But thread 2 owns the ProseWarePrintProcessor lock – and thread 6 is waiting for it!  This is a classic deadlock.  In fact, Raymond Chen even described this on his blog.  More information about LdrInitialize can be found here.

Deadlock chain
Deadlock chain

So we know that there is a deadlock in the Print Isolation Host, but why exactly does this cause spooler to hang?  Here’s where we work backwards.  We know that thread 6 is handling DLL initialization, but what is thread 2 doing?  From the stack we can see it is handling an RPC request that called into ProseWarePrintProcessor.  Let’s determine who called into this thread.

0:002> kn
 # Child-SP          RetAddr           Call Site
00 00000000`0085dd88 00000000`777fe4e8 ntdll!ZwWaitForSingleObject+0xa
01 00000000`0085dd90 00000000`777fe3db ntdll!RtlpWaitOnCriticalSection+0xe8
02 00000000`0085de40 00000000`777db9e7 ntdll!RtlEnterCriticalSection+0xd1
03 00000000`0085de70 000007fe`fd963706 ntdll!LdrLockLoaderLock+0x6d
04 00000000`0085deb0 00000000`750c58f3 KERNELBASE!GetModuleFileNameW+0x96
05 00000000`0085df10 00000000`750c5d77 ProseWarePrintProcessor!InstallPrintProcessor+0x647
06 00000000`0085e380 00000000`750c51d9 ProseWarePrintProcessor!InstallPrintProcessor+0xacb
07 00000000`0085e3b0 000007fe`f96766b4 ProseWarePrintProcessor!ControlPrintProcessor+0x25
08 00000000`0085e3e0 000007fe`fe2f23d5 PrintIsolationProxy!sandbox::PrintProcessor::ControlPrintProcessor+0x34
09 00000000`0085e420 000007fe`fe39b68e rpcrt4!Invoke+0x65
0a 00000000`0085e470 000007fe`fe2f48d6 rpcrt4!Ndr64StubWorker+0x61b
0b 00000000`0085ea30 000007fe`fdd50883 rpcrt4!NdrStubCall3+0xb5
0c 00000000`0085ea90 000007fe`fdd50ccd ole32!CStdStubBuffer_Invoke+0x5b
0d 00000000`0085eac0 000007fe`fdd50c43 ole32!SyncStubInvoke+0x5d
0e 00000000`0085eb30 000007fe`fdc0a4f0 ole32!StubInvoke+0xdb
0f 00000000`0085ebe0 000007fe`fdd514d6 ole32!CCtxComChnl::ContextInvoke+0x190
10 00000000`0085ed70 000007fe`fdd5122b ole32!AppInvoke+0xc2
11 00000000`0085ede0 000007fe`fdd4fd6d ole32!ComInvokeWithLockAndIPID+0x52b
12 00000000`0085ef70 000007fe`fe2e50f4 ole32!ThreadInvoke+0x30d
13 00000000`0085f010 000007fe`fe2e4f56 rpcrt4!DispatchToStubInCNoAvrf+0x14
14 00000000`0085f040 000007fe`fe2e775b rpcrt4!RPC_INTERFACE::DispatchToStubWorker+0x146
15 00000000`0085f160 000007fe`fe2e769b rpcrt4!RPC_INTERFACE::DispatchToStub+0x9b
16 00000000`0085f1a0 000007fe`fe2e7632 rpcrt4!RPC_INTERFACE::DispatchToStubWithObject+0x5b
17 00000000`0085f220 000007fe`fe2e532d rpcrt4!LRPC_SCALL::DispatchRequest+0x422
18 00000000`0085f300 000007fe`fe302e7f rpcrt4!LRPC_SCALL::HandleRequest+0x20d
19 00000000`0085f430 000007fe`fe302a35 rpcrt4!LRPC_ADDRESS::ProcessIO+0x3bf
1a 00000000`0085f570 00000000`777cb68b rpcrt4!LrpcIoComplete+0xa5
1b 00000000`0085f600 00000000`777cfeff ntdll!TppAlpcpExecuteCallback+0x26b
1c 00000000`0085f690 00000000`776a652d ntdll!TppWorkerThread+0x3f8
1d 00000000`0085f990 00000000`777dc521 kernel32!BaseThreadInitThunk+0xd
1e 00000000`0085f9c0 00000000`00000000 ntdll!RtlUserThreadStart+0x1d

I know that the code in frame 19 deals with processing the RPC and has a record of the calling process’ PID and TID.  In fact, from a bit of code review, I know that at this portion of the code, we get back a value that contains a ntdll!_CLIENT_ID structure at offset 8:

000007fe`fe302ba6 ff151c050a00    call    qword ptr [rpcrt4!_imp_AlpcGetMessageFromCompletionList (000007fe`fe3a30c8)]
000007fe`fe302bac 4885c0          test    rax,rax
000007fe`fe302baf 0f84d1020000    je      rpcrt4!LRPC_ADDRESS::ProcessIO+0x3c6 (000007fe`fe302e86)
000007fe`fe302bb5 4c8bac2488000000 mov     r13,qword ptr [rsp+88h]
000007fe`fe302bbd 41bc01000000    mov     r12d,1
000007fe`fe302bc3 33d2            xor     edx,edx
000007fe`fe302bc5 488bf8          mov     rdi,rax
000007fe`fe302bc8 41bfff000000    mov     r15d,0FFh

Reviewing the assembly, from ProcessIO+0xe6 to where we are now (ProcessIO+0x3bf), we don’t modify rdi again – and rdi is nonvolatile – so if we switch to that frame and check out rdi+8, we’ll know who called this thread!

0:002> .frame /c /r 19
19 00000000`0085f430 000007fe`fe302a35 rpcrt4!LRPC_ADDRESS::ProcessIO+0x3bf
rax=0000e5a47e7646ad rbx=0000000000000001 rcx=00000000750ca330
rdx=0000000000000000 rsi=00000000002b8aa0 rdi=00000000002c4a80
rip=000007fefe302e7f rsp=000000000085f430 rbp=0000000000ecff90
 r8=000000000085e2d8  r9=0000000000000002 r10=0000000000000000
r11=0000000000000246 r12=0000000000ec8bf0 r13=0000000000000000
r14=0000000000000000 r15=0000000000ec8080
iopl=0         nv up ei pl zr na po nc
cs=0033  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00000246
000007fe`fe302e7f 4c8d357ad1fbff  lea     r14,[rpcrt4!COMMON_ResubmitListen <PERF> (rpcrt4+0x0) (000007fe`fe2c0000)]
0:002> dt _CLIENT_ID rdi+8
   +0x000 UniqueProcess    : 0x00000000`00002694 Void
   +0x008 UniqueThread     : 0x00000000`000005e8 Void
0:002> ? 0x00000000`00002694
Evaluate expression: 9876 = 00000000`00002694
0:002> ? 0x00000000`000005e8
Evaluate expression: 1512 = 00000000`000005e8

So now we know that the caller was process 0x2694 and thread 0x5e8, or 9876 and 1512 in decimal, respectively.  Our current process (PrintIsolationHost.exe) is PID 20a4 (see above ~ output), or 8356 decimal.  So who is 9876?  I happen to have a Process Listing from the dump of the data collection:

[0]  0 64 9876 spoolsv.exe     Svcs:  Spooler
     Command Line: C:\Windows\System32\spoolsv.exe
[0]  0 64 8356 PrintIsolationHost.exe 
     Command Line: C:\Windows\system32\PrintIsolationHost.exe -Embedding

Okay, so I know the caller is thread 5e8 in spooler.  Loading up the spooler dump, what is that thread doing?

0:000> ~~[5e8]s
00000000`77801b6a c3              ret
# Child-SP          RetAddr           Call Site
00 00000000`0649d898 000007fe`fe2fa776 ntdll!NtAlpcSendWaitReceivePort+0xa
01 00000000`0649d8a0 000007fe`fe2f4e42 rpcrt4!LRPC_CCALL::SendReceive+0x156
02 00000000`0649d960 000007fe`fdd528c0 rpcrt4!I_RpcSendReceive+0x42
03 00000000`0649d990 000007fe`fdd5282f ole32!ThreadSendReceive+0x40
04 00000000`0649d9e0 000007fe`fdd5265b ole32!CRpcChannelBuffer::SwitchAptAndDispatchCall+0xa3
05 00000000`0649da80 000007fe`fdc0daaa ole32!CRpcChannelBuffer::SendReceive2+0x11b
06 00000000`0649dc40 000007fe`fdc0da0c ole32!CAptRpcChnl::SendReceive+0x52
07 00000000`0649dd10 000007fe`fdd5205d ole32!CCtxComChnl::SendReceive+0x68
08 00000000`0649ddc0 000007fe`fe39b949 ole32!NdrExtpProxySendReceive+0x45
09 00000000`0649ddf0 000007fe`fdd521d0 rpcrt4!NdrpClientCall3+0x2e2
0a 00000000`0649e0b0 000007fe`fdc0d8a2 ole32!ObjectStublessClient+0x11d
0b 00000000`0649e440 000007fe`fb059070 ole32!ObjectStubless+0x42
0c 00000000`0649e490 000007fe`fb057967 localspl!sandbox::PrintProcessorExecuteObserver::ControlPrintProcessor+0x10
0d 00000000`0649e4c0 000007fe`fb055e27 localspl!sandbox::PrintProcessorAdapterImpl::ControlPrintProcessor+0x27
0e 00000000`0649e4f0 000007fe`faff7392 localspl!sandbox::PrintProcessorAdapter::ControlPrintProcessor+0x1b
0f 00000000`0649e520 000007fe`faff8a0a localspl!DeleteJob+0x1ce
10 00000000`0649eae0 00000000`ff41fe25 localspl!SplSetJob+0x49e
11 00000000`0649eb80 000007fe`f9683603 spoolsv!SetJobW+0x25
12 00000000`0649ebc0 00000000`61001ce1 spoolss!SetJobW+0x1f
13 00000000`0649ec00 00000000`61001d7d Contoso!InitializePrintMonitor+0x781
14 00000000`0649ec40 000007fe`faffa674 Contoso!InitializePrintMonitor+0x81d
15 00000000`0649ec70 00000000`ff41c9c7 localspl!SplEndDocPrinter+0x214
16 00000000`0649ecd0 00000000`ff403ba6 spoolsv!EndDocPrinter+0x1f
17 00000000`0649ed00 00000000`ff3fe772 spoolsv!YEndDocPrinter+0x22
18 00000000`0649ed30 000007fe`fe2f23d5 spoolsv!RpcEndDocPrinter+0x3e
19 00000000`0649ed60 000007fe`fe39b68e rpcrt4!Invoke+0x65
1a 00000000`0649edb0 000007fe`fe2dac40 rpcrt4!Ndr64StubWorker+0x61b
1b 00000000`0649f370 000007fe`fe2e50f4 rpcrt4!NdrServerCallAll+0x40
1c 00000000`0649f3c0 000007fe`fe2e4f56 rpcrt4!DispatchToStubInCNoAvrf+0x14
1d 00000000`0649f3f0 000007fe`fe2e5679 rpcrt4!RPC_INTERFACE::DispatchToStubWorker+0x146
1e 00000000`0649f510 000007fe`fe2e532d rpcrt4!LRPC_SCALL::DispatchRequest+0x149
1f 00000000`0649f5f0 000007fe`fe302e7f rpcrt4!LRPC_SCALL::HandleRequest+0x20d
20 00000000`0649f720 000007fe`fe302a35 rpcrt4!LRPC_ADDRESS::ProcessIO+0x3bf
21 00000000`0649f860 00000000`777cb68b rpcrt4!LrpcIoComplete+0xa5
22 00000000`0649f8f0 00000000`777cfeff ntdll!TppAlpcpExecuteCallback+0x26b
23 00000000`0649f980 00000000`776a652d ntdll!TppWorkerThread+0x3f8
24 00000000`0649fc80 00000000`777dc521 kernel32!BaseThreadInitThunk+0xd
25 00000000`0649fcb0 00000000`00000000 ntdll!RtlUserThreadStart+0x1d

It’s calling into the print isolation host as we expect.  It looks like it is doing that to end a print job, based on an RPC call it received.  Using our same method as last time, let’s pull out the PID and TID he is responding to:

0:048> .frame /c /r 20
20 00000000`0649f720 000007fe`fe302a35 rpcrt4!LRPC_ADDRESS::ProcessIO+0x3bf
rax=000000000622986c rbx=0000000000000000 rcx=000000000622985c
rdx=000007fefe3a3c40 rsi=00000000027e5150 rdi=0000000008e6abd0
rip=000007fefe302e7f rsp=000000000649f720 rbp=0000000008e43480
 r8=0000000000000010  r9=0000000000000000 r10=000007fefe2c0000
r11=0000000000000002 r12=000000000512d8c0 r13=0000000000000000
r14=0000000000000000 r15=00000000102e3710
iopl=0         nv up ei pl zr na po nc
cs=0033  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00000246
000007fe`fe302e7f 4c8d357ad1fbff  lea     r14,[rpcrt4!COMMON_ResubmitListen <PERF> (rpcrt4+0x0) (000007fe`fe2c0000)]
0:048> dt _CLIENT_ID rdi+8
   +0x000 UniqueProcess    : 0x00000000`000020a4 Void
   +0x008 UniqueThread     : 0x00000000`00001c34 Void

Look at that, it’s doing work for a different thread back in our Print Isolation Host.  Switching back to that dump, what is thread 1c34 doing?

0:002> ~~[1c34]s
00000000`77801b6a c3  
0:003> kn
 # Child-SP          RetAddr           Call Site
00 00000000`00ebc098 000007fe`fe2fa776 ntdll!NtAlpcSendWaitReceivePort+0xa
01 00000000`00ebc0a0 000007fe`fe39cc74 rpcrt4!LRPC_CCALL::SendReceive+0x156
02 00000000`00ebc160 000007fe`fe39cf25 rpcrt4!NdrpClientCall3+0x244
03 00000000`00ebc420 000007fe`f9bfd878 rpcrt4!NdrClientCall3+0xf2
04 00000000`00ebc7b0 000007fe`f96845bf winspool!EndDocPrinter+0x15c
05 00000000`00ebc7f0 00000000`750c4102 spoolss!EndDocPrinter+0x2f
06 00000000`00ebc820 00000000`750c5013 ProseWarePrintProcessor+0x4102
07 00000000`00ebe620 000007fe`f9676be2 ProseWarePrintProcessor!PrintDocumentOnPrintProcessor+0xb3
08 00000000`00ebe650 000007fe`fe2f23d5 PrintIsolationProxy!sandbox::PrintProcessor::PrintDocThroughPrintProcessor+0x82
09 00000000`00ebe6b0 000007fe`fe39b68e rpcrt4!Invoke+0x65
0a 00000000`00ebe710 000007fe`fe2f48d6 rpcrt4!Ndr64StubWorker+0x61b
0b 00000000`00ebecd0 000007fe`fdd50883 rpcrt4!NdrStubCall3+0xb5
0c 00000000`00ebed30 000007fe`fdd50ccd ole32!CStdStubBuffer_Invoke+0x5b
0d 00000000`00ebed60 000007fe`fdd50c43 ole32!SyncStubInvoke+0x5d
0e 00000000`00ebedd0 000007fe`fdc0a4f0 ole32!StubInvoke+0xdb
0f 00000000`00ebee80 000007fe`fdd514d6 ole32!CCtxComChnl::ContextInvoke+0x190
10 00000000`00ebf010 000007fe`fdd5122b ole32!AppInvoke+0xc2
11 00000000`00ebf080 000007fe`fdd4fd6d ole32!ComInvokeWithLockAndIPID+0x52b
12 00000000`00ebf210 000007fe`fe2e50f4 ole32!ThreadInvoke+0x30d
13 00000000`00ebf2b0 000007fe`fe2e4f56 rpcrt4!DispatchToStubInCNoAvrf+0x14
14 00000000`00ebf2e0 000007fe`fe2e775b rpcrt4!RPC_INTERFACE::DispatchToStubWorker+0x146
15 00000000`00ebf400 000007fe`fe2e769b rpcrt4!RPC_INTERFACE::DispatchToStub+0x9b
16 00000000`00ebf440 000007fe`fe2e7632 rpcrt4!RPC_INTERFACE::DispatchToStubWithObject+0x5b
17 00000000`00ebf4c0 000007fe`fe2e532d rpcrt4!LRPC_SCALL::DispatchRequest+0x422
18 00000000`00ebf5a0 000007fe`fe302e7f rpcrt4!LRPC_SCALL::HandleRequest+0x20d
19 00000000`00ebf6d0 000007fe`fe302a35 rpcrt4!LRPC_ADDRESS::ProcessIO+0x3bf
1a 00000000`00ebf810 00000000`777cb68b rpcrt4!LrpcIoComplete+0xa5
1b 00000000`00ebf8a0 00000000`777cfeff ntdll!TppAlpcpExecuteCallback+0x26b
1c 00000000`00ebf930 00000000`776a652d ntdll!TppWorkerThread+0x3f8
1d 00000000`00ebfc30 00000000`777dc521 kernel32!BaseThreadInitThunk+0xd
1e 00000000`00ebfc60 00000000`00000000 ntdll!RtlUserThreadStart+0x1d

This thread called into spooler to end a document based on yet another RPC!  To be clear, this is what we’re looking at so far:

Thread Chain
Thread Chain

Again, who called into this thread?

0:003> .frame /c /r 19
19 00000000`00ebf6d0 000007fe`fe302a35 rpcrt4!LRPC_ADDRESS::ProcessIO+0x3bf
rax=57633d3cd2c9c145 rbx=0000000000000000 rcx=0000000000861ff0
rdx=ffffffffff88ffe0 rsi=00000000002b8aa0 rdi=0000000000ec6bc0
rip=000007fefe302e7f rsp=0000000000ebf6d0 rbp=00000000002cdef0
 r8=000000000003dbf0  r9=00000000000000fe r10=6dc0575d3d3cc4c8
r11=0000000000860020 r12=0000000000ec82b0 r13=0000000000000000
r14=0000000000000000 r15=0000000000ec8080
iopl=0         nv up ei pl zr na po nc
cs=0033  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00000246
000007fe`fe302e7f 4c8d357ad1fbff  lea     r14,[rpcrt4!COMMON_ResubmitListen <PERF> (rpcrt4+0x0) (000007fe`fe2c0000)]
0:003> dt _CLIENT_ID rdi+8
   +0x000 UniqueProcess    : 0x00000000`00002694 Void
   +0x008 UniqueThread     : 0x00000000`000014cc Void

So another call from spooler (note the same PID as earlier) – let’s go back to spoolsv.dmp.  Here is this thread:

0:048> ~~[14cc]s
00000000`77801b6a c3              ret
0:049> k
Child-SP          RetAddr           Call Site
00000000`0351e538 000007fe`fe2fa776 ntdll!NtAlpcSendWaitReceivePort+0xa
00000000`0351e540 000007fe`fe2f4e42 rpcrt4!LRPC_CCALL::SendReceive+0x156
00000000`0351e600 000007fe`fdd528c0 rpcrt4!I_RpcSendReceive+0x42
00000000`0351e630 000007fe`fdd5282f ole32!ThreadSendReceive+0x40
00000000`0351e680 000007fe`fdd5265b ole32!CRpcChannelBuffer::SwitchAptAndDispatchCall+0xa3
00000000`0351e720 000007fe`fdc0daaa ole32!CRpcChannelBuffer::SendReceive2+0x11b
00000000`0351e8e0 000007fe`fdc0da0c ole32!CAptRpcChnl::SendReceive+0x52
00000000`0351e9b0 000007fe`fdd5205d ole32!CCtxComChnl::SendReceive+0x68
00000000`0351ea60 000007fe`fe39b949 ole32!NdrExtpProxySendReceive+0x45
00000000`0351ea90 000007fe`fdd521d0 rpcrt4!NdrpClientCall3+0x2e2
00000000`0351ed50 000007fe`fdc0d8a2 ole32!ObjectStublessClient+0x11d
00000000`0351f0e0 000007fe`fb0591ac ole32!ObjectStubless+0x42
00000000`0351f130 000007fe`fb057882 localspl!sandbox::PrintProcessorExecuteObserver::PrintDocThroughPrintProcessor+0x124
00000000`0351f1f0 000007fe`fb05601d localspl!sandbox::PrintProcessorAdapterImpl::PrintDocumentOnPrintProcessor+0x3a
00000000`0351f220 000007fe`fb013b70 localspl!sandbox::PrintProcessorAdapter::PrintDocumentOnPrintProcessor+0x9d
00000000`0351f270 000007fe`fb014c7c localspl!PrintDocumentThruPrintProcessor+0x46c
00000000`0351fa70 00000000`776a652d localspl!PortThread+0x4d0
00000000`0351fd80 00000000`777dc521 kernel32!BaseThreadInitThunk+0xd
00000000`0351fdb0 00000000`00000000 ntdll!RtlUserThreadStart+0x1d

Perfect.  So now we know that localspl was printing a document, which resulted in all of these RPC calls between spooler and Print Isolation Host, and ultimately the deadlock in Print Isolation Host is holding up this thread.  Just out of curiosity, are there any other threads blocked on this wait chain?

0:049> !locks
CritSec +5a00060 at 0000000005a00060
WaiterWoken        No
LockCount          105
RecursionCount     1
OwningThread       5e8
EntryCount         0
ContentionCount    84
*** Locked

Scanned 14148 critical sections

Yes there are.  Thread 5e8, which we looked at earlier, is holding a lock that 104 other threads are waiting for!  This dump had 177 threads, so we know now that thread 5e8, 14cc, and 104 others in spooler are all hung on this deadlock. With about 60% of all the threads in spooler hung, the deadlock in ProseWarePrintProcessor is clearly the cause of our issue.  Here’s the final wait chain diagram:


To resolve the issue, ProseWarePrintProcessor needs to avoid calling GetModuleFileName while its DllMain routine is still running, since the former requires, and the latter holds, the Loader Lock.

Test Mode, and other desktop watermarks

This is an archive post of content I wrote for the NTDebugging Blog on MSDN. Since the MSDN blogs are being retired, I’m transferring my posts here so they aren’t lost. The post has been back-dated to its original publication date.

Hi all, Matt here again.  One of our team’s main functions is to work with our development teams to create hotfixes when customers run into issues that can only be resolved through a code change.  The developers will often prepare a private hotfix that either tests the proposed change, or adds additional instrumentation to help pinpoint the issue. The private hotfix is sent to the customer reporting the problem so they can confirm that it does indeed correct (or identify) the flaw. 

When testing a private hotfix, customers frequently ask, why does my desktop now show a message on the lower right corner of the desktop, and what does it mean?  The message reads “For testing purposes only”, and looks like this:

For testing purposes only Watermark
For testing purposes only Watermark

Often, users are concerned that his message means that they aren’t allowed to use the server in production, or maybe that it is now “unsupported.”  These aren’t the case!  Since this message appears as a result of installing a fix during the course of a Microsoft Support case, the servers are, by definition, being supported. 

The purpose of this message is simply to remind users that code that Microsoft Support has asked them to test has been installed on the system, and this code may not have yet undergone the full suite of quality assurance testing that fixes that are made public do.    

For comparison, let’s look at some of the other watermarks you may find in the lower corner of the desktop – as these can often be confused for the above message, and may explain some of the fears around these messages. 

Safe Mode Watermark
Safe Mode Watermark

First up is the old trusty text you see when a box is booted into ‘Safe Mode’. I’m sure every IT Pro has seen this at one time or another, so I won’t go into detail, but rest assured, the testing purposes text is completely unrelated to booting in safe mode or having a subset of services running.

Evaluation copy Watermark
Evaluation copy Watermark

Next up is our ‘Evaluation copy’ watermark.  This message is shown on the desktops of copies of Windows that have a “time bomb” (ones that will cease to function after a certain date.)  This message is typically seen on beta versions of Windows which are designed to stop functioning sometime after the desired beta testing period ends. 

This copy of Windows is not genuine Watermark
This copy of Windows is not genuine Watermark

Third, we have our Windows is not genuine message.  This is shown if, for example, a copy of Windows is not activated during the grace period after the installation process, or if a number of hardware changes have been made and Windows needs to be re-activated.  This has nothing to do with the ‘testing purposes’ message.  See http://technet.microsoft.com/en-us/library/dd979803.aspx for more information about this message.

Build Number Watermark
Build Number Watermark

Fourth, we have the general Windows build stamp.  This is enabled via the registry using the PaintDesktopVersion DWORD (http://technet.microsoft.com/en-us/library/cc782257(WS.10).aspx).  Some administrators like to enable this option so they always know what version of Windows they are using, sort of like a mini-bginfo.  Unlike all of the others, this message does not indicate anything else about a server’s state.

Test Mode Watermark
Test Mode Watermark

And finally, we have ‘Test Mode’.  This is actually somewhat related to the testing purposes message.  This ‘Test Mode’ text is shown when test signing is enabled on a PC.  This is done by running “bcdedit /set testsigning on” from an UAC-elevated command prompt.  Test signing is used to allow developers to load drivers they are still working on that have not yet been code signed with an official certificate.  This is actually one of the steps we need to do when loading our test fixes.  For more information on Test Signing, see http://msdn.microsoft.com/en-us/library/ff553484%28v=VS.85%29.aspx.

So now that you know what causes these various watermarks to appear, perhaps you’re wondering how to make the “For testing purposes only” message disappear.  This is a question we are frequently asked.  While you are running a test hotfix, there is no way to disable this message.  Your only option is to remove the test hotfix from your system.  This is something your engineer will ask you to do before you install the final, public version of a hotfix.  You can easily identify and uninstall test hotfixes by going into Control Panel, Programs and Features, View Installed Updates, then look for hotfixes with the words FOR TESTING PURPOSES ONLY in their name, like the one shown in the image below.  You may also notice that the KB number listed for these fixes is often a place holder, and not a real KB article ID.

Installed Updates Control Panel
Installed Updates Control Panel

If the ‘For testing purposes only’ message is still displayed even after uninstalling the test hotfix, there is one more place to check.  If a system has the Microsoft Test Root Authority certificate installed into its Trusted Root Certification Authorities store, the text will be displayed.  We use this certificate to allow a PC to run test code that has been signed by our development team, but not yet fully tested and signed off with the official Microsoft certificate.  To remove this certificate from your system, go to Start -> Run, and enter certmgr.msc and hit enter.  In the Certificate Manager MMC, browse to Trusted Root Certification Authorities, then into Certificates.  You should see one labeled Microsoft Test Root Authority, as shown below.  This will need to be deleted and the system rebooted to remove the testing purposes message.  Don’t do this if you still have a test hotfix installed though, as it would prevent that binary from continuing to function and may mean you can no longer boot in to Normal or Safe mode.

certmgr Trusted Root Certs
certmgr Trusted Root Certs

If you reboot and find that ‘Text Mode’ has replaced the ‘For testing purposes only’  text, you’ll need to launch a command prompt with administrative privileges, then run “bcdedit /set testsigning off” and reboot.  You can always check if test signing is enabled by running “bcdedit /enum” and looking for this line:

bcdedit showing testsigning setting
bcdedit showing testsigning setting

That’s all for today.  Hopefully this post helped clear up any confusion about our different desktop watermarks.