This series of posts covers the details of reverse engineering the AddVectoredExceptionHandler function, a Windows API function responsible for registering a special type of exception handler at runtime. The series is split into three parts: first identifying the key structures that are used, second understanding the implementation, and lastly re-implementing the reverse engineered assembly as working C code. This re-implementation will behave identically to the original function and, presumably, under the same compiler options would produce a very similar assembly listing. The reverse engineering was done on Windows 7, so there will be slight differences in the assembly listings if you are following along on a different version. The re-implementation code (part 3) was tested on Windows 7 and 8.1 on x86 and x64, so the high-level details should not change.
Starting out
The goal is to see how AddVectoredExceptionHandler works. This means tracing it from an example program, through the kernel32.dll export, and into ntdll.dll, where the Rtl-prefixed implementation resides. Naturally, the best way to go about doing this is with a debugger. The Visual Studio debugger will be the debugger of choice for this series since we’ll be debugging our own code.
Stepping into the disassembly shows that AddVectoredExceptionHandler calls _RtlAddVectoredExceptionHandler, which in turn is a wrapper for _RtlpAddVectoredHandler. The assembly listing for _RtlAddVectoredExceptionHandler is shown below:
_RtlAddVectoredExceptionHandler@8:
771F742B mov edi,edi
771F742D push ebp
771F742E mov ebp,esp
771F7430 push 0
771F7432 push dword ptr [ebp+0Ch]
771F7435 push dword ptr [ebp+8]
771F7438 call _RtlpAddVectoredHandler@12 (771E3621h)
771F743D pop ebp
771F743E ret 8
This code simply pushes an extra constant parameter and invokes _RtlpAddVectoredHandler(FirstHandler, VectoredHandler, 0). The actual details reside in _RtlpAddVectoredHandler, reproduced in its entirety below:
_RtlpAddVectoredHandler@12:
771E3621 mov edi,edi
771E3623 push ebp
771E3624 mov ebp,esp
771E3626 mov eax,dword ptr fs:[00000018h]
771E362C mov eax,dword ptr [eax+30h]
771E362F push esi
771E3630 push 10h
771E3632 push 0
771E3634 push dword ptr [eax+18h]
771E3637 call _RtlAllocateHeap@12 (771AE026h)
771E363C mov esi,eax
771E363E test esi,esi
771E3640 je _RtlpAddVectoredHandler@12+83h (771E36A4h)
771E3642 push ebx
771E3643 push edi
771E3644 push dword ptr [ebp+0Ch]
771E3647 mov dword ptr [esi+8],1
771E364E call _RtlEncodePointer@4 (771C0FCBh)
771E3653 mov ebx,dword ptr [ebp+10h]
771E3656 imul ebx,ebx,0Ch
771E3659 add ebx,77284724h
771E365F push ebx
771E3660 mov dword ptr [esi+0Ch],eax
771E3663 lea edi,[ebx+4]
771E3666 call _RtlAcquireSRWLockExclusive@4 (771B29F1h)
771E366B cmp dword ptr [edi],edi
771E366D jne _RtlpAddVectoredHandler@12+65h (771E3686h)
771E366F mov ecx,dword ptr fs:[18h]
771E3676 mov eax,dword ptr [ebp+10h]
771E3679 mov ecx,dword ptr [ecx+30h]
771E367C add eax,2
771E367F add ecx,28h
771E3682 lock bts dword ptr [ecx],eax
771E3686 cmp dword ptr [ebp+8],0
771E368A je _RtlpAddVectoredHandler@12+13DF3h (771F7414h)
----> Jump resolved below
771F7414 mov eax,dword ptr [edi+4]
771F7417 mov dword ptr [esi],edi
771F7419 mov dword ptr [esi+4],eax
771F741C mov dword ptr [eax],esi
771F741E mov dword ptr [edi+4],esi
771F7421 jmp _RtlpAddVectoredHandler@12+7Bh (771E369Ch)
771E3690 mov eax,dword ptr [edi]
771E3692 mov dword ptr [esi],eax
771E3694 mov dword ptr [esi+4],edi
771E3697 mov dword ptr [eax+4],esi
771E369A mov dword ptr [edi],esi
771E369C push ebx
771E369D call _RtlReleaseSRWLockExclusive@4 (771B29ABh)
771E36A2 pop edi
771E36A3 pop ebx
771E36A4 mov eax,esi
771E36A6 pop esi
771E36A7 pop ebp
771E36A8 ret 0Ch
Decoding the Assembly
Don’t mind the gratuitous highlighting above; it is there to separate the individual pieces of the function and make them more manageable to understand. The function begins by performing a call to RtlAllocateHeap, highlighted in orange (the call at 0x771E3637 and the three pushes before it). The three parameters provided are [EAX+0x18], 0, and 16 (0x10). EAX is initially loaded with the address of the TIB structure from fs:[0x18] (light pink, at 0x771E3626). From this structure, the PEB structure is then retrieved via [TIB+0x30]. The member at [PEB+0x18], which is documented as ProcessHeap, is then given to RtlAllocateHeap. Everything here seems to make sense so far.
Next, in green, comes a call to RtlEncodePointer (at 0x771E364E), which is the implementation of EncodePointer. The address of the vectored handler, at [EBP+0xC], is given as the argument here. This function, as its name implies, is responsible for encoding the provided pointer. It does this by performing an XOR with a cookie value generated at runtime.
Recall that the allocation size requested from RtlAllocateHeap was 16 bytes (0x10). The next few instructions give some information about how the returned memory is accessed. The instructions in black (at 0x771E3647 and 0x771E3660) move two values into this memory region, one at +0x8 and one at +0xC. Given this information, it is safe to assume that what is being allocated is a 16-byte struct. The third field at +0x8 is always set to 1 in this function, and the fourth at +0xC is set to hold the encoded handler address. It’s possible to write out a basic definition for this struct at this point:
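The XOR idea can be illustrated with a small model. This is only a sketch: the cookie value and the model_ names are made up for the example, and the real RtlEncodePointer may apply further transformations that are not modeled here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical cookie; Windows generates the real value at runtime. */
static const uintptr_t model_cookie = (uintptr_t)0x5A17C0DEu;

/* Encode by XOR-ing the pointer value with the cookie. */
uintptr_t model_encode_pointer(uintptr_t ptr)
{
    return ptr ^ model_cookie;
}

/* XOR is its own inverse, so decoding applies the same operation. */
uintptr_t model_decode_pointer(uintptr_t encoded)
{
    uintptr_t ptr = encoded ^ model_cookie;
    assert(model_encode_pointer(ptr) == encoded); /* round-trip sanity check */
    return ptr;
}
```

The point of the scheme is that the value stored in memory is not directly usable as a code address unless the cookie is known.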
struct MysteryStruct {
    DWORD dwUnknown1;                             // +0x0
    DWORD dwUnknown2;                             // +0x4
    DWORD dwAlwaysOne;                            // +0x8
    PVECTORED_EXCEPTION_HANDLER pVectoredHandler; // +0xC
};
This definition will be revisited and completed later.
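The allocation and the two writes seen so far can be sketched in portable C. This is a model under stated assumptions, not the real code: malloc stands in for RtlAllocateHeap(ProcessHeap, 0, 0x10), every field is a uint32_t so that the layout matches x86 on any host, and the model_ names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* x86 model of the 16-byte block: every field is 4 bytes wide. */
typedef struct model_mystery {
    uint32_t unknown1;        /* +0x0, not written yet at this point */
    uint32_t unknown2;        /* +0x4, not written yet at this point */
    uint32_t always_one;      /* +0x8 */
    uint32_t encoded_handler; /* +0xC */
} model_mystery;

static_assert(sizeof(model_mystery) == 0x10,
              "must match the 'push 10h' allocation size");

/* malloc is an assumption standing in for RtlAllocateHeap. */
model_mystery *model_alloc_entry(uint32_t encoded_handler)
{
    model_mystery *entry = malloc(sizeof *entry);
    if (entry == NULL)                        /* test esi,esi / je          */
        return NULL;
    entry->always_one = 1;                    /* mov dword ptr [esi+8],1    */
    entry->encoded_handler = encoded_handler; /* mov dword ptr [esi+0Ch],eax */
    return entry;
}
```

As in the listing, a failed allocation simply results in a NULL return value.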
The next block of code, in teal (starting at 0x771E3653), performs some arithmetic operations. It loads [EBP+0x10], which was shown to always be 0 (from _RtlAddVectoredExceptionHandler), into EBX. This value is multiplied by 12 (0xC), which still yields zero. Then the value 0x77284724 is added to it. Checking what resides at this address in a debugger shows something interesting:
_LdrpVectorHandlerList:
77284724 01 00          add dword ptr [eax],eax
77284726 00 00          add byte ptr [eax],al
77284728 28 47 28       sub byte ptr [edi+28h],al
7728472B 77 28          ja _RtlpProcessHeapsListBuffer+15h (77284755h)
7728472D 47             inc edi
7728472E 28 77 00       sub byte ptr [edi],dh
...
It turns out that 0x77284724 is the address of the symbol _LdrpVectorHandlerList. The nonsense assembly instructions there are simply the disassembler’s interpretation of the bytes that make up _LdrpVectorHandlerList’s data members. The base of this structure is used as the argument for _RtlAcquireSRWLockExclusive, which is the implementation of AcquireSRWLockExclusive. This function takes a PSRWLOCK argument. Given this, it is immediately possible to deduce that the first member of _LdrpVectorHandlerList is an SRWLOCK structure. More about this structure will be revealed later.
The code in bright pink begins by loading the address of the second field of the _LdrpVectorHandlerList structure into EDI (lea edi,[ebx+4] at 0x771E3663). The value stored there is then compared to that address, basically a check of whether a pointer is pointing to itself. If that is the case, the rest of the pink block is executed. The code once again retrieves the PEB structure, as in the light pink block. Except this time, [PEB+0x28] is the value that ends up being used. Additionally, it loads [EBP+0x10] (always 0) into EAX and adds 2 to it. There is then an atomic bit test and set instruction (lock bts) that sets bit 2 of [PEB+0x28]. [PEB+0x28] has been documented as “CrossProcessFlags” and is a bit of a mystery in the context of this function.
Lastly, the block in red (starting at 0x771E3686) is where the actual interesting code happens. It begins by checking whether the first parameter to the function, the flag saying whether the handler is to be the first or last in the chain, is zero. In either case, the instructions move a lot of pointers around. Since this implements a list of exception handlers, one would guess that there are pointers to next/previous nodes. Let’s begin by investigating the case where an exception handler is added to the front of the chain (the FirstHandler parameter does not equal 0). Starting at 0x771E3690, [EDI] is moved into EAX. From earlier, [EDI] holds the second member of the _LdrpVectorHandlerList structure. This is then moved into [ESI], which is the first member of the structure allocated with RtlAllocateHeap (MysteryStruct above). Then EDI (not dereferenced) is moved into [ESI+0x4].
This completes finding references to the allocated structure. RtlAllocateHeap had a request for 16 bytes, and all 16 bytes have now been written to. ESI is then moved into [EAX+4] and [EDI], which relate to two pointers in _LdrpVectorHandlerList. The part where the handler is added to the back of the list won’t be covered in this post, since it’s basically the same thing except for which pointers get rearranged.
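To make the address arithmetic and the dump easier to read, here is a small sketch that recomputes the address and reinterprets the dumped bytes as little-endian dwords. The constants and bytes are copied from the listings above; the model_ names are hypothetical, and the dump only reflects the debugging session in which it was taken.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_LIST_BASE 0x77284724u /* add ebx,77284724h */
#define MODEL_LIST_SIZE 0x0Cu       /* imul ebx,ebx,0Ch  */

/* First 12 bytes at _LdrpVectorHandlerList, copied from the dump. */
static const uint8_t model_list_dump[12] = {
    0x01, 0x00, 0x00, 0x00,
    0x28, 0x47, 0x28, 0x77,
    0x28, 0x47, 0x28, 0x77
};

/* Address computed by the teal block for a given third argument. */
uint32_t model_list_address(uint32_t list_type)
{
    return MODEL_LIST_BASE + list_type * MODEL_LIST_SIZE;
}

/* Read one little-endian dword out of the dump at a byte offset. */
uint32_t model_dump_dword(size_t offset)
{
    assert(offset + 4 <= sizeof model_list_dump);
    return (uint32_t)model_list_dump[offset]
         | ((uint32_t)model_list_dump[offset + 1] << 8)
         | ((uint32_t)model_list_dump[offset + 2] << 16)
         | ((uint32_t)model_list_dump[offset + 3] << 24);
}
```

Read this way, the dwords at +0x4 and +0x8 both hold 0x77284728, which is the address of the +0x4 member itself.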
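The self-pointer check and the atomic bit set can be modeled in portable C11. This is a sketch under assumptions: a plain atomic variable stands in for the dword at [PEB+0x28], atomic_fetch_or stands in for lock bts, and the model_ names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Simplified list header: a lock followed by two links. */
typedef struct model_veh_list {
    uintptr_t lock;  /* stands in for the first member  */
    void *link1;     /* +0x4 on x86 */
    void *link2;     /* +0x8 on x86 */
} model_veh_list;

/* Stands in for the dword at [PEB+0x28]. */
static atomic_uint model_cross_process_flags;

void model_mark_if_self_pointing(model_veh_list *list, unsigned list_type)
{
    assert(list_type + 2 < 32);
    /* cmp dword ptr [edi],edi: does the link point at its own address? */
    if (list->link1 == (void *)&list->link1) {
        /* lock bts dword ptr [ecx],eax with eax = list_type + 2 */
        atomic_fetch_or(&model_cross_process_flags, 1u << (list_type + 2));
    }
}
```

With the third argument fixed at 0, the only bit this can set is bit 2, so the modeled flags value gains 0x4.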
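The four moves of the front-insertion path can be written out as a C sketch. It assumes that the two links of the list and the first two fields of each entry share the same two-pointer layout, and that an untouched list points at itself, as the earlier check suggests. The model_ names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Two-pointer node shared by the list header and every entry. */
typedef struct model_node {
    struct model_node *link1; /* +0x0 */
    struct model_node *link2; /* +0x4 on x86 */
} model_node;

/* Mirrors 0x771E3690..0x771E369A, with EDI = head and ESI = entry. */
void model_insert_front(model_node *head, model_node *entry)
{
    model_node *old_first = head->link1; /* mov eax,dword ptr [edi]   */
    assert(old_first != NULL);
    entry->link1 = old_first;            /* mov dword ptr [esi],eax   */
    entry->link2 = head;                 /* mov dword ptr [esi+4],edi */
    old_first->link2 = entry;            /* mov dword ptr [eax+4],esi */
    head->link1 = entry;                 /* mov dword ptr [edi],esi   */
}
```

When the list is still self-pointing, old_first is the head itself, so both of the head's links end up pointing at the new entry.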
Finalizing Structure Definitions
Going through the code revealed two main structures at work here. There is the 16-byte structure that was allocated in the beginning and the _LdrpVectorHandlerList structure. The MysteryStruct from earlier can be better defined now. I’ve renamed it to _LdrpVectorHandlerEntry to be consistent with the known _LdrpVectorHandlerList symbol.
typedef struct _LdrpVectorHandlerEntry {
    struct _LdrpVectorHandlerEntry *pLink1;       // +0x0
    struct _LdrpVectorHandlerEntry *pLink2;       // +0x4
    DWORD dwAlwaysOne;                            // +0x8
    PVECTORED_EXCEPTION_HANDLER pVectoredHandler; // +0xC
} VECTORED_HANDLER_ENTRY, *PVECTORED_HANDLER_ENTRY;
Also, from studying the pointer swapping operations between the new entry and the list, it is possible to define _LdrpVectorHandlerList a bit more clearly as well:
typedef struct _LdrpVectorHandlerList {
    SRWLOCK srwLock;                // +0x0
    VECTORED_HANDLER_ENTRY *pLink1; // +0x4
    VECTORED_HANDLER_ENTRY *pLink2; // +0x8
} VECTORED_HANDLER_LIST, *PVECTORED_HANDLER_LIST; // +0xC (end of structure)
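One way to double check these offsets is to let the compiler verify them. The sketch below replaces every pointer-sized member with a uint32_t so that it models the x86 layout regardless of the host; the model_ names are hypothetical, and the 4-byte lock is an assumption based on the +0x4 offset of the member that follows it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* x86 model of the entry: pointers are 4 bytes wide. */
typedef struct model_entry32 {
    uint32_t link1;           /* +0x0 */
    uint32_t link2;           /* +0x4 */
    uint32_t always_one;      /* +0x8 */
    uint32_t encoded_handler; /* +0xC */
} model_entry32;

/* x86 model of the list header. */
typedef struct model_list32 {
    uint32_t srw_lock; /* +0x0 */
    uint32_t link1;    /* +0x4 */
    uint32_t link2;    /* +0x8 */
} model_list32;

static_assert(sizeof(model_entry32) == 0x10, "entry is 16 bytes");
static_assert(offsetof(model_entry32, always_one) == 0x08, "third field");
static_assert(offsetof(model_entry32, encoded_handler) == 0x0C, "fourth field");
static_assert(sizeof(model_list32) == 0x0C, "matches imul ebx,ebx,0Ch");
static_assert(offsetof(model_list32, link1) == 0x04, "matches lea edi,[ebx+4]");
```

The 12-byte size of the list model lines up with the multiplier used in the address computation, and the +0x4 offset lines up with the address loaded into EDI.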
The types in these structures have been defined. The next part of this series will cover how the links behave.