RCE Endeavors 😅

April 30, 2014

Messing with MSN Internet Games (1/2)

Filed under: Game Hacking,General x86-64,Reverse Engineering — admin @ 12:08 AM

This post covers the fun endeavor of reverse engineering the default MSN Internet Games that come with most “Professional” and higher versions of Windows (although discontinued from Windows 8 onwards), namely the common protocol shared by Internet Backgammon, Internet Checkers, and Internet Spades.

Upon launching the game and connecting with another player, the first thing to do is to check what port everything is running on. In this case, it was port 443, which is the port most commonly used for SSL. This has the advantage of giving away a known protocol, but the disadvantage of not being able to read/modify any of the outgoing data. It can also mean that there is a custom protocol that is encrypted and has an SSL layer added on top before going out, but fortunately that is not the case here (spoilers).

Starting Point

Since SSL makes up part of the network code, the most logical place to start is in the modules which carry out that work: ncrypt.dll and bcrypt.dll. The prime target here is the SslEncryptPacket function. Presumably, this function will be called somewhere in the chain leading up to the packet leaving the client. Per MSDN, two of the parameters for the function are:

pbInput [in]

    A pointer to the buffer that contains the packet to be encrypted.
cbInput [in]

    The length, in bytes, of the pbInput buffer.

If we can intercept the function call and inspect those parameters, there is a chance of being able to view the data that is leaving the client. If not, then inspecting further down the call stack will eventually lead to the plaintext anyway. There is also a corresponding SslDecryptPacket function which will serve as a starting point to getting and inspecting server responses.

The Plan

The plan of action is pretty straightforward.

  • Get into the address space of the target executable. This will be done through a simple DLL injection.
  • Find the target function for encrypting data (SslEncryptPacket) and decrypting data (follow call from SslDecryptPacket down).
  • Install hooks on these two functions. The chosen method will be through memory breakpoints.
  • Inspect the contents of incoming and outgoing messages in plaintext. Determine the protocol and begin messing with it.

The first step won’t be covered here due to the hundreds of different DLL injection tutorials/guides/tools already out there. The code in the injected DLL will be a pretty direct translation of the above steps. Something akin to the code below:

int APIENTRY DllMain(HMODULE hModule, DWORD dwReason, LPVOID lpReserved)
{
    switch(dwReason)
    {
    case DLL_PROCESS_ATTACH:
        (void)DisableThreadLibraryCalls(hModule);
        if(AllocConsole())
        {
            freopen("CONOUT$", "w", stdout);
            SetConsoleTitle(L"Console");
            SetConsoleTextAttribute(GetStdHandle(STD_OUTPUT_HANDLE), FOREGROUND_RED | FOREGROUND_GREEN | FOREGROUND_BLUE);
            printf("DLL loaded.\n");
        }
        if(GetFunctions())
        {
            pExceptionHandler = AddVectoredExceptionHandler(TRUE, VectoredHandler);
            if(SetBreakpoints())
            {
                printf("BCryptHashData: %p\n"
                    "SslEncryptPacket: %p\n",
                    (void *)BCryptHashDataFnc, (void *)SslEncryptPacketFnc);
            }
            else
            {
                printf("Could not set initial breakpoints.\n");
            }
        }
        break;
 
    case DLL_PROCESS_DETACH:
        //Clean up here usually
        break;
 
    case DLL_THREAD_ATTACH:
        break;
 
    case DLL_THREAD_DETACH:
        break;
    }
 
    return TRUE;
}

A “debug console” instance is created to save effort on having to attach a debugger in each testing instance. Pointers to the desired functions are then retrieved through the GetFunctions() function, and lastly memory breakpoints are installed on the two functions (encryption/decryption) to monitor the data being passed to them. For those wondering where BCryptHashData came from, it was traced down from SslDecryptPacket. It is actually called on both encryption/decryption, but will serve as the point of monitoring received messages from the server (in this post at least).

The second step is very easy and straightforward. By injecting a DLL into the process, we have full access to the process address space, and it is a simple matter of calling GetProcAddress on the desired target functions. This is basic WinAPI usage.

FARPROC WINAPI GetExport(const HMODULE hModule, const char *pName)
{
    FARPROC pRetProc = (FARPROC)GetProcAddress(hModule, pName);
    if(pRetProc == NULL)
    {
        printf("Could not get address of %s. Last error = %X\n", pName, GetLastError());
    }
 
    return pRetProc;
}
 
const bool GetFunctions(void)
{
    HMODULE hBCryptDll = GetModuleHandle(L"bcrypt.dll");
    HMODULE hNCryptDll = GetModuleHandle(L"ncrypt.dll");
    if(hBCryptDll == NULL)
    {
        printf("Could not get handle to Bcrypt.dll. Last error = %X\n", GetLastError());
        return false;
    }
    if(hNCryptDll == NULL)
    {
        printf("Could not get handle to Ncrypt.dll. Last error = %X\n", GetLastError());
        return false;
    }
    printf("Module handle: %p\n", (void *)hBCryptDll);
 
    BCryptHashDataFnc = (pBCryptHashData)GetExport(hBCryptDll, "BCryptHashData");
    SslEncryptPacketFnc = (pSslEncryptPacket)GetExport(hNCryptDll, "SslEncryptPacket");
 
    return ((BCryptHashDataFnc != NULL) && (SslEncryptPacketFnc != NULL));
}

Installing the hooks (via memory breakpoints) is just an adaptation of the previous post on it. The code looks as follows:

const bool AddBreakpoint(void *pAddress)
{
    SIZE_T dwSuccess = 0;
 
    MEMORY_BASIC_INFORMATION memInfo = { 0 };
    dwSuccess = VirtualQuery(pAddress, &memInfo, sizeof(MEMORY_BASIC_INFORMATION));
    if(dwSuccess == 0)
    {
        printf("VirtualQuery failed on %p. Last error = %X\n", pAddress, GetLastError());
        return false;
    }
 
    DWORD dwOldProtections = 0;
    dwSuccess = VirtualProtect(pAddress, sizeof(DWORD_PTR), memInfo.Protect | PAGE_GUARD, &dwOldProtections);
    if(dwSuccess == 0)
    {
        printf("VirtualProtect failed on %p. Last error = %X\n", pAddress, GetLastError());
        return false;
    }
 
    return true;
}
 
const bool SetBreakpoints(void)
{
    bool bRet = AddBreakpoint(BCryptHashDataFnc);
    bRet &= AddBreakpoint(SslEncryptPacketFnc);
 
    return bRet;
}
 
LONG CALLBACK VectoredHandler(EXCEPTION_POINTERS *pExceptionInfo)
{
    if(pExceptionInfo->ExceptionRecord->ExceptionCode == STATUS_GUARD_PAGE_VIOLATION)
    {        
        pExceptionInfo->ContextRecord->EFlags |= 0x100;
 
        DWORD_PTR dwExceptionAddress = (DWORD_PTR)pExceptionInfo->ExceptionRecord->ExceptionAddress;
        CONTEXT *pContext = pExceptionInfo->ContextRecord;
 
        if(dwExceptionAddress == (DWORD_PTR)SslEncryptPacketFnc)
        {
            DWORD_PTR *pdwParametersBase = (DWORD_PTR *)(pContext->Rsp + 0x28);
            SslEncryptPacketHook((NCRYPT_PROV_HANDLE)pContext->Rcx, (NCRYPT_KEY_HANDLE)pContext->Rdx, (PBYTE *)pContext->R8, (DWORD)pContext->R9,
                (PBYTE)(*(pdwParametersBase)), (DWORD)(*(pdwParametersBase + 1)), (DWORD *)(*(pdwParametersBase + 2)), (ULONGLONG)(*(pdwParametersBase + 3)),
                (DWORD)(*(pdwParametersBase + 4)), (DWORD)(*(pdwParametersBase + 5)));
        }
        else if(dwExceptionAddress == (DWORD_PTR)BCryptHashDataFnc)
        {
            BCryptHashDataHook((BCRYPT_HASH_HANDLE)pContext->Rcx, (PUCHAR)pContext->Rdx, (ULONG)pContext->R8, (ULONG)pContext->R9);
        }
 
        return EXCEPTION_CONTINUE_EXECUTION;
    }
 
    if(pExceptionInfo->ExceptionRecord->ExceptionCode == STATUS_SINGLE_STEP)
    {
        (void)SetBreakpoints();
        return EXCEPTION_CONTINUE_EXECUTION;
    }
    return EXCEPTION_CONTINUE_SEARCH;
}

Memory breakpoints will be set on the memory pages that SslEncryptPacket and BCryptHashData are on. When these are hit, a STATUS_GUARD_PAGE_VIOLATION will be raised and caught by the topmost vectored exception handler that the injected DLL installed upon load. The exception address will be checked against the two desired target addresses (SslEncryptPacket/BCryptHashData) and an inspection function will be called. In this case it will just echo the contents of the plaintext data buffers out to the debug console instance. The single-step flag will be set so the program can continue execution by one instruction before raising a STATUS_SINGLE_STEP exception, upon which the memory breakpoints will be reinstalled (since guard page flags are cleared after the page gets accessed). For a more in-depth explanation, see the linked post related to memory breakpoints posted before on this blog.

The x64 ABI (on Windows) stores the first four parameters in RCX, RDX, R8, and R9 respectively, and the rest on the stack. There is no need to worry about locating any extra parameters in the case of BCryptHashData, which only takes four. However, SslEncryptPacket takes ten parameters, so there are another six to locate. In this case, there is no reason to care beyond the fourth parameter, but all of them are passed in for the sake of completeness. The base of the parameters on the stack was found by looking at how the function is called and verifying with a debugger during runtime.
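As a rough illustration of where the stack parameters sit, the sketch below computes the offset of the nth parameter relative to RSP at the first instruction of a function (return address, then 32 bytes of shadow space, then parameter five onwards) and reads it from a snapshot of the stack. The helper names stack_param_offset and read_stack_param are made up for this example and are not part of the injected DLL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Offset of parameter n (1-based) from RSP at function entry on Windows x64.
   [RSP] holds the return address and the next 32 bytes are the shadow space
   for the four register parameters, so parameter 5 sits at RSP + 0x28.
   For n <= 4 this is the shadow slot, which the callee may not have filled. */
static size_t stack_param_offset(unsigned n)
{
    assert(n >= 1);
    return 8 + 8 * (size_t)(n - 1);
}

/* Reads parameter n from a snapshot of the stack that begins at RSP. */
static uint64_t read_stack_param(const uint64_t *rsp, unsigned n)
{
    assert(rsp != NULL);
    return rsp[stack_param_offset(n) / sizeof(uint64_t)];
}
```

With this layout, pdwParametersBase in the handler corresponds to stack_param_offset(5), and pbOutput through dwFlags are parameters five through ten.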

The “hook” code, as mentioned above, will just print out the data buffers. The implementation is given below:

void WINAPI BCryptHashDataHook(BCRYPT_HASH_HANDLE hHash, PUCHAR pbInput, ULONG cbInput, ULONG dwFlags)
{
    printf("--- BCryptHashData ---\n"
        "Input: %.*s\n",
        cbInput, pbInput);
}
 
void WINAPI SslEncryptPacketHook(NCRYPT_PROV_HANDLE hSslProvider, NCRYPT_KEY_HANDLE hKey, PBYTE *pbInput, DWORD cbInput,
                              PBYTE pbOutput, DWORD cbOutput, DWORD *pcbResult, ULONGLONG SequenceNumber, DWORD dwContentType, DWORD dwFlags)
{
    printf("--- SslEncryptPacket ---\n"
        "Input: %.*s\n",
        cbInput, pbInput);
}
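One caveat with printing via %.*s is that output stops at the first NUL byte and control characters go to the console as-is. A possible variation, sketched below with a hypothetical helper named format_printable, copies the buffer while replacing non-printable bytes with '.' so the whole packet stays visible.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Copies up to len bytes of data into out as a NUL-terminated string,
   replacing every non-printable byte with '.'. Output is truncated to fit
   out_size. Returns the number of characters written (excluding the NUL). */
static size_t format_printable(const unsigned char *data, size_t len,
                               char *out, size_t out_size)
{
    size_t i, n;

    assert(data != NULL && out != NULL && out_size > 0);
    n = (len < out_size - 1) ? len : out_size - 1;
    for (i = 0; i < n; ++i) {
        out[i] = isprint(data[i]) ? (char)data[i] : '.';
    }
    out[n] = '\0';
    return n;
}
```

Line breaks inside the XML are also turned into '.', which is a trade-off of this simple version; a hook could format the buffer this way and then print the result with a plain %s.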

What Does It Look Like?

After everything is completed, it is time to inspect the protocol. Below are some selected packet logs from a session of Checkers.

STATE {some large uuid}
Length: 0x000003CD
 
<?xml version="1.0"?>
<StateMessage xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="h
ttp://www.w3.org/2001/XMLSchema" xsi:type="StateMessageEx" xmlns="http://zone.ms
n.com/stadium/wincheckers/">
  <nSeq>4</nSeq>
  <nRole>0</nRole>
  <eStatus>Ready</eStatus>
  <nTimestamp>578</nTimestamp>
  <sMode>normal</sMode>
  <arTags>
    <Tag>
      <id>chatbyid</id>
      <oValue xsi:type="ChatTag">
        <UserID>numeric user id</UserID>
        <Nickname>numeric nickname</Nickname>
        <Text>SYSTEM_ENTER</Text>
        <FontFace>MS Shell Dlg</FontFace>
        <FontFlags>0</FontFlags>
        <FontColor>255</FontColor>
        <FontCharSet>1</FontCharSet>
        <MessageFlags>2</MessageFlags>
      </oValue>
    </Tag>
    <Tag>
      <id>STag</id>
      <oValue xsi:type="STag">
        <MsgID>StartCountDownTimer</MsgID>
        <MsgIDSbKy />
        <MsgD>0</MsgD>
      </oValue>
    </Tag>
  </arTags>
</StateMessage>
 
STATE {some large uuid}
Length: 0x000006D1
 
<?xml version="1.0"?>
<StateMessage xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="h
ttp://www.w3.org/2001/XMLSchema" xsi:type="StateMessageEx" xmlns="http://zone.ms
n.com/stadium/wincheckers/">
  <nSeq>5</nSeq>
  <nRole>0</nRole>
  <eStatus>Ready</eStatus>
  <nTimestamp>2234</nTimestamp>
  <sMode>normal</sMode>
  <arTags>
    <Tag>
      <id>STag</id>
      <oValue xsi:type="STag">
        <MsgID>FrameworkUpdate</MsgID>
        <MsgIDSbKy />
        <MsgD>&lt;D&gt;&lt;StgSet&gt;&lt;SeatCnt&gt;2&lt;/SeatCnt&gt;&lt;GameT&g
t;AUTOMATCH&lt;/GameT&gt;&lt;AILvls&gt;2&lt;/AILvls&gt;&lt;GameM&gt;INIT_GAME&lt
;/GameM&gt;&lt;Start&gt;True&lt;/Start&gt;&lt;PMatch&gt;False&lt;/PMatch&gt;&lt;
ShowTeam&gt;False&lt;/ShowTeam&gt;&lt;/StgSet&gt;&lt;/D&gt;</MsgD>
      </oValue>
    </Tag>
    <Tag>
      <id>STag</id>
      <oValue xsi:type="STag">
        <MsgID>GameInit</MsgID>
        <MsgIDSbKy>GameInit</MsgIDSbKy>
        <MsgD>&lt;GameInit&gt;&lt;Role&gt;0&lt;/Role&gt;&lt;Players&gt;&lt;Playe
r&gt;&lt;Role&gt;0&lt;/Role&gt;&lt;Name&gt;8201314a      01&lt;/Name&gt;&lt;Type
&gt;Human&lt;/Type&gt;&lt;/Player&gt;&lt;Player&gt;&lt;Role&gt;1&lt;/Role&gt;&lt
;Name&gt;1d220e29      01&lt;/Name&gt;&lt;Type&gt;Human&lt;/Type&gt;&lt;/Player&
gt;&lt;/Players&gt;&lt;Board&gt;&lt;Row&gt;0,1,0,1,0,1,0,1&lt;/Row&gt;&lt;Row&gt
;1,0,1,0,1,0,1,0&lt;/Row&gt;&lt;Row&gt;0,1,0,1,0,1,0,1&lt;/Row&gt;&lt;Row&gt;0,0
,0,0,0,0,0,0&lt;/Row&gt;&lt;Row&gt;0,0,0,0,0,0,0,0&lt;/Row&gt;&lt;Row&gt;3,0,3,0
,3,0,3,0&lt;/Row&gt;&lt;Row&gt;0,3,0,3,0,3,0,3&lt;/Row&gt;&lt;Row&gt;3,0,3,0,3,0
,3,0&lt;/Row&gt;&lt;/Board&gt;&lt;GameType&gt;Standard&lt;/GameType&gt;&lt;/Game
Init&gt;</MsgD>
      </oValue>
    </Tag>
  </arTags>
</StateMessage>
 
CALL EventSend messageID=EventSend&XMLDataString=%3CMessage%3E%3CMove%3E%
3CSource%3E%3CX%3E6%3C/X%3E%3CY%3E5%3C/Y%3E%3C/Source%3E%3CTarget%3E%3CX%3E7%3C/
X%3E%3CY%3E4%3C/Y%3E%3C/Target%3E%3C/Move%3E%3C/Message%3E
 
CALL EventSend messageID=EventSend&XMLDataString=%3CMessage%3E%3CGameMana
gement%3E%3CMethod%3EResignGiven%3C/Method%3E%3C/GameManagement%3E%3C/Message%3E

The protocol basically screams XML-RPC. It appears that the entire state of the game is initialized and carried out over these XML messages. From a security perspective, it also presents an interesting target to fuzz, given the large variety of fields present within these messages, and the presence of a length field in the message.
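The CALL EventSend lines carry their XML payload percent-encoded in the XMLDataString field. As a small sketch of how such a payload could be made readable, the function below decodes %XX escapes. The names percent_decode and hex_value are invented for this example, and it makes no attempt to handle '+' or other form-encoding rules, since it is unknown whether the protocol uses them.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns the value of a hex digit, or -1 if c is not one. */
static int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Decodes %XX escapes from in into out (NUL-terminated). Malformed escapes
   are copied through unchanged. Returns the decoded length, or (size_t)-1
   if out is too small. */
static size_t percent_decode(const char *in, char *out, size_t out_size)
{
    size_t n = 0;

    assert(in != NULL && out != NULL);
    while (*in != '\0') {
        char c = *in++;
        if (c == '%' && hex_value(in[0]) >= 0 && hex_value(in[1]) >= 0) {
            c = (char)(hex_value(in[0]) * 16 + hex_value(in[1]));
            in += 2;
        }
        if (n + 1 >= out_size) return (size_t)-1;
        out[n++] = c;
    }
    if (out_size == 0) return (size_t)-1;
    out[n] = '\0';
    return n;
}
```

Running the Move message from the log through this yields plain XML with Source and Target coordinates, which is the form one would want before attempting to modify a move.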

Some Issues With This Approach

There are some issues with this approach. Firstly, ncrypt.dll and bcrypt.dll are delay loaded, so our DLL will have to be injected after a multiplayer session starts, or there will have to be some polling loop introduced to check whether these two DLLs have loaded. This is ugly and there is a much better way to get around this that will be talked about in the next post. Secondly, BCryptHashData is used for both incoming and outgoing messages. This makes it more difficult if we wish to mess with these messages as there will have to be logic added to distinguish between client and server messages. This will also be resolved in the next post.

The full source code relating to this can be found here.

December 6, 2013

Calling Undocumented APIs in the Windows Kernel

Filed under: General x86,General x86-64,Reverse Engineering — admin @ 7:51 PM

Background

This post takes a different approach from the others and delves into the world of the Windows kernel. Specifically, it will cover how to access the undocumented APIs that are present within the kernel (ntoskrnl). If you trace a Windows API call from usermode to the kernel, you will find the endpoint to be something similar to what is shown below (Win 8 x64):

public NtOpenFile
NtOpenFile proc near
4C 8B D1                           mov r10, rcx
B8 31 00 00 00                     mov eax, 31h
0F 05                              syscall
C3                                 retn
NtOpenFile endp

where the r10 register holds the value of the first argument and eax holds the index into the Windows internal syscall table. A note should be made that this is specific to an x64 operating system running a native x64 application. x86 systems go through KiFastSystemCall in ntdll to invoke a syscall, and WOW64 emulation relies on making transitions from x64 to x86 and back and setting up an appropriate stack in-between. When the syscall instruction executes, the flow of code will eventually find its way to NtOpenFile in ntoskrnl. This is actually a wrapper around IopCreateFile (shown below):

public NtOpenFile
NtOpenFile proc near
4C 8B DC                            mov     r11, rsp
48 81 EC 88 00 00 00                sub     rsp, 88h
8B 84 24 B8 00 00 00                mov     eax, [rsp+88h+arg_28]
45 33 D2                            xor     r10d, r10d
4D 89 53 F0                         mov     [r11-10h], r10
C7 44 24 70 20 00 00 00             mov     [rsp+88h+var_18], 20h
45 89 53 E0                         mov     [r11-20h], r10d
4D 89 53 D8                         mov     [r11-28h], r10
45 89 53 D0                         mov     [r11-30h], r10d
45 89 53 C8                         mov     [r11-38h], r10d
4D 89 53 C0                         mov     [r11-40h], r10
89 44 24 40                         mov     [rsp+88h+var_48], eax
8B 84 24 B0 00 00 00                mov     eax, [rsp+88h+arg_20]
C7 44 24 38 01 00 00 00             mov     [rsp+88h+var_50], 1
89 44 24 30                         mov     [rsp+88h+var_58], eax
45 89 53 A0                         mov     [r11-60h], r10d
4D 89 53 98                         mov     [r11-68h], r10
E8 48 E2 FC FF                      call    IopCreateFile
48 81 C4 88 00 00 00                add     rsp, 88h
C3                                  retn
NtOpenFile endp

Again it should be noted that there was a lot of hand-waving going on here, and that the syscall instruction does not simply invoke the native kernel API, but goes through several routines responsible for setting up trap frames and performing access checks before arriving at the native API implementation.
Exported native kernel APIs for use in drivers follow a similar, but far less complex, mechanism. Every Zw* function in the kernel provides a thin wrapper around a call to the Nt* version (example shown below):

NTSTATUS __stdcall ZwOpenFile(PHANDLE FileHandle,
    ACCESS_MASK DesiredAccess,
    POBJECT_ATTRIBUTES ObjectAttributes,
    PIO_STATUS_BLOCK IoStatusBlock,
    ULONG ShareAccess,
    ULONG OpenOptions)
ZwOpenFile proc near
48 8B C4                            mov     rax, rsp
FA                                  cli
48 83 EC 10                         sub     rsp, 10h
50                                  push    rax
9C                                  pushfq
6A 10                               push    10h
48 8D 05 BD 2F 00 00                lea     rax, KiServiceLinkage
50                                  push    rax
B8 31 00 00 00                      mov     eax, 31h
E9 C2 DA FF FF                      jmp KiServiceInternal
ZwOpenFile endp

This wrapper does basic things such as set up the stack, disable kernel interrupts (cli), and preserve flags. The KiServiceLinkage function is just a small stub that executes the ret instruction immediately. I have not had a chance to reverse it to see what purpose it serves — it was never even invoked when a breakpoint was set on it. Lastly, the syscall number (0x31) is put into eax and a jump to the KiServiceInternal routine is made. This routine, among other things, is responsible for setting the correct PreviousMode and traversing the Windows syscall table (commonly referred to as the System Service Dispatch Table, or SSDT) and invoking the native Nt* version of the API.
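Both the usermode stub and the Zw* wrapper load the syscall number with a mov eax, imm32 instruction, so the index can be read straight out of the stub bytes. The sketch below does this for the usermode stub layout shown earlier (mov r10, rcx / mov eax, imm32 / syscall). The helper syscall_index_from_stub is hypothetical, and stubs on other Windows versions may be laid out differently, in which case it simply reports failure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns the syscall index encoded in a stub of the form
   4C 8B D1 (mov r10, rcx) / B8 imm32 (mov eax, imm32) / 0F 05 (syscall),
   or -1 if the bytes do not match that layout. */
static int64_t syscall_index_from_stub(const uint8_t *stub, size_t size)
{
    static const uint8_t prologue[] = { 0x4C, 0x8B, 0xD1, 0xB8 };
    static const uint8_t syscall_op[] = { 0x0F, 0x05 };
    uint32_t index;

    assert(stub != NULL);
    if (size < 10) return -1;
    if (memcmp(stub, prologue, sizeof prologue) != 0) return -1;
    if (memcmp(stub + 8, syscall_op, sizeof syscall_op) != 0) return -1;

    /* The immediate operand is stored little-endian. */
    index = (uint32_t)stub[4]
          | ((uint32_t)stub[5] << 8)
          | ((uint32_t)stub[6] << 16)
          | ((uint32_t)stub[7] << 24);
    return (int64_t)index;
}
```

Fed the NtOpenFile bytes from the first listing, this returns 0x31, matching the value the ZwOpenFile wrapper places in eax.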

Getting Access to the APIs
So what is the relevance of all of this? The answer is that even though the kernel exports a ton of APIs for kernel/driver developers, there are still plenty of other ones which provide some pretty cool functionality — ones like ZwSuspendProcess/ZwResumeProcess, ZwReadVirtualMemory/ZwWriteVirtualMemory, etc, that are not available. Getting access to those APIs is really where this post begins. Before starting, there are several clear issues that need to be resolved:

  • The base address and image size in memory of the kernel (ntoskrnl) need to be found. This is because the APIs lie somewhere within that memory region.
  • The syscalls need to be identified and there should be a generic way developed to allow us to invoke them.
  • Other issues related to using the APIs should be addressed. For example, process enumeration in the kernel in order to get a valid process handle for the target process in a ZwSuspend/ZwResume call.

Addressing these in order, the first point is relatively simple, but also relies on undocumented features. Getting the address of the kernel in memory is as simple as calling ZwQuerySystemInformation with the undocumented SystemModuleInformation value of SYSTEM_INFORMATION_CLASS. What will be returned is a SYSTEM_MODULE_INFORMATION structure containing a count of loaded modules in memory followed by a variable-length array of SYSTEM_MODULE entries. A quick note to add is that the NtInternals documentation on the structure is a bit outdated, and that the first two fields are of type ULONG_PTR instead of always a 32-bit ULONG. Finding the kernel base address and image size is simply a traversal of the SYSTEM_MODULE array and a substring search for the kernel name. The code is shown below:

PSYSTEM_MODULE GetKernelModuleInfo(VOID) {
 
    PSYSTEM_MODULE SystemModule = NULL;
    PSYSTEM_MODULE FoundModule = NULL;
    ULONG SystemInfoLength = 0; //ZwQuerySystemInformation reports the length through a PULONG
    PVOID Buffer = NULL;
    ULONG Count = 0;
    ULONG i = 0;
    ULONG j = 0;
    //Four possible kernel image names on WinXP
    CONST CHAR *KernelNames[] = { "ntoskrnl.exe", "ntkrnlmp.exe", "ntkrnlpa.exe", "ntkrpamp.exe" };
 
    //Perform error checking on the calls in actual code
    (VOID)ZwQuerySystemInformation(SystemModuleInformation, &SystemInfoLength, 0, &SystemInfoLength);
    Buffer = ExAllocatePool(NonPagedPool, SystemInfoLength);
    (VOID)ZwQuerySystemInformation(SystemModuleInformation, Buffer, SystemInfoLength, NULL);
 
    Count = ((PSYSTEM_MODULE_INFORMATION)Buffer)->ModulesCount;
    for(i = 0; i < Count; ++i) {         
        SystemModule = &((PSYSTEM_MODULE_INFORMATION)Buffer)->Modules[i];
        for(j = 0; j < sizeof(KernelNames) / sizeof(KernelNames[0]); ++j) {             
            if(strstr((LPCSTR)SystemModule->Name, KernelNames[j]) != NULL) {
                FoundModule = (PSYSTEM_MODULE)ExAllocatePool(NonPagedPool, sizeof(SYSTEM_MODULE));
                RtlCopyMemory(FoundModule, SystemModule, sizeof(SYSTEM_MODULE));
                ExFreePool(Buffer);
                return FoundModule;
             }
        }
    }
    DbgPrint("Could not find the kernel in module list\n");
    ExFreePool(Buffer); //Do not leak the module list on the failure path
    return NULL;
}

The above function will return the PSYSTEM_MODULE corresponding to information about the kernel (or NULL in the failure case). Now that the base address and image size of the kernel are known, it is possible to begin coming up with a way to invoke the undocumented syscalls.
Since all of the undocumented Zw* calls are nearly identical wrappers (with the exception of the syscall number) invoking KiSystemService, I present the generic way of invoking these calls by creating a functionally equivalent template of this in kernel memory and executing off of that. The general idea is to create a blank template such as the one shown below:

BYTE NullStub = 0xC3;
 
BYTE SyscallTemplate[] = {
    0x48, 0x8B, 0xC4,                                           /*mov rax, rsp*/
    0xFA,                                                       /*cli*/
    0x48, 0x83, 0xEC, 0x10,                                     /*sub rsp, 0x10*/
    0x50,                                                       /*push rax*/
    0x9C,                                                       /*pushfq*/
    0x6A, 0x10,                                                 /*push 0x10*/
    0x48, 0xB8, 0xAA, 0xAA, 0xAA, 0xAA, 0xAA, 0xAA, 0xAA, 0xAA, /*mov rax, NullStubAddress*/
    0x50,                                                       /*push rax*/
    0xB8, 0xBB, 0xBB, 0xBB, 0xBB,                               /*mov eax, Syscall*/
    0x68, 0xCC, 0xCC, 0xCC, 0xCC,                               /*push LowBytes*/
    0xC7, 0x44, 0x24, 0x04, 0xCC, 0xCC, 0xCC, 0xCC,             /*mov [rsp+0x4], HighBytes*/
    0xC3                                                        /*ret*/
};

in non-paged memory, patch in the correct addresses (NullStub replacing KiServiceLinkage), patch in the syscall number, then invoke KiSystemService (here done by placing the 64-bit absolute address on the stack and returning to it). Once fully patched at runtime, this data can simply be cast to the appropriate function pointer and invoked like normal. Here is the allocation and patching routine:

PVOID CreateSyscallWrapper(IN LONG Index) {
 
    PVOID Buffer = ExAllocatePool(NonPagedPool, sizeof(SyscallTemplate));
    BYTE *NullStubAddress = &NullStub;
    BYTE *NullStubAddressIndex = ((BYTE *)Buffer) + (14 * sizeof(BYTE));
    BYTE *SyscallIndex = ((BYTE *)Buffer) + (24 * sizeof(BYTE));
    BYTE *LowBytesIndex = ((BYTE *)Buffer) + (29 * sizeof(BYTE));
    BYTE *HighBytesIndex = ((BYTE *)Buffer) + (37 * sizeof(BYTE));
    ULONG LowAddressBytes = ((ULONG_PTR)KiSystemService) & 0xFFFFFFFF;
    ULONG HighAddressBytes = ((ULONG_PTR)KiSystemService >> 32);
    RtlCopyMemory(Buffer, SyscallTemplate, sizeof(SyscallTemplate));
    RtlCopyMemory(NullStubAddressIndex, (PVOID)&NullStubAddress, sizeof(BYTE *));
    RtlCopyMemory(SyscallIndex, &Index, sizeof(LONG));
    RtlCopyMemory(LowBytesIndex, &LowAddressBytes, sizeof(ULONG));
    RtlCopyMemory(HighBytesIndex, &HighAddressBytes, sizeof(ULONG));
    return Buffer;
}
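The push/mov pair at the end of the template works because push imm32 sign-extends its operand into a full 64-bit stack slot, after which the write to [rsp+0x4] replaces the upper half. The sketch below emulates those two steps on a plain integer to show that the slot ends up holding the full target address. The function name and the sample addresses are purely illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* Emulates "push LowBytes" followed by "mov dword ptr [rsp+4], HighBytes"
   and returns the 64-bit value that the final ret would jump to. */
static uint64_t emulate_push_ret_target(uint64_t target)
{
    uint32_t low = (uint32_t)(target & 0xFFFFFFFFu);
    uint32_t high = (uint32_t)(target >> 32);

    /* push imm32 sign-extends: the upper half becomes all ones
       whenever bit 31 of the immediate is set. */
    uint64_t slot = low;
    if (low & 0x80000000u) {
        slot |= 0xFFFFFFFF00000000ull;
    }

    /* The dword store at [rsp+4] overwrites only the upper four bytes. */
    slot = (slot & 0xFFFFFFFFull) | ((uint64_t)high << 32);
    return slot;
}
```

Whatever the sign extension leaves in the upper half is discarded by the second write, which is why the template can use a 32-bit push even for kernel addresses.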

Example usage of this is again shown below:

typedef NTSTATUS (NTAPI *pZwSuspendProcess)(IN HANDLE ProcessHandle);
pZwSuspendProcess ZwSuspendProcess = (pZwSuspendProcess)CreateSyscallWrapper(0x017A);
//This can then be invoked as normal, e.g, ZwSuspendProcess(x);

However, before doing that, the address of KiServiceInternal (held in the variable named KiSystemService in the code here) needs to be found so it can be properly patched in. This is, after all, partially why finding the base address of the kernel was important. This is done by scanning for the function signature through the entirety of ntoskrnl’s memory. The signature must be sufficiently long as to be unique, but preferably not so long that comparisons take a lot of time. The signature that I used for this example is shown below:

typedef VOID (*pKiSystemService)(VOID);
pKiSystemService KiSystemService;
 
NTSTATUS ResolveFunctions(IN PSYSTEM_MODULE KernelInfo) {
    CONST BYTE KiSystemServiceSignature[] =
    {
        0x48, 0x83, 0xEC, 0x08, 0x55, 0x48, 0x81, 0xEC, 0x58, 0x01,
        0x00, 0x00, 0x48, 0x8D, 0xAC, 0x24, 0x80, 0x00, 0x00, 0x00,
        0x48, 0x89, 0x9D, 0xC0, 0x00, 0x00, 0x00, 0x48, 0x89, 0xBD,
        0xC8, 0x00, 0x00, 0x00, 0x48, 0x89, 0xB5, 0xD0, 0x00, 0x00,
        0x00, 0xFB, 0x65, 0x48, 0x8B, 0x1C, 0x25, 0x88, 0x01, 0x00,
        0x00
    };
    KiSystemService = (pKiSystemService)FindFunctionInModule(KiSystemServiceSignature,
        sizeof(KiSystemServiceSignature), KernelInfo->ImageBaseAddress, KernelInfo->ImageSize);
    if(KiSystemService == NULL) {
        DbgPrint("- Could not find KiSystemService\n");
        return STATUS_UNSUCCESSFUL;
    }
    DbgPrint("+ Found KiSystemService at %p\n", KiSystemService);
    //....
}
 
...
...
 
PVOID FindFunctionInModule(IN CONST BYTE *Signature, IN ULONG SignatureSize,
    IN PVOID KernelBaseAddress, IN ULONG ImageSize) {
 
    BYTE *CurrentAddress = 0;
    ULONG i = 0;
 
    DbgPrint("+ Scanning from %p to %p\n", KernelBaseAddress, (ULONG_PTR)KernelBaseAddress + ImageSize);
    CurrentAddress = (BYTE *)KernelBaseAddress;
 
    //A signature larger than the image can never match
    if(SignatureSize == 0 || SignatureSize > ImageSize) {
        return NULL;
    }
 
    //Stop at the last offset where a full signature still fits inside the image
    for(i = 0; i <= ImageSize - SignatureSize; ++i) {
        if(RtlCompareMemory(CurrentAddress, Signature, SignatureSize) == SignatureSize) {
            DbgPrint("+ Found function at %p\n", CurrentAddress);
            return (PVOID)CurrentAddress;
        }
        ++CurrentAddress;
    }
    return NULL;
}

Once the ResolveFunctions() function executes, the CreateSyscallWrapper function is ready to be used as shown above. This will now resolve any syscall that you wish to call.
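Since the scan returns the first match, it is worth confirming that a chosen signature really is unique within the image. A minimal sketch of such a check is shown below as portable C over an ordinary byte buffer. count_signature_matches is a hypothetical helper and is not part of the driver code in this post.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Counts how many times sig occurs in image. A usable signature should
   produce exactly one match. The loop stops at the last offset where the
   whole signature still fits, so nothing past the buffer is read. */
static size_t count_signature_matches(const uint8_t *image, size_t image_size,
                                      const uint8_t *sig, size_t sig_size)
{
    size_t i, matches = 0;

    assert(image != NULL && sig != NULL);
    if (sig_size == 0 || sig_size > image_size) return 0;

    for (i = 0; i <= image_size - sig_size; ++i) {
        if (memcmp(image + i, sig, sig_size) == 0) {
            ++matches;
        }
    }
    return matches;
}
```

A short prefix such as 48 83 EC 08 (sub rsp, 8) is likely to appear in many functions, which is why the signatures in this post run to dozens of bytes.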

An Example

The code below is an example I wrote up showing how to write into the virtual address space of a target process. This process is given by name to the OpenProcess function, which retrieves the appropriate EPROCESS block corresponding to the process and opens a handle to it. This handle is then used in conjunction with the undocumented APIs associated with process manipulation (ZwSuspendProcess/ZwResumeProcess) and virtual memory manipulation (ZwProtectVirtualMemory/ZwWriteVirtualMemory). An internal undocumented function (PsGetNextProcess) is also scanned for and retrieved in order to help facilitate process enumeration. The code was written for and tested on an x86 version of Windows XP SP3 and x64 Windows 7 SP1.

#include "stdafx.h"
 
#include "Undocumented.h"
#include <wdm.h>
 
#ifdef __cplusplus
extern "C" NTSTATUS DriverEntry(IN PDRIVER_OBJECT DriverObject, IN PUNICODE_STRING  RegistryPath);
#endif
 
pPsGetProcessImageFileName PsGetProcessImageFileName;
pPsGetProcessSectionBaseAddress PsGetProcessSectionBaseAddress;
pPsGetNextProcess PsGetNextProcess;
pZwSuspendProcess ZwSuspendProcess;
pZwResumeProcess ZwResumeProcess;
pZwProtectVirtualMemory ZwProtectVirtualMemory;
pZwWriteVirtualMemory ZwWriteVirtualMemory;
pKiSystemService KiSystemService;
 
#ifdef _M_IX86
__declspec(naked) VOID SyscallTemplate(VOID) {
    __asm {
    /*B8 XX XX XX XX   */ mov eax, 0xC0DE
    /*8D 54 24 04      */ lea edx, [esp + 0x4]
    /*9C               */ pushfd
    /*6A 08            */ push 0x8
    /*FF 15 XX XX XX XX*/ call KiSystemService
    /*C2 XX XX         */ retn 0xBBBB
    }
}
#elif defined(_M_AMD64)
 
BYTE NullStub = 0xC3;
 
BYTE SyscallTemplate[] =
{
    0x48, 0x8B, 0xC4,                                           /*mov rax, rsp*/
    0xFA,                                                       /*cli*/
    0x48, 0x83, 0xEC, 0x10,                                     /*sub rsp, 0x10*/
    0x50,                                                       /*push rax*/
    0x9C,                                                       /*pushfq*/
    0x6A, 0x10,                                                 /*push 0x10*/
    0x48, 0xB8, 0xAA, 0xAA, 0xAA, 0xAA, 0xAA, 0xAA, 0xAA, 0xAA, /*mov rax, NullStubAddress*/
    0x50,                                                       /*push rax*/
    0xB8, 0xBB, 0xBB, 0xBB, 0xBB,                               /*mov eax, Syscall*/
    0x68, 0xCC, 0xCC, 0xCC, 0xCC,                               /*push LowBytes*/
    0xC7, 0x44, 0x24, 0x04, 0xCC, 0xCC, 0xCC, 0xCC,             /*mov [rsp+0x4], HighBytes*/
    0xC3                                                        /*ret*/
};
#endif
 
PVOID FindFunctionInModule(IN CONST BYTE *Signature,
    IN ULONG SignatureSize,
    IN PVOID KernelBaseAddress,
    IN ULONG ImageSize) {
 
    BYTE *CurrentAddress = 0;
    ULONG i = 0;
 
    DbgPrint("+ Scanning from %p to %p\n", KernelBaseAddress, (ULONG_PTR)KernelBaseAddress + ImageSize);
    CurrentAddress = (BYTE *)KernelBaseAddress;
 
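    //A signature larger than the image can never match
    if(SignatureSize == 0 || SignatureSize > ImageSize) {
        return NULL;
    }
 
    //Stop at the last offset where a full signature still fits inside the image
    for(i = 0; i <= ImageSize - SignatureSize; ++i) {
        if(RtlCompareMemory(CurrentAddress, Signature, SignatureSize) == SignatureSize) {
            DbgPrint("+ Found function at %p\n", CurrentAddress);
            return (PVOID)CurrentAddress;
        }
        ++CurrentAddress;
    }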
    return NULL;
}
 
NTSTATUS ResolveFunctions(IN PSYSTEM_MODULE KernelInfo) {
 
    UNICODE_STRING PsGetProcessImageFileNameStr = {0};
    UNICODE_STRING PsGetProcessSectionBaseAddressStr = {0};
#ifdef _M_IX86
    CONST BYTE PsGetNextProcessSignature[] =
    {
        0x8B, 0xFF, 0x55, 0x8B, 0xEC, 0x51, 0x83, 0x65,
        0xFC, 0x00, 0x56, 0x57, 0x64, 0xA1, 0x24, 0x01, 0x00, 0x00,
        0x8B, 0xF0, 0xFF, 0x8E, 0xD4, 0x00, 0x00, 0x00, 0xB9, 0xC0,
        0x38, 0x56, 0x80, 0xE8, 0xB4, 0xEE, 0xF6, 0xFF, 0x8B, 0x45,
        0x08, 0x85, 0xC0
    };
#elif defined(_M_AMD64)
    CONST BYTE PsGetNextProcessSignature[] =
    {
        0x48, 0x89, 0x5C, 0x24, 0x08, 0x48, 0x89, 0x6C, 0x24, 0x10,
        0x48, 0x89, 0x74, 0x24, 0x18, 0x57, 0x41, 0x54, 0x41, 0x55,
        0x41, 0x56, 0x41, 0x57, 0x48, 0x83, 0xEC, 0x20, 0x65, 0x48,
        0x8B, 0x34, 0x25, 0x88, 0x01, 0x00, 0x00, 0x45, 0x33, 0xED,
        0x48, 0x8B, 0xF9, 0x66, 0xFF, 0x8E, 0xC6, 0x01, 0x00, 0x00,
        0x4D, 0x8B, 0xE5, 0x41, 0x8B, 0xED, 0x41, 0x8D, 0x4D, 0x11,
        0x33, 0xC0,
    };
#endif
#ifdef _M_IX86
    CONST BYTE KiSystemServiceSignature[] =
    {
        0x6A, 0x00, 0x55, 0x53, 0x56, 0x57, 0x0F, 0xA0, 0xBB, 0x30,
        0x00, 0x00, 0x00, 0x66, 0x8E, 0xE3, 0x64, 0xFF, 0x35, 0x00,
        0x00, 0x00, 0x00
    };
#elif defined(_M_AMD64)
    CONST BYTE KiSystemServiceSignature[] =
    {
        0x48, 0x83, 0xEC, 0x08, 0x55, 0x48, 0x81, 0xEC, 0x58, 0x01,
        0x00, 0x00, 0x48, 0x8D, 0xAC, 0x24, 0x80, 0x00, 0x00, 0x00,
        0x48, 0x89, 0x9D, 0xC0, 0x00, 0x00, 0x00, 0x48, 0x89, 0xBD,
        0xC8, 0x00, 0x00, 0x00, 0x48, 0x89, 0xB5, 0xD0, 0x00, 0x00,
        0x00, 0xFB, 0x65, 0x48, 0x8B, 0x1C, 0x25, 0x88, 0x01, 0x00,
        0x00
    };
#endif
    RtlInitUnicodeString(&PsGetProcessImageFileNameStr, L"PsGetProcessImageFileName");
    RtlInitUnicodeString(&PsGetProcessSectionBaseAddressStr, L"PsGetProcessSectionBaseAddress");
 
    PsGetProcessImageFileName = (pPsGetProcessImageFileName)MmGetSystemRoutineAddress(&PsGetProcessImageFileNameStr);
    if(PsGetProcessImageFileName == NULL) {
        DbgPrint("- Could not find PsGetProcessImageFileName\n");
        return STATUS_UNSUCCESSFUL;
    }
    DbgPrint("+ Found PsGetProcessImageFileName at %p\n", PsGetProcessImageFileName);
 
    PsGetProcessSectionBaseAddress = (pPsGetProcessSectionBaseAddress)MmGetSystemRoutineAddress(&PsGetProcessSectionBaseAddressStr);
    if(PsGetProcessSectionBaseAddress == NULL) {
        DbgPrint("- Could not find PsGetProcessSectionBaseAddress\n");
        return STATUS_UNSUCCESSFUL;
    }
    DbgPrint("+ Found PsGetProcessSectionBaseAddress at %p\n", PsGetProcessSectionBaseAddress);
 
    PsGetNextProcess = (pPsGetNextProcess)FindFunctionInModule(PsGetNextProcessSignature,
        sizeof(PsGetNextProcessSignature), KernelInfo->ImageBaseAddress, KernelInfo->ImageSize);
    if(PsGetNextProcess == NULL) {
        DbgPrint("- Could not find PsGetNextProcess\n");
        return STATUS_UNSUCCESSFUL;
    }
    DbgPrint("+ Found PsGetNextProcess at %p\n", PsGetNextProcess);
 
    KiSystemService = (pKiSystemService)FindFunctionInModule(KiSystemServiceSignature,
        sizeof(KiSystemServiceSignature), KernelInfo->ImageBaseAddress, KernelInfo->ImageSize);
    if(KiSystemService == NULL) {
        DbgPrint("- Could not find KiSystemService\n");
        return STATUS_UNSUCCESSFUL;
    }
    DbgPrint("+ Found KiSystemService at %p\n", KiSystemService);
 
    return STATUS_SUCCESS;
}
 
VOID OnUnload(IN PDRIVER_OBJECT DriverObject) {
 
    DbgPrint("+ Unloading\n");
}
 
PSYSTEM_MODULE GetKernelModuleInfo(VOID) {
 
    PSYSTEM_MODULE SystemModule = NULL;
    PSYSTEM_MODULE FoundModule = NULL;
    ULONG_PTR SystemInfoLength = 0;
    PVOID Buffer = NULL;
    ULONG Count = 0;
    ULONG i = 0;
    ULONG j = 0;
    //Other names for WinXP
    CONST CHAR *KernelNames[] = { "ntoskrnl.exe", "ntkrnlmp.exe", "ntkrnlpa.exe", "ntkrpamp.exe" };
 
    //Perform error checking on the calls in actual code
    (VOID)ZwQuerySystemInformation(SystemModuleInformation, &SystemInfoLength, 0, &SystemInfoLength);
    Buffer = ExAllocatePool(NonPagedPool, SystemInfoLength);
    (VOID)ZwQuerySystemInformation(SystemModuleInformation, Buffer, SystemInfoLength, NULL);
 
    Count = ((PSYSTEM_MODULE_INFORMATION)Buffer)->ModulesCount;
    for(i = 0; i < Count; ++i) {
        SystemModule = &((PSYSTEM_MODULE_INFORMATION)Buffer)->Modules[i];
        for(j = 0; j < sizeof(KernelNames) / sizeof(KernelNames[0]); ++j) {
            if(strstr((LPCSTR)SystemModule->Name, KernelNames[j]) != NULL) {
                FoundModule = (PSYSTEM_MODULE)ExAllocatePool(NonPagedPool, sizeof(SYSTEM_MODULE));
                RtlCopyMemory(FoundModule, SystemModule, sizeof(SYSTEM_MODULE));
                ExFreePool(Buffer);
                return FoundModule;
            }
        }
    }
    //Release the module list on the failure path as well
    ExFreePool(Buffer);
    DbgPrint("Could not find the kernel in module list\n");
    return NULL;
}
 
PEPROCESS GetEPROCESSFromName(IN CONST CHAR *ImageName) {
 
    PEPROCESS ProcessHead = PsGetNextProcess(NULL);
    PEPROCESS Process = PsGetNextProcess(NULL);
    CHAR *ProcessName = NULL;
 
    do {
        ProcessName = PsGetProcessImageFileName(Process);
        DbgPrint("+ Currently looking at %s\n", ProcessName);
        if(strstr(ProcessName, ImageName) != NULL) {
            DbgPrint("+ Found the process -- %s\n", ProcessName);
            return Process;
        }
        Process = PsGetNextProcess(Process);
    } while(Process != NULL && Process != ProcessHead);
    //Report the name that was searched for, not the last process inspected
    DbgPrint("- Could not find %s\n", ImageName);
    return NULL;
}
 
HANDLE GetProcessIdFromEPROCESS(PEPROCESS Process) {
 
    return PsGetProcessId(Process);
}
 
HANDLE OpenProcess(IN CONST CHAR *ProcessName, OUT OPTIONAL PEPROCESS *pProcess) {
 
    HANDLE ProcessHandle = NULL;
    CLIENT_ID ClientId = {0};
    OBJECT_ATTRIBUTES ObjAttributes = {0};
    PEPROCESS EProcess = GetEPROCESSFromName(ProcessName);
    NTSTATUS Status = STATUS_UNSUCCESSFUL;
 
    if(EProcess == NULL) {
        return NULL;
    }
    InitializeObjectAttributes(&ObjAttributes, NULL, OBJ_KERNEL_HANDLE, NULL, NULL);
    ObjAttributes.ObjectName = NULL;
    ClientId.UniqueProcess = GetProcessIdFromEPROCESS(EProcess);
    ClientId.UniqueThread = NULL;
 
    Status = ZwOpenProcess(&ProcessHandle, PROCESS_ALL_ACCESS, &ObjAttributes, &ClientId);
    if(!NT_SUCCESS(Status)) {
        DbgPrint("- Could not open process %s. -- %X\n", ProcessName, Status);
        return NULL;
    }
    if(pProcess != NULL) {
        *pProcess = EProcess;
    }
    return ProcessHandle;
}
 
PVOID CreateSyscallWrapper(IN LONG Index, IN SHORT NumParameters) {
 
#ifdef _M_IX86
    SIZE_T StubLength = 0x15;
    PVOID Buffer = ExAllocatePool(NonPagedPool, StubLength);
    BYTE *SyscallIndex = ((BYTE *)Buffer) + sizeof(BYTE);
    BYTE *Retn = ((BYTE *)Buffer) + (0x13 * (sizeof(BYTE)));
    RtlCopyMemory(Buffer, SyscallTemplate, StubLength);
    NumParameters = NumParameters * sizeof(ULONG_PTR);
    RtlCopyMemory(SyscallIndex, &Index, sizeof(LONG));
    RtlCopyMemory(Retn, &NumParameters, sizeof(SHORT));
    return Buffer;
#elif defined(_M_AMD64)
    PVOID Buffer = ExAllocatePool(NonPagedPool, sizeof(SyscallTemplate));
    BYTE *NullStubAddress = &NullStub;
    BYTE *NullStubAddressIndex = ((BYTE *)Buffer) + (14 * sizeof(BYTE));
    BYTE *SyscallIndex = ((BYTE *)Buffer) + (24 * sizeof(BYTE));
    BYTE *LowBytesIndex = ((BYTE *)Buffer) + (29 * sizeof(BYTE));
    BYTE *HighBytesIndex = ((BYTE *)Buffer) + (37 * sizeof(BYTE));
    ULONG LowAddressBytes = ((ULONG_PTR)KiSystemService) & 0xFFFFFFFF;
    ULONG HighAddressBytes = ((ULONG_PTR)KiSystemService >> 32);
    RtlCopyMemory(Buffer, SyscallTemplate, sizeof(SyscallTemplate));
    RtlCopyMemory(NullStubAddressIndex, (PVOID)&NullStubAddress, sizeof(BYTE *));
    RtlCopyMemory(SyscallIndex, &Index, sizeof(LONG));
    RtlCopyMemory(LowBytesIndex, &LowAddressBytes, sizeof(ULONG));
    RtlCopyMemory(HighBytesIndex, &HighAddressBytes, sizeof(ULONG));
    return Buffer;
#endif
}
 
VOID InitializeSyscalls(VOID) {
 
#ifdef _M_IX86
    ZwSuspendProcess = (pZwSuspendProcess)CreateSyscallWrapper(0x00FD, 1);
    ZwResumeProcess = (pZwResumeProcess)CreateSyscallWrapper(0x00CD, 1);
    ZwProtectVirtualMemory = (pZwProtectVirtualMemory)CreateSyscallWrapper(0x0089, 5);
    ZwWriteVirtualMemory = (pZwWriteVirtualMemory)CreateSyscallWrapper(0x0115, 5);
#elif defined(_M_AMD64)
    ZwSuspendProcess = (pZwSuspendProcess)CreateSyscallWrapper(0x017A, 1);
    ZwResumeProcess = (pZwResumeProcess)CreateSyscallWrapper(0x0144, 1);
    ZwProtectVirtualMemory = (pZwProtectVirtualMemory)CreateSyscallWrapper(0x004D, 5);
    ZwWriteVirtualMemory = (pZwWriteVirtualMemory)CreateSyscallWrapper(0x0037, 5);
#endif
}
 
VOID FreeSyscalls(VOID) {
 
    ExFreePool(ZwSuspendProcess);
    ExFreePool(ZwResumeProcess);
    ExFreePool(ZwProtectVirtualMemory);
    ExFreePool(ZwWriteVirtualMemory);
}
 
PVOID GetProcessBaseAddress(IN PEPROCESS Process) {
 
    return PsGetProcessSectionBaseAddress(Process);
}
 
NTSTATUS WriteToProcessAddress(IN HANDLE ProcessHandle, IN PVOID BaseAddress, IN BYTE *NewBytes, IN SIZE_T NewBytesSize) {
 
    ULONG OldProtections = 0;
    SIZE_T BytesWritten = 0;
    SIZE_T NumBytesToProtect = NewBytesSize;
    NTSTATUS Status = STATUS_UNSUCCESSFUL;
 
    //Needs error checking
    Status = ZwSuspendProcess(ProcessHandle);
    Status = ZwProtectVirtualMemory(ProcessHandle, &BaseAddress, &NumBytesToProtect, PAGE_EXECUTE_READWRITE, &OldProtections);
    Status = ZwWriteVirtualMemory(ProcessHandle, BaseAddress, NewBytes, NewBytesSize, &BytesWritten);
    Status = ZwProtectVirtualMemory(ProcessHandle, &BaseAddress, &NumBytesToProtect, OldProtections, &OldProtections);
    Status = ZwResumeProcess(ProcessHandle);
 
    return STATUS_SUCCESS;
}
 
NTSTATUS DriverEntry(IN PDRIVER_OBJECT DriverObject, IN PUNICODE_STRING  RegistryPath) {
 
    PSYSTEM_MODULE KernelInfo = NULL;
    PEPROCESS Process = NULL;
    HANDLE ProcessHandle = NULL;
    PVOID BaseAddress = NULL;
    BYTE NewBytes[0x100] = {0};
    NTSTATUS Status = STATUS_UNSUCCESSFUL;
 
    DbgPrint("+ Driver successfully loaded\n");
 
    DriverObject->DriverUnload = OnUnload;
 
    KernelInfo = GetKernelModuleInfo();
    if(KernelInfo == NULL) {
        DbgPrint("Could not find kernel module\n");
        return STATUS_UNSUCCESSFUL;
    }
    DbgPrint("+ Found kernel module.\n"
        "+ Name: %s -- Base address: %p -- Size: %p\n", KernelInfo->Name,
        KernelInfo->ImageBaseAddress, KernelInfo->ImageSize);
 
    if(!NT_SUCCESS(ResolveFunctions(KernelInfo))) {
        return STATUS_UNSUCCESSFUL;
    }
 
    InitializeSyscalls();
 
    ProcessHandle = OpenProcess("notepad.exe", &Process);
    if(ProcessHandle == NULL) {
        return STATUS_UNSUCCESSFUL;
    }
    BaseAddress = GetProcessBaseAddress(Process);
    if(BaseAddress == NULL) {
        return STATUS_UNSUCCESSFUL;
    }
 
    DbgPrint("Invoking\n");
    RtlFillMemory(NewBytes, sizeof(NewBytes), 0x90);
    (VOID)WriteToProcessAddress(ProcessHandle, BaseAddress, NewBytes, sizeof(NewBytes));
    DbgPrint("+ Done\n");
 
    ExFreePool(KernelInfo);
    FreeSyscalls();
    ZwClose(ProcessHandle);
 
    return STATUS_SUCCESS;
}

May 26, 2011

Quick Post: Auto-updating with Signature Scanning

Filed under: Game Hacking,General x86,General x86-64 — admin @ 2:38 AM

One common problem with developing hacks or external modifications for games and applications is that the target application changes through patches, new versions, and so on. This can render any hardcoded offsets, structures, or functions used by the hack useless. For example, assume that the hack puts a hook on a function at 0x1234ABCD. One day, a new version of the application is released and the newly compiled version no longer has this function at 0x1234ABCD, but at some different address, 0x12345678. Now the hack no longer works and, in the worst case, even crashes the application when used. This becomes annoying because some applications are updated frequently, which in turn requires frequent updates on the part of the hack developer. Even if the updates aren’t too frequent, it can be unnecessarily inconvenient to hunt down where the structures, functions, and so on ended up.

One solution to this is called signature scanning. This technique is nothing new or special and has been used by both hack developers and anti-virus programs for many years (by anti-virus programs probably for much longer than by hacks). It relies on finding parts of a program by scanning for certain byte patterns. For example, anti-virus programs rely partially on signature scanning when they scan files, since each virus or variant can be identified by a sequence of bytes unique to it. Byte strings from scanned programs are taken and hashed. This hash is compared with known virus hashes in a database, and if there is a match then there is a good chance that the application is a virus or has been infected. This of course ignores the additional heuristics and scanning methods incorporated into anti-virus programs, but at a very basic level it is still a key component of how they all work.

This same methodology can be applied to developing game hacks or external modifications to applications in general, since functions, structures, and so on also have unique byte patterns identifying them.

Shown above is part of a function that could serve as a signature. No other function in the application performs this unique set of instructions, so as long as this function does not change (its actual code is not modified, and nothing like new optimization settings or a different compiler alters the generated instructions), it can always be identified with \x55\x8B\xEC\x51\x6A\x10\...\xD9\x59\x04 regardless of any updates or patches to the application.

However, this technique is not without its downsides: scanning an entire file or image is costly in terms of speed. Thus, it is a pretty bad idea to develop a hack that scans an image for a signature each time it is loaded, since that can slow things down a lot. What I personally do is keep an external config file that holds the signatures and the offset (RVA) into the image at which each one is located. When a hack is loaded, it reads in the config file and checks that each signature exists where it’s supposed to. If it doesn’t, the hack performs a scan on the whole image and writes the new location of the signature back into the config file. This is only one way of doing it though, so to each their own. Since the implementation is just searching for a substring, I feel that there’s really no need to put one here. Important things to note though for developing signature scanners:

  • Signature scanners should have some wildcard usage built in. Whether EAX, EBX, or another register is used to hold a temporary value is irrelevant. For example, MOV EAX, 123h as a byte string is B823010000 and MOV EBX, 123h is BB23010000. The important part of those instructions is the 123h immediate value, so the B8 or BB byte is irrelevant. The signature can then be \x??\x23\x01\x00\x00. How \x?? is treated is implementation dependent.
  • Usually the most important parts of a signature are any references to other code, structures, local variables, etc. Getting a signature containing these will increase the chance of it being found. However, references to other code are a bit dangerous since relative distances can change between new versions of a target application.
  • A signature, by definition, should be unique. Using PUSH EBP ; MOV EBP, ESP is a bad idea.
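
The points above, together with the cached-offset idea, can be sketched in a few lines of C. This is only a minimal illustration, not the scanner used for any particular hack: the function names (`match_at`, `find_signature`, `locate_signature`) and the mask-string convention (`'?'` marks a wildcard byte, any other character means "must match") are assumptions made for this example, and the "config file" is reduced to a single cached offset passed in by the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if the signature matches at p. A '?' in the mask (hypothetical
   convention) means the corresponding byte is a wildcard and is skipped. */
static int match_at(const uint8_t *p, const uint8_t *sig, const char *mask, size_t len)
{
    for (size_t i = 0; i < len; ++i) {
        if (mask[i] != '?' && p[i] != sig[i])
            return 0;
    }
    return 1;
}

/* Scans the whole image. Returns the offset of the first match,
   or (size_t)-1 if the signature is not present. */
size_t find_signature(const uint8_t *image, size_t image_len,
                      const uint8_t *sig, const char *mask)
{
    size_t len = strlen(mask);
    assert(image != NULL && sig != NULL && len > 0);
    if (len > image_len)
        return (size_t)-1;
    /* i + len <= image_len keeps every comparison inside the image */
    for (size_t i = 0; i + len <= image_len; ++i) {
        if (match_at(image + i, sig, mask, len))
            return i;
    }
    return (size_t)-1;
}

/* Checks the cached offset first (the cheap path) and only falls back
   to a full scan when the signature is no longer at that offset. */
size_t locate_signature(const uint8_t *image, size_t image_len,
                        const uint8_t *sig, const char *mask, size_t cached_offset)
{
    size_t len = strlen(mask);
    if (len > 0 && cached_offset <= image_len && len <= image_len - cached_offset &&
        match_at(image + cached_offset, sig, mask, len))
        return cached_offset;
    return find_signature(image, image_len, sig, mask);
}
```

With the signature bytes `00 23 01 00 00` and the mask `"?xxxx"`, both B823010000 and BB23010000 match, since the first byte is ignored. A caller would write the offset returned by `locate_signature` back to its config file whenever it differs from the cached one.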

A downloadable PDF of this post can be found here.

April 17, 2011

Extending External Window Functionality

Filed under: General x86,General x86-64 — admin @ 4:12 PM

An interesting thing I saw many years ago was a plug-in for an old gaming client that added a lot of functionality. Some of the changes were simple, such as modifying the resources to give the client a sleeker look. Others were more interesting, and included adding custom menus, both to the menu bar and to context menus. This was interesting since everything was contained in separate DLLs, which were loaded by the process on startup (parts of the executable were rewritten to jump to loader code). When finding out how this works, I found that it was pretty easy to do. Injected/loaded DLLs are within the address space of the process, so doing all of this is more or less the same as it would be when developing a normal GUI application with the Windows API. Unfortunately, this technique is not as widely applicable as it was back then. Custom GUI APIs and newer additions to Windows such as floating and dockable menus make this technique in its current form less useful. I wanted to put the code that I wrote several years ago on here for archive purposes. The code is old and could possibly be written to be a bit cleaner.

#include <Windows.h>
 
typedef struct _PROCESSWNDINFO {
    HWND hWnd;
    HMENU hMenuBar;
    HMENU hAddedMenu;
    LONG_PTR PrevWndProc;
} PROCESSWNDINFO, *LPPROCESSWNDINFO;
 
const DWORD MENUITEM_ID = 1234;
PROCESSWNDINFO g_WindowInfo;
 
LRESULT CALLBACK SubclassWndProc(HWND hWnd, UINT Msg, WPARAM wParam, LPARAM lParam) {
    switch(Msg) {
        case WM_COMMAND:
            //The low word of wParam holds the menu item identifier
            switch(LOWORD(wParam)) {
            case MENUITEM_ID:
                MessageBox(NULL, L"Added Handler!", L"New item", MB_ICONASTERISK);
                break;
            }
        break;
 
    default: break;
    }
    return CallWindowProc((WNDPROC)g_WindowInfo.PrevWndProc, hWnd, Msg, wParam, lParam);
}
 
BOOL CALLBACK EnumWindowProc(HWND hWnd, LPARAM processId) {
    const INT WINDOW_LENGTH = 32;
    WCHAR windowClass[WINDOW_LENGTH] = {0};
    //The buffer size is given in characters, not bytes
    GetClassName(hWnd, windowClass, WINDOW_LENGTH);
    if(wcscmp(windowClass, L"GDI+ Hook Window Class") == 0)
        return TRUE;
    DWORD windowProcessId = 0;
    (VOID)GetWindowThreadProcessId(hWnd, &windowProcessId);
    if(windowProcessId == (DWORD)processId) {
        g_WindowInfo.hWnd = hWnd;
        return FALSE;
    }
    return TRUE;
}
 
BOOL GetWindowProperties(LPPROCESSWNDINFO windowInfo) {
    EnumWindows((WNDENUMPROC)EnumWindowProc, (LPARAM)GetCurrentProcessId());
    if(windowInfo->hWnd != NULL) {
        windowInfo->hMenuBar = GetMenu(g_WindowInfo.hWnd);
        windowInfo->PrevWndProc = GetWindowLongPtr(windowInfo->hWnd, GWLP_WNDPROC);
        return TRUE;
    }
    return FALSE;
}
 
HMENU AddNewMenu(LPPROCESSWNDINFO windowInfo, LPCWSTR title) {
    HMENU hNewMenu = CreateMenu();
    if(hNewMenu != NULL && windowInfo->hMenuBar != NULL)
        if(AppendMenu(windowInfo->hMenuBar, MF_STRING | MF_POPUP, (UINT_PTR)hNewMenu, title) != 0)
            return hNewMenu;
    return NULL;
}
 
BOOL AddNewMenuItem(LPPROCESSWNDINFO windowInfo, HMENU hMenu, LPCWSTR title, const DWORD id) {
    BOOL ret = FALSE;
    if(hMenu != NULL)
        ret = AppendMenu(hMenu, MF_STRING, id, title);
    if(windowInfo->hMenuBar != NULL)
        DrawMenuBar(windowInfo->hWnd);
    return ret;
}
 
int APIENTRY DllMain(HMODULE hModule, DWORD reason, LPVOID reserved) {
    if(reason == DLL_PROCESS_ATTACH) {
        DisableThreadLibraryCalls(hModule);
        if(GetWindowProperties(&g_WindowInfo) == TRUE) {
            g_WindowInfo.hAddedMenu = AddNewMenu(&g_WindowInfo, L"Test");
            AddNewMenuItem(&g_WindowInfo, g_WindowInfo.hAddedMenu, L"Hello", MENUITEM_ID);
            g_WindowInfo.PrevWndProc = SetWindowLongPtr(g_WindowInfo.hWnd, GWLP_WNDPROC, (LONG_PTR)SubclassWndProc);
        }
    }
    return TRUE;
}

The success of the entire technique relies on GetMenu returning a valid handle to the menu. If it does not (because the menu bar is not actually a standard menu), then nothing will happen. It is still possible to append menus/menu items when the menu is not a standard menu. However, this involves reversing the application to see how menus are implemented and handled, or finding the documentation for the graphics API that is being used, if it is available. What the code above does is find the window belonging to the process that this DLL is injected into or loaded by, using the process identifier. Once the HWND is found, it is used to get the handle to the menu with

windowInfo->hMenuBar = GetMenu(g_WindowInfo.hWnd);

This handle is used to append the new menu to the menu bar (with AppendMenu). After that, any additional menu items can be appended to that added menu. SetWindowLongPtr is used to set a new window procedure, with the old one being stored so that it can be called later. The handler for the menu items can be implemented in this callback like normal, with control being passed to the original window procedure at the end. One good thing about this technique is that it is done completely through the Windows API, ensuring 32-bit and 64-bit compatibility, ignoring minor details like using SetWindowLongPtr instead of SetWindowLong. If something like this were to be done through API hooks on the window procedure of the application, then there would be the hassle of finding a 32/64-bit compatible hooking library. This code was (re)tested on Windows 7 64-bit. Screenshots for 64/32-bit test applications are shown below; notice the added “Test” menu.

64-bit Minesweeper application:

32-bit DebugView application:

Source file can be found here
A downloadable PDF of this post can be found here

January 14, 2011

Five Minute Cracking: Hardcoded Expirations

Filed under: General x86-64,Reverse Engineering — admin @ 3:18 AM

I use a specific application (which won’t be named here) quite often. One of the annoying things about this application is that it is classified as shareware and pops up a nag screen each time it is started, asking the user to register, authenticate, and/or funnel money to the developers. The application still runs fine when this nag screen is closed, probably because the developers take a very fatalistic view on cracking or want to give their application more exposure (especially since free and possibly better alternatives exist). Admittedly, the screen becomes very annoying to see time and time again. Instead of switching over to the free alternative, I wanted to see how the nag screen works. Upon installation, there is a 40 day trial period during which the user gets to use the application. After this period is up, the nag screen begins to appear each time the application is started. This made me wonder what lengths the application goes to in order to see whether the user is running an expired version. I did a search for 40 in hex, 28h, to see whether any comparisons against that value come up. Immediately something of interest popped up (string parameter blacked out since it identifies the program):
It immediately shows that the value in eax is compared against 28h and that the jump to loc_1400942AF is taken if the value is greater. loc_1400942AF is cross referenced only once in the entire application (from the jump to it above), and it pops up a dialog with DialogBoxParam. This is literally the extent of the protection scheme of the application. Filling the jg instruction with NOPs is all that it takes to defeat it. Alternatively, it is possible to remove the dialog with a resource editor, which may be the simpler method since it doesn’t require looking at x86-64 ASM. The funny thing is that the application has a slightly complex key validation algorithm, so writing a keygen for it would take a bit of time and skill; but since it is offered as a full version download (no removed features), keygenning it would be a bit pointless when you can NOP out an instruction. Why the application is designed this way remains a mystery, but it could be related to what I said above about piracy being inevitable and the developers wanting more exposure for their application versus (fully) free alternatives.
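
For readers unfamiliar with what "filling the jg instruction with NOPs" means at the byte level, here is a small sketch in C that works on a plain in-memory byte buffer. It is not taken from the application or from any tool used in this post: the function name `nop_out_jg` and the sample bytes are made up for illustration. It relies only on the standard x86/x86-64 encodings: `cmp eax, 28h` is 83 F8 28, a short jg is 7F followed by a 1-byte displacement, a near jg is 0F 8F followed by a 4-byte displacement, and NOP is 90.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Overwrites a jg instruction located at code[off] with NOPs (0x90).
   Handles the short form (7F rel8, 2 bytes) and the near form
   (0F 8F rel32, 6 bytes). Returns the number of bytes patched,
   or 0 if no complete jg instruction starts at that offset. */
size_t nop_out_jg(uint8_t *code, size_t len, size_t off)
{
    size_t n = 0;

    if (off < len && len - off >= 2 && code[off] == 0x7F)
        n = 2;  /* jg rel8 */
    else if (off < len && len - off >= 6 &&
             code[off] == 0x0F && code[off + 1] == 0x8F)
        n = 6;  /* jg rel32 */

    if (n != 0)
        memset(code + off, 0x90, n);
    return n;
}
```

Given a hypothetical buffer holding `83 F8 28 0F 8F 10 00 00 00 C3`, calling `nop_out_jg(buf, 10, 3)` replaces the six bytes of the jg with 90, so execution always falls through past the comparison instead of branching to the dialog code.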
