This quick post covers code streaming, using malware as the example. One way for malware to hide and persist on a system is to not contain any malicious code itself. Instead, it retrieves the malicious payload from an external source, such as a direct request to a web server, a Twitter/social media post, a Pastebin, or any other common mechanism. This code, usually encrypted or obfuscated in some way, is then mapped into the malicious process and executed. After execution, the memory region is cleaned up and reused or reallocated in order to carry out further malicious functionality. The code for this functionality looks pretty straightforward:
MemoryExecutor::MemoryExecutor(const size_t ulAllocSize) : m_ulAllocSize{ ulAllocSize }
{
    // Assumes m_pMemory is declared as std::unique_ptr<char[]>.
    m_pMemory = std::unique_ptr<char[]>(new char[ulAllocSize]);
}

const bool MemoryExecutor::MapToRegion(const char * const pBytes, const size_t ulSize)
{
    if (ulSize > m_ulAllocSize)
    {
        // Grow with new[] rather than realloc, which must not be mixed with new[].
        m_pMemory.reset(new (std::nothrow) char[ulSize]);
        if (m_pMemory.get() == nullptr)
        {
            m_ulAllocSize = 0;
            return false;
        }
        m_ulAllocSize = ulSize;
    }

#ifdef DEBUG
    std::memset(m_pMemory.get(), 0xCC, ulSize);
#endif

    std::memcpy(m_pMemory.get(), pBytes, ulSize);

    // VirtualProtect needs a valid pointer to receive the old protection value.
    DWORD dwOldProtect = 0;
    return BOOLIFY(VirtualProtect(m_pMemory.get(), ulSize, PAGE_EXECUTE_READWRITE, &dwOldProtect));
}

void MemoryExecutor::ExecuteRegion()
{
    using pFnc = void (*)();
    pFnc pRuntimeFunction = (pFnc)m_pMemory.get();
    pRuntimeFunction();

    // Wipe and free the buffer; release() would leak it.
    memset(m_pMemory.get(), 0, m_ulAllocSize);
    m_pMemory.reset();
    m_ulAllocSize = 0;
}
with the intention that pBytes in MapToRegion contains the malicious buffer. However, a few issues come up, such as how to make WinAPI calls from the mapped code. The three solutions to this that I’ve seen in the wild are:
- Map position-independent shellcode that traverses the DLL list and manually implements GetProcAddress. This is done by accessing the PEB structure created for each process. The PEB structure contains a pointer to a PEB_LDR_DATA structure, which in turn contains three lists: load order, memory order, and initialization order. Each list has an entry for every DLL loaded into the process, and each entry holds that DLL's base address. Once the base address of the desired DLL is obtained by traversing a list, it is possible to find its export directory and traverse the export table. For an x86 assembly implementation, see here. This technique, in a mix of x86 and C, was also used by me in demonstrating how to write a file packer here.
- Set up registers/arguments and perform the native syscall. For example, the implementation of NtTerminateProcess on x64 looks like:
NtTerminateProcess:
00007FF998AE1040 4C 8B D1 mov r10,rcx
00007FF998AE1043 B8 2B 00 00 00 mov eax,2Bh
00007FF998AE1048 0F 05 syscall
00007FF998AE104A C3 ret
where 2Bh, the immediate value loaded into eax, is the syscall number. This approach is pretty volatile because syscall numbers can change across different Windows versions.
- Get the addresses of the DLLs from within the malware, via GetModuleHandle, and fix up the addresses in the streamed code manually when it is mapped. It’s pretty sloppy, but I’ve seen it before.
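To make the point about syscall numbers concrete, here is a minimal sketch that reads the number out of a stub shaped like the NtTerminateProcess listing above. ExtractSyscallNumber and kNtTerminateProcessStub are names made up for illustration, and the byte pattern it checks is an assumption taken from that one listing. Stubs on other Windows builds may be laid out differently, in which case the function simply reports failure.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Bytes copied from the NtTerminateProcess listing above.
constexpr std::array<std::uint8_t, 11> kNtTerminateProcessStub = {
    0x4C, 0x8B, 0xD1,             // mov r10, rcx
    0xB8, 0x2B, 0x00, 0x00, 0x00, // mov eax, 2Bh
    0x0F, 0x05,                   // syscall
    0xC3                          // ret
};

// Hypothetical helper: returns true and stores the imm32 operand of
// "mov eax, imm32" if the buffer matches the stub layout shown above.
bool ExtractSyscallNumber(const std::uint8_t* pStub, std::size_t ulSize,
                          std::uint32_t& ulNumber)
{
    // mov r10,rcx (3 bytes) + mov eax,imm32 (5 bytes) + syscall (2 bytes)
    constexpr std::size_t kMinStubLength = 10;
    if (pStub == nullptr || ulSize < kMinStubLength)
    {
        return false;
    }

    const bool bMovR10Rcx = pStub[0] == 0x4C && pStub[1] == 0x8B && pStub[2] == 0xD1;
    const bool bMovEaxImm = pStub[3] == 0xB8;
    const bool bSyscall   = pStub[8] == 0x0F && pStub[9] == 0x05;
    if (!bMovR10Rcx || !bMovEaxImm || !bSyscall)
    {
        return false;
    }

    // The immediate is stored little-endian in the four bytes after 0xB8.
    ulNumber = static_cast<std::uint32_t>(pStub[4])
             | (static_cast<std::uint32_t>(pStub[5]) << 8)
             | (static_cast<std::uint32_t>(pStub[6]) << 16)
             | (static_cast<std::uint32_t>(pStub[7]) << 24);
    return true;
}
```

Called on kNtTerminateProcessStub, this yields 0x2B, matching the listing. The same function on a different Windows version could yield a different value for the same stub, which is why hard-coding the number is fragile.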
As far as stealth goes, something like the code above is pretty easy to detect. Code executing off the heap, after allocating memory and changing the page permissions, sets off red flags. Other implementations that I’ve seen have been to:
- Allocate executable pages upfront with VirtualAlloc. This is basically the same thing as above.
- Locate empty blocks of memory in executable pages and map the code there. These empty blocks of memory usually occur due to alignment reasons in the code and can be exploited to store the malicious functionality. This approach is pretty convenient since the memory page(s) will already have the appropriate permissions, and when executed, won’t look as suspicious as when executing off the heap.
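The alignment padding mentioned in the last item is easy to locate in a dump of an executable section, which is also how an analyst can check whether it has been tampered with. The sketch below reports runs of filler bytes of at least a given length. It assumes the filler is 0xCC (int 3) or 0x00, which is common but not guaranteed for every compiler or linker. FindPaddingRuns and PaddingRun are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Offset and length of one run of identical filler bytes.
struct PaddingRun
{
    std::size_t ulOffset;
    std::size_t ulLength;
};

// Hypothetical helper: scans a section dump for runs of 0xCC or 0x00
// that are at least ulMinLength bytes long.
std::vector<PaddingRun> FindPaddingRuns(const std::uint8_t* pBytes,
                                        std::size_t ulSize,
                                        std::size_t ulMinLength)
{
    std::vector<PaddingRun> runs;
    if (pBytes == nullptr || ulMinLength == 0)
    {
        return runs;
    }

    std::size_t i = 0;
    while (i < ulSize)
    {
        const std::uint8_t bFiller = pBytes[i];
        if (bFiller != 0x00 && bFiller != 0xCC)
        {
            ++i;
            continue;
        }

        // Extend the run while the same filler byte repeats.
        const std::size_t ulStart = i;
        while (i < ulSize && pBytes[i] == bFiller)
        {
            ++i;
        }

        if (i - ulStart >= ulMinLength)
        {
            runs.push_back({ ulStart, i - ulStart });
        }
    }
    return runs;
}
```

Comparing the runs found in the on-disk image with the same ranges in the loaded module is one way to notice padding that no longer contains only filler bytes.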