Table of Contents:
The first step for developing an ESP hack is being able to draw information on the screen at a specified position. While this seems like it should be easy at first, it is actually a rather complicated process. This is because objects exist in different vector spaces than the one that corresponds to your screen.
There is the local space, which puts the object’s center at the origin; world space, which is a common space that all objects live in; view space, which centers the camera at the origin and looks forward; clip space, which map objects to a fixed-size plane so that clipping can be done; and lastly, screen space, which is the window that is the two dimensional (x, y) space that corresponds to the window.
Going from one vector space to another involves multiplying the coordinates in one space by a transformation matrix to get the coordinates in the other space. For example, if you have coordinates in world space and you want to map them to view space, you would multiply those coordinates by a transformation matrix called a view matrix. Because matrix multiplication is associative, you can create a matrix that performs multiple transformations in one multiplication operation. This is where the world-to-screen matrix comes in: this matrix is a combination of a view matrix and a projection matrix whose purpose is to take three dimensional world space coordinates and map them to a two dimensional clip space. You can then transform the coordinates in the clip space to normalized device coordinates and adjust for the screen’s aspect ratio to get (x, y) coordinates in screen space.
Games will store some, or all, of these transformation matrices somewhere. Typically you will be able to find a world-to-screen matrix directly instead of having to find the view and projection matrices since a world-to-screen transformation happens so often. The process of reverse engineering a game to find these matrices is rather tedious: you will spend a lot of time panning your camera view around while scanning the process memory for values that you expect. For example, if you are looking directly down followed by directly up, you might expect the matrix to have values in [0, 1] or [-1, 1] within it. There are other ways such as trying to derive the matrix from your camera’s position and angles, and then scanning for that as well. All of these approaches are rather involved and feature a lot of trial and error; there can be an entire series of posts dedicated to reverse engineering view matrices in games.
This tedium will be removed in this series because the Source SDK is open source and provides an interface that allows users to get the world to screen matrix. As before, we will get a pointer to this interface and be able to access the WorldToScreenMatrix function. This is done by scanning for “VEngineClient014” in the referenced strings of the running game. After attaching a debugger and searching, we can find it pretty quickly
Looking at how it’s used, we can easily obtain the function pointer to the global interface
As in the previous series, we can construct the function to retrieve the interface as such
IVEngineClient* GetClientEngine() {
constexpr auto globalGetClientEngineOffset{ 0xA3B30 };
static GetClientEngineFnc getClientEngineFnc{ GetFunctionPointer<GetClientEngineFnc>(
"engine.dll", globalGetClientEngineOffset) };
return getClientEngineFnc();
}
Now the fun can begin. We have the ability to get the world-to-screen matrix, but we don’t know how to perform the actual transformation. There is an example of the world-to-screen matrix being used to perform a transformation in the ScreenTransform function. This function passes in the world-to-screen matrix to the FrustumTransform function, which is the function that performs the actual transformation. The FrustumTransform function performs the matrix multiplication to transform between the vector spaces, and in this case, it will transform the input point in world space to a (x, y) position in clip space. To go from clip space to screen space, we need to see how ScreenTransform is called. Fortunately, there is a helpful function called GetVectorInScreenSpace that shows the viewport transformation to screen space.
For our purposes, we can lift these from the Source SDK and incorporate them into the ESP hack with minor modifications. FrustumTransformation will stay more or less as it was
bool FrustomTransform(const VMatrix& worldToSurface, const Vector3& point, Vector2& screen) {
screen.x = worldToSurface.m[0][0] * point.x + worldToSurface.m[0][1] * point.y + worldToSurface.m[0][2] * point.z + worldToSurface.m[0][3];
screen.y = worldToSurface.m[1][0] * point.x + worldToSurface.m[1][1] * point.y + worldToSurface.m[1][2] * point.z + worldToSurface.m[1][3];
auto w = worldToSurface.m[3][0] * point.x + worldToSurface.m[3][1] * point.y + worldToSurface.m[3][2] * point.z + worldToSurface.m[3][3];
bool facing{};
if (w < 0.001f)
{
facing = false;
screen.x *= 100000;
screen.y *= 100000;
}
else
{
facing = true;
float invw = 1.0f / w;
screen.x *= invw;
screen.y *= invw;
}
return facing;
}
and we can modify GetVectorInScreenSpace slightly to only return true for positions that are visible in screen space. The function has been renamed to WorldToScreen for more clarity as well
bool WorldToScreen(const Vector3& position, Vector2& screenPosition) {
auto worldToScreenMatrix{ GetClientEngine()->WorldToScreenMatrix() };
auto facing{ FrustomTransform(worldToScreenMatrix, position, screenPosition) };
int screenWidth{}, screenHeight{};
GetClientEngine()->GetScreenSize(screenWidth, screenHeight);
screenPosition.x = 0.5f * (1.0f + screenPosition.x) * screenWidth;
screenPosition.y = 0.5f * (1.0f - screenPosition.y) * screenHeight;
auto visible{ (screenPosition.x >= 0 && screenPosition.x <= screenWidth) &&
screenPosition.y >= 0 && screenPosition.y <= screenHeight };
if (!facing || !visible)
{
screenPosition.x = -640;
screenPosition.y = -640;
return false;
}
return true;
}
We can modify the aimbot code to print out the (x, y) screen coordinates of the closest enemy to the console to test this function out. After making the appropriate adjustments, we get the following results
which seem like legitimate numbers given the window size and resolution. The transformation appears to be successful! We are now able to translate an enemy’s world position to screen space. The first step in developing the ESP hack is done, now on to drawing.