BattlEye single stepping

With game hacking always feeling like it is a step ahead of current anti-cheat technologies, developers on both sides are constantly trying to innovate and best one another. Continuing from our last post, we will take a deeper look at another new technique that the defenders, namely BattlEye, have come up with to detect people whose motives are questionable. The feature in question is a creative use of single stepping. To understand what that is, let’s first take a look at the definition from Intel:

In single-step mode, the processor generates a debug exception after each instruction. This allows the execution state of a program to be inspected after each instruction.

This is a crucial feature of modern processor architectures, as it allows very extensive analysis when debugging, but why would a security suite such as BattlEye use it when it isn’t a debugger? As you can probably guess, single stepping makes overall execution very slow: every single instruction executed traps into a handler, resulting in a massive performance overhead.
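On x86, single-step mode is controlled by the trap flag (TF), bit 8 of the EFLAGS register. The sketch below shows the bit manipulation involved, operating on a plain integer rather than real processor state. The helper names are hypothetical and only serve the illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define TRAP_FLAG 0x100u /* TF is bit 8 of EFLAGS/RFLAGS */

/* Returns a copy of the flags with single stepping requested. */
static uint32_t with_trap_flag(uint32_t eflags)
{
    return eflags | TRAP_FLAG;
}

/* Returns a copy of the flags with single stepping turned off. */
static uint32_t without_trap_flag(uint32_t eflags)
{
    return eflags & ~TRAP_FLAG;
}

/* Reports whether the next instruction would raise a debug exception. */
static bool is_single_stepping(uint32_t eflags)
{
    return (eflags & TRAP_FLAG) != 0;
}
```

A debugger, or an exception handler acting like one, applies the same OR to the saved flags of a thread before resuming it.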

One of the shellcodes that BattlEye streams and dynamically executes on millions of players’ machines each day is the one we refer to as shellcode8kb, due to its small size compared to shellcodemain. One feature of this shellcode is the registration of a vectored exception handler, which acts as a bare-bones exception handler. The benefit of using such a feature is that it does not leave any traces in the program’s code. Anti-cheats benefit heavily from such stealth, as they rely on security through obscurity, but routing execution through an exception handler comes with a massive performance trade-off.

Once this exception handler is registered in the game process, BattlEye repeatedly places breakpoints on certain functions, which then notify the aforementioned handler every time they are called:

GetAsyncKeyState
GetCursorPos
IsBadReadPtr
NtUserGetAsyncKeyState
GetForegroundWindow
CallWindowProcW
NtUserPeekMessage
NtSetEvent
sqrtf
__stdio_common_vsprintf_s
CDXGIFactory::TakeLock
TppTimerpExecuteCallback

These functions are therefore completely monitored, and BattlEye is alerted to every call. Two of these functions are not like the rest; any call to NtUserPeekMessage or NtSetEvent also modifies the processor flags register:

// TRAP_FLAG is bit 8 of EFLAGS (0x100)
if (hooked_function == NtUserPeekMessage || hooked_function == NtSetEvent)
    exception->ContextRecord->EFlags |= TRAP_FLAG;

This hidden gem in BattlEye’s exception handler turns any thread that calls either of these two functions into a slave that calls back to BattlEye through single-step exceptions. This allows BattlEye to essentially debug the game threads when these functions are used, and debugging is a very powerful tool for anti-cheats. This raises the question: what would they be looking for with this kind of information? Before we dive into the actual detection, let’s take a step back and look at one of the most groundbreaking releases in the game-hacking scene in recent years: The Perfect Injector by Can Bölük, a fellow secret club member. This is a widespread hacking tool that abuses a flaw in Windows address sanitization to hide cheat modules from anti-cheats:

0x7FFFFFFEFFFF marks the end of user-mode memory in Windows, but these constants are hard-coded by the operating system and are not what the processor actually uses to decide whether a page is accessible from usermode or not
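The quote hinges on the difference between a software range check and the hardware permission check. As a simplified sketch, assuming 4-level paging with 4 KiB pages and looking only at the Present and User/Supervisor bits of each paging-structure entry, the two views can be modelled as follows. The function names, and the idea of passing the four entries of a page walk as an array, are illustrative assumptions.

```c
#include <stdbool.h>
#include <stdint.h>

#define WINDOWS_USER_LIMIT 0x00007FFFFFFEFFFFull /* constant from the quote above */
#define ENTRY_PRESENT (1ull << 0)                /* P bit */
#define ENTRY_USER    (1ull << 2)                /* U/S bit: 1 = user-mode access allowed */

/* The software view: a plain range check against a hard-coded constant. */
static bool windows_considers_user(uint64_t address)
{
    return address <= WINDOWS_USER_LIMIT;
}

/* The hardware view: every level of the page walk (PML4E, PDPTE, PDE, PTE)
   must be present and marked user-accessible. The address itself is not
   compared against any limit. */
static bool cpu_allows_user_access(const uint64_t entries[4])
{
    const uint64_t required = ENTRY_PRESENT | ENTRY_USER;

    for (int level = 0; level < 4; ++level) {
        if ((entries[level] & required) != required)
            return false;
    }
    return true;
}
```

With entries that all carry the user bit, the second check passes even for an address that the first check rejects, which is the gap the injector relies on.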

By doing some advanced manipulation of the kernel’s page table structures, you are able to use a part of the virtual address space that would not otherwise be available from usermode. This results in undefined behaviour as far as Windows is concerned: system calls simply fail when used on that memory range, which means that developers can’t just query the memory using functions such as NtQueryVirtualMemory. As described by Can, the only thing that can access the memory is the processor, which rules out a lot of tools commonly used by developers. The great thing about processors is that they are very consistent: the fact that Windows declares this address range unusable does not mean that the processor treats it any differently. This is where the trap flag comes into play. By setting the trap flag on threads that call these prominent system calls, BattlEye is able to monitor modules outside of the usable memory range, simply by checking the instruction pointer every time a single step occurs:

if (exception->ExceptionRecord->ExceptionCode == STATUS_SINGLE_STEP)
{
    if (exception->ContextRecord->Rip >= 0x8000000000000000) // upper half of the address space, where kernel-mode memory begins
    {
        report_trap.report_id = 0x31;
        report_trap.hook_index = hook_index;
        report_trap.caller = exception->ContextRecord->Rip;
        // dump the 32 bytes found at the faulting instruction pointer
        report_trap.function_dump[0] = *(__int64*)report_trap.caller;
        report_trap.function_dump[1] = *(__int64*)(report_trap.caller + 8);
        report_trap.function_dump[2] = *(__int64*)(report_trap.caller + 16);
        report_trap.function_dump[3] = *(__int64*)(report_trap.caller + 24);
        battleye::report(&report_trap, sizeof(report_trap), false);
    }
}
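For readers who want to experiment with the logic above outside of a game process, here is a portable sketch of the same check and 32-byte dump. The struct layout, field widths and the function name `build_trap_report` are assumptions made for illustration; only the threshold, the report id 0x31 and the four 8-byte dump slots come from the snippet. The code bytes are passed in separately so the function can be exercised without executing anything at the reported address.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define UPPER_HALF_START 0x8000000000000000ull /* threshold used in the snippet above */

/* Hypothetical layout; the real report structure is not documented. */
struct trap_report {
    uint8_t  report_id;
    uint8_t  hook_index;
    uint64_t caller;
    uint64_t function_dump[4]; /* 4 x 8 = 32 bytes of code */
};

/* Fills `out` and returns true when `rip` lies in the upper half of the
   address space; leaves `out` untouched and returns false otherwise. */
static bool build_trap_report(uint64_t rip, const void *code_at_rip,
                              uint8_t hook_index, struct trap_report *out)
{
    if (rip < UPPER_HALF_START)
        return false;

    out->report_id = 0x31;
    out->hook_index = hook_index;
    out->caller = rip;
    /* memcpy sidesteps the alignment and aliasing assumptions of a pointer cast. */
    memcpy(out->function_dump, code_at_rip, sizeof out->function_dump);
    return true;
}
```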

If a single step happens in the upper half of the address space (which is reserved for kernel mode), it means that a hacker is deliberately hiding memory from the anti-cheat, which definitely deserves a closer look. When this is triggered, the 32 bytes at the instruction pointer are copied and sent to BattlEye’s servers for processing. This method is very innovative and will catch users of The Perfect Injector, something that was previously thought to be impossible from usermode. The performance hit is minimal, because the developers have refrained from setting the trap flag again on each single step. This means that the exception handler only deals with a single instruction after the function in question has been executed. This detection will catch cheaters hooking these commonly used functions from their hidden module.
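The performance argument can be made concrete with a toy model. The sketch below assumes, as the paragraph above implies, that the trap flag does not persist past a single-step exception unless the handler sets it again. The function and its parameters are hypothetical, and the loop only stands in for a thread executing instructions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Counts the single-step exceptions raised while a thread executes
   `instructions` instructions after a breakpoint handler set the flag once. */
static size_t count_single_steps(size_t instructions, bool handler_rearms)
{
    bool trap_flag = true; /* set once by the breakpoint handler */
    size_t exceptions = 0;

    for (size_t i = 0; i < instructions; ++i) {
        /* instruction i executes here */
        if (trap_flag) {
            trap_flag = false;    /* the flag is consumed when the exception is delivered */
            ++exceptions;
            if (handler_rearms)
                trap_flag = true; /* a debugger-style handler would set it again */
        }
    }
    return exceptions;
}
```

Under these assumptions a handler that never sets the flag again sees one exception per monitored call, while a handler that does would see one per instruction.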