Modern EDR Countermeasures: Fundamentals and Practical Guide to User-Mode Function Hooking

In the field of Windows security offense and defense, Function Hooking is a core technology for EDR (Endpoint Detection and Response) to monitor process behavior and for attackers to bypass protections. To counter modern EDR interception, the first step is to master the operating mechanism of function hooking in user mode. Centered on the framework of “FUNCTION-HOOKING DLLS”, this article starts with basic concepts and memory layout, then delves into implementation processes, detection methods, and evasion techniques. Even readers new to this topic can follow along to set up experimental environments and verify code examples.

1. What is Function Hooking, and Why Do We Need It?

When a Windows application requests a system service, it does not interact directly with the kernel. Instead, the call passes through several intermediate layers: User Code → Win32 API → ntdll.dll → Kernel. By placing a hook (a redirecting jump or a replaced pointer) at any of the user-mode layers, EDR tools or debuggers can monitor or even alter process behavior without modifying the application’s source code. That is the core value of function hooking.

Based on the location of modifications, common hooking techniques are divided into three categories, each with distinct application scenarios and characteristics:

Inline Hook: Modifies the first few bytes of instructions at the start of the target function. It has strong invasiveness but wide applicability, though it requires saving original instructions to prevent program crashes.
IAT Hook (Import Address Table Hook): Tampers with function pointers in a module’s import address table. It is completely transparent to the caller and suitable for monitoring API calls of a single module.
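EAT Hook (Export Address Table Hook): Alters the export table of a DLL. It affects every module that resolves the export after the change (through the loader or GetProcAddress), offering the broadest coverage but carrying higher risks. Imports that were already bound before the hook was placed are not affected.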

This article focuses on the most widely used Inline Hook. Understanding it only requires grasping three basic concepts:

Code segments are essentially modifiable memory: By adjusting memory page protection attributes via the VirtualProtect function, you can modify the machine code of any executable module.
x86/x64 instruction lengths are variable: Instruction lengths range from 1 to 15 bytes. When hooking, you must overwrite complete instructions to avoid illegal operations caused by truncated instructions.
A “trampoline” is needed to preserve the original behavior: Save the original instructions of the target function to a custom buffer (the trampoline), followed by a jump back to the rest of the function. After the custom logic runs, calling the trampoline executes the saved instructions and then continues in the original function.
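To make the point about variable instruction lengths concrete, the sketch below computes how many bytes have to be moved into the trampoline for a given patch size. It assumes the instruction lengths at the start of the function are already known (in practice they would come from a length disassembler); the function name BytesToRelocate and the sample lengths are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Given the lengths of the consecutive instructions at the start of a function,
// return the smallest byte count that covers at least `patchSize` bytes while
// ending on an instruction boundary. Returns 0 if the listed instructions are
// too short to hold the patch.
std::size_t BytesToRelocate(const std::vector<std::size_t>& instrLengths,
                            std::size_t patchSize) {
    std::size_t total = 0;
    for (std::size_t len : instrLengths) {
        if (total >= patchSize) break;  // already covered by whole instructions
        total += len;
    }
    return total >= patchSize ? total : 0;
}
```

For a prologue whose instructions are 4, 1, 3, and 7 bytes long, a 5-byte JMP rel32 patch requires moving the first two instructions (5 bytes), while a 12-byte absolute jump requires moving all four (15 bytes).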

2. Inline Hook Implementation Process: Taking CreateFileW as an Example

Using CreateFileW (the API that creates or opens files, which makes it a natural point for monitoring file access) as an example, Inline Hook implementation consists of 4 key steps, each requiring careful handling of memory protection and instruction boundaries:

Step 1: Modify Memory Page Protection Attributes

Call VirtualProtect to change the attribute of the memory page where CreateFileW resides to PAGE_EXECUTE_READWRITE, ensuring that jump instructions can be written later.

Step 2: Save Original Instructions and Build a Trampoline

Copy the first few bytes of CreateFileW and save them to the InlineHook::original structure. The example below copies a fixed 16 bytes for simplicity; a robust implementation copies whole instructions only, which requires decoding instruction lengths. Then append a jmp instruction to InlineHook::trampoline that jumps back to the unmodified part of the original function.

Step 3: Write Jump Instructions to the Target Function

Write an unconditional jump instruction at the start of CreateFileW (commonly MOV RAX + JMP RAX for x64, or JMP rel32 for x86), redirecting function calls directly to custom processing logic.

Step 4: Complete the Hook Logic Loop

Subsequent calls to CreateFileW will first execute the custom processing function (e.g., recording file paths, identifying potentially dangerous operations), then jump back to the original function via the trampoline—ensuring the original functionality remains intact.

The following is the C++ implementation code for the core logic of Inline Hook, which also serves as the underlying foundation for hook frameworks like Detours:

#include <windows.h>
#include <cstring>

// Structure to store original instructions and trampoline (x64 only)
struct InlineHook {
    BYTE original[16];   // Saves the original starting instructions of the target function
    BYTE trampoline[32]; // Trampoline: original instructions + jump back to target + 16
};

// Install Inline Hook: target = address of the target function, handler = custom processing function,
// hook = structure for storing original instructions.
// Simplification: assumes the first 16 bytes of target end on an instruction boundary, contain no
// RIP-relative instructions, and do not leave a live value in RAX; real code needs a length disassembler.
bool InstallInlineHook(void* target, void* handler, InlineHook& hook) {
    DWORD oldProtect, tmp;
    // Step 1: Modify memory page protection to allow writing
    if (!VirtualProtect(target, sizeof(hook.original), PAGE_EXECUTE_READWRITE, &oldProtect))
        return false;

    // Step 2: Copy original instructions to 'original' and build the trampoline
    memcpy(hook.original, target, sizeof(hook.original));
    BYTE* t = hook.trampoline;
    memcpy(t, hook.original, sizeof(hook.original));  // Copy original instructions to the trampoline
    t += sizeof(hook.original);

    // Add to trampoline: MOV RAX, <target + 16>; JMP RAX (resume the rest of the original function)
    void* resume = (BYTE*)target + sizeof(hook.original);
    *t++ = 0x48;
    *t++ = 0xB8;
    memcpy(t, &resume, sizeof(resume));
    t += sizeof(resume);
    *t++ = 0xFF;
    *t++ = 0xE0;

    // The trampoline lives in ordinary data memory, so it must be made executable (DEP)
    if (!VirtualProtect(hook.trampoline, sizeof(hook.trampoline), PAGE_EXECUTE_READWRITE, &tmp)) {
        VirtualProtect(target, sizeof(hook.original), oldProtect, &tmp);
        return false;
    }

    // Step 3: Write MOV RAX, handler; JMP RAX (12 bytes) at the start of the target function.
    // An absolute jump is used because JMP rel32 cannot reach a handler more than 2 GB away.
    BYTE patch[12] = {0x48, 0xB8, 0, 0, 0, 0, 0, 0, 0, 0, 0xFF, 0xE0};
    memcpy(patch + 2, &handler, sizeof(handler));
    memcpy(target, patch, sizeof(patch));

    // Step 4: Restore the original protection attribute and flush the instruction cache
    VirtualProtect(target, sizeof(hook.original), oldProtect, &tmp);
    FlushInstructionCache(GetCurrentProcess(), target, sizeof(patch));
    return true;
}
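A JMP rel32 patch only reaches destinations within roughly ±2 GB of the patched address, which matters on x64 where the handler may be loaded far from the target. The sketch below isolates that offset arithmetic so it can be checked on its own; the function name EncodeJmpRel32 and the sample addresses are hypothetical, and the code only fills a byte buffer without touching any process memory.

```cpp
#include <cstdint>
#include <cstring>

// Encode "JMP rel32" (opcode 0xE9) located at address `from` and landing on `to`.
// The displacement is measured from the end of the 5-byte instruction.
// Returns false when the distance does not fit in a signed 32-bit value, in
// which case an absolute jump (e.g. MOV RAX, imm64; JMP RAX) is needed instead.
bool EncodeJmpRel32(std::uint64_t from, std::uint64_t to, std::uint8_t out[5]) {
    // Unsigned subtraction wraps safely; the cast reinterprets it as a signed distance.
    const std::int64_t delta = static_cast<std::int64_t>(to - from - 5);
    if (delta < INT32_MIN || delta > INT32_MAX) return false;

    const std::int32_t rel = static_cast<std::int32_t>(delta);
    out[0] = 0xE9;
    std::memcpy(out + 1, &rel, sizeof(rel));  // stored little-endian on x86/x64
    return true;
}
```

A jump from 0x1000 to 0x2000 encodes as E9 FB 0F 00 00, because the displacement is 0x2000 - 0x1000 - 5 = 0xFFB.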

3. A More Efficient Option: Implementing Hooks with Microsoft Detours

Hand-written inline hooks tend to run into problems with instruction length calculation, relocation of relative instructions, error recovery, and thread safety. Microsoft’s open-source Detours library wraps these details in a transactional API, which significantly reduces development complexity, and it supports x86, x64, and ARM targets.

The process of implementing hooks with Detours is fixed in 4 steps. Below is an example of intercepting CreateFileW to log file access:

Step 1: Start a Detours Transaction

Call DetourTransactionBegin to open a new transaction. Attach and detach operations are queued in the transaction and applied together when it is committed, so the target is never left half-patched.

Step 2: Specify Affected Threads

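Use DetourUpdateThread to enlist a thread in the transaction so that Detours can adjust its instruction pointer if the thread is executing inside code that is about to be rewritten. The Detours samples conventionally pass GetCurrentThread() here; in a multithreaded process, the other threads should be enlisted as well.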

Step 3: Attach or Detach the Hook

Use DetourAttach to install the hook (bind the target function to the custom function) and DetourDetach to uninstall the hook.

Step 4: Commit the Transaction

Call DetourTransactionCommit to write all patches at once. After success, resume thread execution, and the hook takes effect.

The following is the complete Detours-based C++ code for intercepting CreateFileW. It can be compiled once the Detours headers are on the include path and the program is linked against detours.lib:

#include <windows.h>
#include <detours.h>
#include <iostream>

// 1. Define a pointer to the original function, pointing to the system's CreateFileW
static HANDLE(WINAPI* RealCreateFileW)(
    LPCWSTR lpFileName,
    DWORD dwDesiredAccess,
    DWORD dwShareMode,
    LPSECURITY_ATTRIBUTES lpSecurityAttributes,
    DWORD dwCreationDisposition,
    DWORD dwFlagsAndAttributes,
    HANDLE hTemplateFile
) = CreateFileW;

// 2. Implement the custom hook function: Log first, then call the original function
HANDLE WINAPI HookedCreateFileW(
    LPCWSTR lpFileName,
    DWORD dwDesiredAccess,
    DWORD dwShareMode,
    LPSECURITY_ATTRIBUTES lpSecurityAttributes,
    DWORD dwCreationDisposition,
    DWORD dwFlagsAndAttributes,
    HANDLE hTemplateFile
) {
    // Custom logic: Print the path of the accessed file
    std::wcout << L"[Hook Monitor] CreateFileW accessing file:" << lpFileName << std::endl;
    // Call the original function to ensure normal functionality
    return RealCreateFileW(
        lpFileName, dwDesiredAccess, dwShareMode, 
        lpSecurityAttributes, dwCreationDisposition, 
        dwFlagsAndAttributes, hTemplateFile
    );
}

// 3. Wrapper function for installing the hook; returns true on success
bool InstallCreateFileHook() {
    DetourTransactionBegin();                // Start the transaction
    DetourUpdateThread(GetCurrentThread());  // Enlist the current thread
    DetourAttach(&(PVOID&)RealCreateFileW, HookedCreateFileW);  // Bind the original function to the hook function
    return DetourTransactionCommit() == NO_ERROR;  // Commit the transaction to activate the hook
}

// Test: Install the hook and call CreateFileW to verify log output
int main() {
    if (!InstallCreateFileHook()) {
        std::wcout << L"[Test] Failed to install the hook" << std::endl;
        return 1;
    }
    // Call CreateFileW, which will trigger HookedCreateFileW first
    HANDLE hFile = CreateFileW(
        L"C:\\temp\\demo.txt",  // Test file path
        GENERIC_READ,
        FILE_SHARE_READ,
        nullptr,
        OPEN_EXISTING,
        0,
        nullptr
    );
    if (hFile != INVALID_HANDLE_VALUE) {
        CloseHandle(hFile);
        std::wcout << L"[Test] File opened and closed successfully" << std::endl;
    }
    return 0;
}

4. A Critical Step for EDR: How to Inject Hook DLLs into Target Processes?

Hook code is usually encapsulated in a DLL. For the DLL to take effect, it must first be injected into the address space of the target process. EDRs commonly use two types of injection methods—user-mode and kernel-mode—each suited to different scenarios.

1. Traditional Injection Method: AppInit_DLLs (Gradually Obsolete)

Before Windows 8, many security tools used the AppInit_DLLs registry value (under the key HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Windows). Whenever a process loaded user32.dll, the system would automatically load the DLLs listed in that value into the process.

This method is simple to implement and offers wide coverage but is easily abused by malware. It also slows down system startup. Since Windows 8, systems with Secure Boot enabled have completely disabled this mechanism, and it is now only used for compatibility scenarios.

2. Modern EDR Favorite: User-Mode Remote Thread Injection

Create a remote thread via CreateRemoteThread to make the target process load a specified DLL. This method is suitable for injecting into existing processes and follows a clear workflow:

Obtain the target process handle: Call OpenProcess to get a PROCESS_ALL_ACCESS handle for the target process.
Write the DLL path to the target process: Allocate memory in the target process using VirtualAllocEx, then write the full DLL path via WriteProcessMemory.
Create a remote thread to trigger loading: Make the remote thread execute LoadLibraryW, with the parameter being the DLL path written in Step 2. The DLL will then be loaded into the target process.
Wait for loading completion: Call WaitForSingleObject to wait for the remote thread to finish, ensuring the DLL is loaded successfully, and finally release resources.

The following is the C++ implementation code for remote thread injection, suitable for testing in experimental environments:

#include <windows.h>
#include <string>

// Inject a DLL into the process with the specified PID: pid = target process ID, path = full DLL path
bool InjectDll(DWORD dwProcessId, const std::wstring& strDllPath) {
    // Step 1: Obtain the target process handle
    HANDLE hProcess = OpenProcess(
        PROCESS_ALL_ACCESS,  // All permissions required to allocate memory and create threads
        FALSE,
        dwProcessId
    );
    if (!hProcess) return false;

    // Step 2: Calculate the DLL path length and allocate memory in the target process
    SIZE_T dwPathSize = (strDllPath.size() + 1) * sizeof(wchar_t);  // Include terminator
    LPVOID lpRemoteMem = VirtualAllocEx(
        hProcess,
        nullptr,
        dwPathSize,
        MEM_COMMIT | MEM_RESERVE,  // Commit and reserve memory
        PAGE_READWRITE  // Allow read/write (to write the DLL path)
    );
    if (!lpRemoteMem) {
        CloseHandle(hProcess);
        return false;
    }

    // Step 3: Write the DLL path to the target process memory
    if (!WriteProcessMemory(
        hProcess,
        lpRemoteMem,
        strDllPath.c_str(),
        dwPathSize,
        nullptr
    )) {
        VirtualFreeEx(hProcess, lpRemoteMem, 0, MEM_RELEASE);
        CloseHandle(hProcess);
        return false;
    }

    // Step 4: Create a remote thread to execute LoadLibraryW and load the DLL
    HANDLE hRemoteThread = CreateRemoteThread(
        hProcess,
        nullptr,
        0,
        (LPTHREAD_START_ROUTINE)LoadLibraryW,  // Thread entry: LoadLibraryW
        lpRemoteMem,  // Thread parameter: DLL path
        0,
        nullptr
    );
    bool bSuccess = (hRemoteThread != nullptr);

    // Step 5: Wait for the thread to finish and release resources
    if (hRemoteThread) {
        WaitForSingleObject(hRemoteThread, INFINITE);  // Wait for loading completion
        CloseHandle(hRemoteThread);
    }
    VirtualFreeEx(hProcess, lpRemoteMem, 0, MEM_RELEASE);  // Release target process memory
    CloseHandle(hProcess);
    return bSuccess;
}

3. A More Stealthy Method: Kernel-Mode KAPC Injection

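User-mode injection may be blocked by the target process’s mitigation policies (e.g., PROCESS_MITIGATION_BINARY_SIGNATURE_POLICY, which can restrict DLL loads to Microsoft-signed images, or PROCESS_MITIGATION_DYNAMIC_CODE_POLICY, which blocks the creation of new executable memory). Therefore, modern EDRs increasingly use kernel-mode KAPC injection.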

Its core logic is: The EDR driver subscribes to “process creation notifications”. When a new process starts, the driver allocates space in the target process memory and queues a Kernel Asynchronous Procedure Call (KAPC). When the target process’s thread resumes execution next time, Windows prioritizes executing the KAPC routine, which then calls LdrLoadDll or LoadLibraryW to load the hook DLL.

Since the logic runs in kernel mode, KAPC injection can bypass user-mode protections and insert monitoring at the earliest stage of process initialization, offering higher stealth and success rates.

5. Defender’s Perspective: How to Detect Function Hooking?

Whether building an EDR or securing an application, detecting function hooking is a core requirement. Common detection strategies are based on “memory comparison” and “integrity verification”. Below are 4 practical methods:

1. Byte Comparison Detection (Most Basic)

Read the starting bytes of the target function in memory and compare them with the corresponding bytes of the DLL on disk. If the in-memory copy starts with a jump sequence that the disk copy lacks (e.g., opcode E9 for JMP rel32, or the bytes FF E0 for JMP RAX), the function may be hooked.
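As a rough illustration of the pattern check, the sketch below classifies the first bytes of a function against three common x64 jump encodings. The enum and function names are hypothetical, the pattern list is far from exhaustive, and a match is only a hint that should be confirmed against the on-disk bytes.

```cpp
#include <cstddef>
#include <cstdint>

enum class PrologueKind { Unknown, JmpRel32, JmpRipIndirect, MovRaxJmpRax };

// Classify the first bytes of a function against a few common hook patterns.
// Legitimate functions can also begin with a jump (import thunks, for example),
// so the result is a heuristic and not proof of tampering.
PrologueKind ClassifyPrologue(const std::uint8_t* code, std::size_t size) {
    if (size >= 5 && code[0] == 0xE9)
        return PrologueKind::JmpRel32;        // E9 rel32
    if (size >= 6 && code[0] == 0xFF && code[1] == 0x25)
        return PrologueKind::JmpRipIndirect;  // FF 25 disp32: JMP [RIP+disp32]
    if (size >= 12 && code[0] == 0x48 && code[1] == 0xB8 &&
        code[10] == 0xFF && code[11] == 0xE0)
        return PrologueKind::MovRaxJmpRax;    // 48 B8 imm64; FF E0
    return PrologueKind::Unknown;
}
```

A typical unhooked prologue such as 48 89 5C 24 08 (mov [rsp+8], rbx) falls through to Unknown.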

2. Function Integrity Verification

Calculate the MD5 or SHA256 hash of the memory page containing the target function and compare it with a pre-stored “clean” hash value. If the hashes do not match, the memory has been tampered with, indicating a potential hook risk.
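The comparison logic can be sketched without any platform dependencies. To stay self-contained, the example below swaps in the non-cryptographic 64-bit FNV-1a hash in place of MD5 or SHA256; that substitution is an assumption made for brevity, and a real integrity check should use a cryptographic hash. The function names are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>

// 64-bit FNV-1a over a byte range. NOT cryptographic: it stands in for
// MD5/SHA-256 only to keep this sketch dependency-free.
std::uint64_t Fnv1a64(const std::uint8_t* data, std::size_t size) {
    std::uint64_t h = 0xcbf29ce484222325ULL;  // FNV offset basis
    for (std::size_t i = 0; i < size; ++i) {
        h ^= data[i];
        h *= 0x100000001b3ULL;                // FNV prime
    }
    return h;
}

// True when the region's current hash differs from the recorded baseline.
bool RegionModified(const std::uint8_t* region, std::size_t size,
                    std::uint64_t baseline) {
    return Fnv1a64(region, size) != baseline;
}
```

The baseline would be recorded from a known-clean copy of the code, and RegionModified would then be called periodically on the live bytes.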

3. Import Address Table (IAT) Verification

Traverse the process’s import address table and check if each function pointer points to the expected DLL module (e.g., kernel32.dll, ntdll.dll). If a pointer points to an unknown module or custom memory area, it is likely an IAT hook.
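The core of this check is a range lookup: does the pointer stored in an IAT slot fall inside the module it is supposed to come from? A minimal sketch follows, assuming the list of loaded modules (name, base address, image size) has already been collected; the struct, the function names, and the addresses used to exercise it are hypothetical.

```cpp
#include <cstdint>
#include <string>
#include <vector>

struct ModuleRange {
    std::string name;    // e.g. "kernel32.dll"
    std::uint64_t base;  // load address
    std::uint64_t size;  // image size in bytes
};

// Return the name of the module whose address range contains `addr`,
// or an empty string if the address belongs to no known module.
std::string OwningModule(std::uint64_t addr,
                         const std::vector<ModuleRange>& modules) {
    for (const ModuleRange& m : modules) {
        if (addr >= m.base && addr - m.base < m.size) return m.name;
    }
    return std::string();
}

// An IAT slot is flagged when its pointer does not land in the expected module.
// Forwarded exports legitimately resolve into a different DLL, so a real
// checker needs an allow-list for those cases.
bool IatEntrySuspicious(std::uint64_t slotValue,
                        const std::vector<ModuleRange>& modules,
                        const std::string& expectedModule) {
    return OwningModule(slotValue, modules) != expectedModule;
}
```

A pointer that lands in private heap memory or in an unrelated DLL is the typical signature of an IAT hook.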

4. Memory Image Comparison

Load a read-only copy of the target DLL from disk (without executing code) and perform a byte-level comparison with the DLL already loaded in the process. Identify all memory differences to determine if hooking exists.

The following is the C++ implementation code for “byte comparison detection”, which can quickly determine if CreateFileW has been tampered with:

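#include <windows.h>
#include <cstdio>
#include <cstring>

// Detect if a function is hooked: lpModulePath = full DLL path, lpExportName = function name,
// pInMemory = address of the function in memory. Returns false on any failure.
bool IsFunctionPatched(const wchar_t* lpModulePath, const char* lpExportName, void* pInMemory) {
    // Step 1: Map a fresh copy of the DLL from disk as an image (no code is executed).
    // LoadLibraryExW is avoided: for an already-loaded DLL it returns the existing,
    // possibly hooked module, and GetProcAddress does not work on resource-only handles.
    HANDLE hFile = CreateFileW(lpModulePath, GENERIC_READ, FILE_SHARE_READ, nullptr,
                               OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, nullptr);
    if (hFile == INVALID_HANDLE_VALUE) return false;
    HANDLE hMap = CreateFileMappingW(hFile, nullptr, PAGE_READONLY | SEC_IMAGE, 0, 0, nullptr);
    CloseHandle(hFile);
    if (!hMap) return false;
    BYTE* pBase = (BYTE*)MapViewOfFile(hMap, FILE_MAP_READ, 0, 0, 0);
    CloseHandle(hMap);
    if (!pBase) return false;

    // Step 2: Obtain the address of the target function in the disk-based image
    auto pDos = (IMAGE_DOS_HEADER*)pBase;
    auto pNt = (IMAGE_NT_HEADERS*)(pBase + pDos->e_lfanew);
    auto pExp = (IMAGE_EXPORT_DIRECTORY*)(pBase +
        pNt->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_EXPORT].VirtualAddress);
    DWORD* pNames = (DWORD*)(pBase + pExp->AddressOfNames);
    WORD* pOrdinals = (WORD*)(pBase + pExp->AddressOfNameOrdinals);
    DWORD* pFuncs = (DWORD*)(pBase + pExp->AddressOfFunctions);
    BYTE* pDiskProc = nullptr;
    for (DWORD i = 0; i < pExp->NumberOfNames; ++i) {
        if (strcmp((const char*)(pBase + pNames[i]), lpExportName) == 0) {
            pDiskProc = pBase + pFuncs[pOrdinals[i]];  // Forwarded exports are not handled
            break;
        }
    }

    // Step 3: Compare the first 16 bytes in memory and on disk; a mismatch indicates tampering
    bool bPatched = pDiskProc && memcmp(pDiskProc, pInMemory, 16) != 0;
    UnmapViewOfFile(pBase);
    return bPatched;
}

// Test: Detect if CreateFileW in kernel32.dll is hooked
void TestHookDetection() {
    wchar_t szPath[MAX_PATH] = {0};
    HMODULE hMod = GetModuleHandleW(L"kernel32.dll");
    if (!hMod || !GetModuleFileNameW(hMod, szPath, MAX_PATH)) return;
    void* pMemCreateFile = (void*)GetProcAddress(hMod, "CreateFileW");
    if (pMemCreateFile && IsFunctionPatched(szPath, "CreateFileW", pMemCreateFile)) {
        printf("Warning: CreateFileW may be hooked!\n");
    } else {
        printf("Normal: CreateFileW is not tampered with\n");
    }
}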

6. Attacker’s Perspective: How to Bypass EDR Function Hooking (Evading Hook)

When EDRs monitor critical APIs via hooking, attackers need to bypass these interception points. The three techniques below have both learning value and practical significance, but note: **All operations must be performed in authorized experimental environments; unauthorized use is prohibited**.

1. Direct System Calls (Direct Syscalls): Bypassing Win32 API Hooks

Win32 APIs like `WriteFile` and `CreateFileW` called by Windows applications ultimately pass through functions such as `NtWriteFile` and `NtCreateFile` in `ntdll.dll`, which trigger kernel services via the `syscall` instruction. By directly calling these “kernel interface functions” in `ntdll.dll`, you can bypass EDR hooks on Win32 APIs.

Two key points to implement Direct Syscalls:

Resolve System Service Numbers (SSN): On x64 systems, functions in `ntdll.dll` first write the system call number to the `EAX` register before executing `syscall`. SSNs may vary across Windows versions and need to be dynamically resolved from `ntdll.dll`.
Comply with Calling Conventions: Function parameters and the calling convention (the Microsoft x64 convention, with the first four arguments in RCX, RDX, R8, and R9) must match the official implementation exactly; otherwise the call fails or the process crashes.

Taking `NtWriteFile` as an example, its assembly logic (viewed via debugger for `ntdll!NtWriteFile`) is as follows:

mov r10, rcx    ; Backup RCX (x64 calling convention: RCX (first parameter) must be stored in R10 before syscall)
mov eax, 0x08   ; Store the system call number for NtWriteFile in EAX (0x08 on recent Windows 10/11 x64 builds; varies by version)
syscall         ; Trigger kernel-mode service and switch to kernel execution
ret             ; Return to user mode; RAX stores the NTSTATUS result
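An inline hook placed on such a stub overwrites its leading bytes, so checking for the expected encoding is a quick integrity test. The sketch below looks only for the first two instructions of the x64 stub shown above (4C 8B D1 for mov r10, rcx, then B8 followed by a 32-bit immediate); the function name is hypothetical, and stubs on other architectures or unusual builds may differ.

```cpp
#include <cstddef>
#include <cstdint>

// True when the bytes begin with the unmodified x64 stub prologue:
//   4C 8B D1     mov r10, rcx
//   B8 imm32     mov eax, <service number>
bool StartsWithSyscallStub(const std::uint8_t* code, std::size_t size) {
    return size >= 8 &&
           code[0] == 0x4C && code[1] == 0x8B && code[2] == 0xD1 &&
           code[3] == 0xB8;
}
```

A stub that instead begins with E9 (a relative jump) fails the check, which is the usual sign that a monitoring hook has been installed on it.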

The following is the C++ implementation of directly calling NtWriteFile, which can bypass hooks on WriteFile:

#include <windows.h>
#include <winternl.h>  // NTSTATUS, IO_STATUS_BLOCK, PIO_APC_ROUTINE

// Declare the NtWriteFile function prototype (exported by ntdll.dll; extern "C" avoids name mangling)
extern "C" NTSTATUS NtWriteFile(
    HANDLE hFile,
    HANDLE hEvent,
    PIO_APC_ROUTINE pApcRoutine,
    PVOID pApcContext,
    PIO_STATUS_BLOCK pIoStatusBlock,
    PVOID pBuffer,
    ULONG nLength,
    PLARGE_INTEGER pByteOffset,
    PULONG pKey
);

// Define the function pointer type for dynamic acquisition of NtWriteFile address later
typedef NTSTATUS(NTAPI* NtWriteFile_t)(
    HANDLE, HANDLE, PIO_APC_ROUTINE, PVOID, 
    PIO_STATUS_BLOCK, PVOID, ULONG, 
    PLARGE_INTEGER, PULONG
);

// Obtain the address of NtWriteFile from ntdll.dll
NtWriteFile_t ResolveNtWriteFile() {
    HMODULE hNtdll = GetModuleHandleW(L"ntdll.dll");
    if (!hNtdll) return nullptr;
    return reinterpret_cast<NtWriteFile_t>(GetProcAddress(hNtdll, "NtWriteFile"));
}

// Directly call NtWriteFile to write to a file, bypassing hooks on WriteFile
bool DirectSyscallWrite(HANDLE hFile, const void* pBuffer, ULONG nSize) {
    auto pNtWriteFile = ResolveNtWriteFile();
    if (!pNtWriteFile) return false;

    IO_STATUS_BLOCK ioStatus = {0}; // Stores I/O operation results
    // Call NtWriteFile; parameters must strictly match the prototype
    NTSTATUS status = pNtWriteFile(
        hFile,          // Target file handle
        nullptr,        // No event notification needed; pass null
        nullptr,        // No APC callback needed; pass null
        nullptr,        // APC context; pass null
        &ioStatus,      // I/O status block
        const_cast<void*>(pBuffer), // Data to be written
        nSize,          // Data length
        nullptr,        // No offset specified (use current file pointer)
        nullptr         // No file key needed; pass null
    );
    // NTSTATUS >= 0 indicates successful operation
    return status >= 0;
}

2. Dynamic Resolution of System Service Numbers (SSN): Countering ntdll.dll Hooking

If EDRs hook not only Win32 APIs but also functions in ntdll.dll (e.g., NtWriteFile), the address obtained directly via GetProcAddress may be a tampered function address. In this case, you need to resolve the original SSN from a disk-based copy of ntdll.dll to avoid using the hooked module in memory.

The core steps are as follows:

Load a read-only copy of ntdll.dll from disk: Open C:\Windows\System32\ntdll.dll via CreateFileW, then map it to read-only memory using CreateFileMappingW and MapViewOfFile—no code execution.
Manually parse the PE structure to find the export table: Traverse the PE file’s export table to locate the Relative Virtual Address (RVA) of the target function (e.g., NtWriteFile).
Extract the system service number: Locate the assembly code based on the function’s RVA and extract the SSN (0xXX) from the mov eax, 0xXX instruction.
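One detail of manual PE parsing is easy to get wrong: when a file is mapped as raw data instead of as an image, an RVA from the export table is not a file offset and has to be translated through the section table. A minimal sketch of that translation follows; the struct mirrors the relevant IMAGE_SECTION_HEADER fields, and the names and sample values are hypothetical.

```cpp
#include <cstdint>
#include <vector>

// Minimal view of a PE section header (fields mirror IMAGE_SECTION_HEADER).
struct SectionInfo {
    std::uint32_t virtualAddress;    // RVA where the section starts in memory
    std::uint32_t virtualSize;       // size of the section in memory
    std::uint32_t pointerToRawData;  // offset of the section in the file
    std::uint32_t sizeOfRawData;     // size of the section in the file
};

// Translate an RVA to a file offset. Returns false when no section's raw data
// contains the RVA (for example, it lies in zero-filled padding).
bool RvaToFileOffset(std::uint32_t rva, const std::vector<SectionInfo>& sections,
                     std::uint32_t& offset) {
    for (const SectionInfo& s : sections) {
        if (rva >= s.virtualAddress && rva - s.virtualAddress < s.sizeOfRawData) {
            offset = s.pointerToRawData + (rva - s.virtualAddress);
            return true;
        }
    }
    return false;
}
```

With a section that starts at RVA 0x1000 and file offset 0x400, RVA 0x1234 maps to file offset 0x634. Mapping the file with SEC_IMAGE avoids the translation, because the system lays the sections out at their RVAs.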

The following code implements the logic of resolving the export function address from the disk-based ntdll.dll, preparing for subsequent SSN extraction:

#include <windows.h>
#include <cstdio>
#include <cstring>

// Resolve the export function address from a disk-based DLL: path = DLL path, exportName = function name
void* ResolveExportFromDisk(const wchar_t* lpPath, const char* lpExportName) {
    // Step 1: Open the DLL file on disk
    HANDLE hFile = CreateFileW(
        lpPath,
        GENERIC_READ,        // Read-only permission
        FILE_SHARE_READ,     // Allow other processes to read
        nullptr,
        OPEN_EXISTING,       // Open an existing file
        FILE_ATTRIBUTE_NORMAL,
        nullptr
    );
    if (hFile == INVALID_HANDLE_VALUE) return nullptr;

    // Step 2: Create a file mapping. SEC_IMAGE maps the file laid out as a PE image,
    // so the RVAs used below are valid offsets from the view base (no code is executed)
    HANDLE hMap = CreateFileMappingW(
        hFile,
        nullptr,
        PAGE_READONLY | SEC_IMAGE,
        0, 0,               // Map the entire file
        nullptr
    );
    if (!hMap) {
        CloseHandle(hFile);
        return nullptr;
    }

    // Step 3: Load the mapped view into the current process memory
    BYTE* pBase = (BYTE*)MapViewOfFile(
        hMap,
        FILE_MAP_READ,       // Read-only access
        0, 0, 0             // Map the entire view
    );
    if (!pBase) {
        CloseHandle(hMap);
        CloseHandle(hFile);
        return nullptr;
    }

    // Step 4: Parse the PE structure to find the export table
    auto pDosHeader = (IMAGE_DOS_HEADER*)pBase;
    auto pNtHeader = (IMAGE_NT_HEADERS*)(pBase + pDosHeader->e_lfanew);
    // The export table is at index IMAGE_DIRECTORY_ENTRY_EXPORT in the data directory
    auto pExportDir = (IMAGE_EXPORT_DIRECTORY*)(pBase +
        pNtHeader->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_EXPORT].VirtualAddress);

    // Step 5: Traverse the export table to find the target function
    DWORD* pNames = (DWORD*)(pBase + pExportDir->AddressOfNames);       // Function name list
    WORD* pOrdinals = (WORD*)(pBase + pExportDir->AddressOfNameOrdinals); // Function ordinal list
    DWORD* pFuncs = (DWORD*)(pBase + pExportDir->AddressOfFunctions);     // Function RVA list

    void* pResult = nullptr;
    for (DWORD i = 0; i < pExportDir->NumberOfNames; ++i) {
        const char* pName = (char*)(pBase + pNames[i]);
        if (strcmp(pName, lpExportName) == 0) {
            // Find the function RVA based on the ordinal, then convert to a memory address
            DWORD dwFuncRva = pFuncs[pOrdinals[i]];
            pResult = pBase + dwFuncRva;
            break;
        }
    }

    // Step 6: Release the handles. The view is left mapped on success so that the
    // returned pointer stays valid; it is unmapped only when nothing was found
    if (!pResult) UnmapViewOfFile(pBase);
    CloseHandle(hMap);
    CloseHandle(hFile);
    return pResult;
}

// Test: Obtain the address of NtWriteFile from the disk-based ntdll.dll
void TestResolveFromDisk() {
    void* pNtWriteFile = ResolveExportFromDisk(
        L"C:\\Windows\\System32\\ntdll.dll", 
        "NtWriteFile"
    );
    if (pNtWriteFile) {
        printf("Found NtWriteFile in disk-based ntdll.dll, address: %p\n", pNtWriteFile);
    } else {
        printf("Failed to resolve NtWriteFile\n");
    }
}

3. Remapping ntdll.dll: Using a “Clean” Module Copy

If ntdll.dll is heavily hooked, the above methods may fail. In this case, you can directly load a “clean” copy of ntdll.dll from disk and remap it to the current process memory, completely avoiding the hooked module in memory.

The core steps are as follows:

Load the disk-based ntdll.dll as an image: Use the SEC_IMAGE parameter of CreateFileMappingW to let the system automatically handle PE file section alignment and generate an executable image.
Copy to executable memory: Copy the image to a memory region with the PAGE_EXECUTE_READWRITE attribute to ensure code executability.
Fix relocations (optional): If the new memory address differs from the DLL’s original base address, correct pointers in memory based on the relocation table (can be ignored in simplified scenarios, but must be handled in complex scenarios).
Resolve export functions: Obtain the target function address from the newly mapped “clean” ntdll.dll and call it directly.

The following is the implementation code for remapping ntdll.dll, which can be used to obtain an unhooked NtWriteFile:

#include <windows.h>
#include <cstdio>
#include <cstring>
#include <string>

// Structure to store the base address and size of the remapped module
struct RemappedModule {
    BYTE* pBase;    // Module base address
    SIZE_T nSize;   // Total module size
};

// Remap the disk-based ntdll.dll to the current process memory
RemappedModule RemapNtdll() {
    RemappedModule mod = {nullptr, 0};
    wchar_t szSystemDir[MAX_PATH] = {0};
    // Get the system directory (e.g., C:\Windows\System32)
    GetSystemDirectoryW(szSystemDir, MAX_PATH);
    std::wstring strNtdllPath = std::wstring(szSystemDir) + L"\\ntdll.dll";

    // Step 1: Open the disk-based ntdll.dll
    HANDLE hFile = CreateFileW(strNtdllPath.c_str(),
        GENERIC_READ,
        FILE_SHARE_READ,
        nullptr,
        OPEN_EXISTING,
        FILE_ATTRIBUTE_NORMAL,
        nullptr
    );
    if (hFile == INVALID_HANDLE_VALUE) return mod;

    // Step 2: Create a file mapping and specify SEC_IMAGE to let the system process the PE structure
    HANDLE hMap = CreateFileMappingW(
        hFile,
        nullptr,
        PAGE_READONLY | SEC_IMAGE,  // SEC_IMAGE: Map in PE image format
        0, 0,
        nullptr
    );
    if (!hMap) {
        CloseHandle(hFile);
        return mod;
    }

    // Step 3: Map the image view and obtain the PE image base address
    BYTE* pImageBase = (BYTE*)MapViewOfFile(
        hMap,
        FILE_MAP_READ,
        0, 0, 0
    );
    if (!pImageBase) {
        CloseHandle(hMap);
        CloseHandle(hFile);
        return mod;
    }

    // Step 4: Get the image size and allocate executable memory
    auto pDosHeader = (IMAGE_DOS_HEADER*)pImageBase;
    auto pNtHeader = (IMAGE_NT_HEADERS*)(pImageBase + pDosHeader->e_lfanew);
    SIZE_T nImageSize = pNtHeader->OptionalHeader.SizeOfImage;

    BYTE* pNewBase = (BYTE*)VirtualAlloc(
        nullptr,
        nImageSize,
        MEM_COMMIT | MEM_RESERVE,
        PAGE_EXECUTE_READWRITE  // Executable + read/write for copying the image
    );
    if (!pNewBase) {
        UnmapViewOfFile(pImageBase);
        CloseHandle(hMap);
        CloseHandle(hFile);
        return mod;
    }

    // Step 5: Copy the image to the new memory to complete remapping
    memcpy(pNewBase, pImageBase, nImageSize);
    mod.pBase = pNewBase;
    mod.nSize = nImageSize;

    // Step 6: Release temporary resources
    UnmapViewOfFile(pImageBase);
    CloseHandle(hMap);
    CloseHandle(hFile);
    return mod;
}

// Resolve export functions from the remapped module (template function supporting any function pointer type)
template <typename T>
T ResolveFromRemap(const RemappedModule& mod, const char* lpExportName) {
    if (!mod.pBase) return nullptr;

    // Parse the PE export table; logic is consistent with ResolveExportFromDisk
    auto pDosHeader = (IMAGE_DOS_HEADER*)mod.pBase;
    auto pNtHeader = (IMAGE_NT_HEADERS*)(mod.pBase + pDosHeader->e_lfanew);
    auto pExportDir = (IMAGE_EXPORT_DIRECTORY*)(mod.pBase +
        pNtHeader->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_EXPORT].VirtualAddress);

    DWORD* pNames = (DWORD*)(mod.pBase + pExportDir->AddressOfNames);
    WORD* pOrdinals = (WORD*)(mod.pBase + pExportDir->AddressOfNameOrdinals);
    DWORD* pFuncs = (DWORD*)(mod.pBase + pExportDir->AddressOfFunctions);

    for (DWORD i = 0; i < pExportDir->NumberOfNames; ++i) {
        const char* pName = (char*)(mod.pBase + pNames[i]);
        if (strcmp(pName, lpExportName) == 0) {
            DWORD dwFuncRva = pFuncs[pOrdinals[i]];
            return reinterpret_cast<T>(mod.pBase + dwFuncRva);
        }
    }
    return nullptr;
}

// Test: Remap ntdll.dll and obtain NtWriteFile
// (NtWriteFile_t is the function pointer type defined in the Direct Syscalls example)
void TestRemapNtdll() {
    RemappedModule mod = RemapNtdll();
    if (!mod.pBase) {
        printf("Failed to remap ntdll.dll\n");
        return;
    }

    // Obtain NtWriteFile from the remapped module
    auto pNtWriteFile = ResolveFromRemap<NtWriteFile_t>(mod, "NtWriteFile");
    if (pNtWriteFile) {
        printf("Found NtWriteFile in remapped ntdll.dll, address: %p\n", pNtWriteFile);
    } else {
        printf("Failed to resolve NtWriteFile from the remapped module\n");
    }

    // Release memory when no longer in use
    VirtualFree(mod.pBase, 0, MEM_RELEASE);
}

7. Conclusion

This article focuses on user-mode Function Hooking, covering hook classification, manual Inline Hook implementation, Detours library usage, EDR DLL injection methods, and detection/evasion techniques from both offensive and defensive perspectives. Through specific C++ code examples, readers can set up experimental environments to verify each step of the logic and deepen their understanding of Windows underlying mechanisms.

For defenders (e.g., EDR developers), priority should be placed on the stealth of hooks and the comprehensiveness of detection. For example, combining kernel-mode monitoring with hardware-assisted virtualization (such as Intel VT-x/AMD-V) can achieve more reliable control flow protection. For offensive security researchers, mastering low-level technologies like system calls, PE parsing, and module remapping is essential to explore the boundaries of countermeasures in authorized scenarios.

Future research directions can focus on three areas:

Kernel-Mode Hook Technologies: Delving into kernel function hooking in ntoskrnl.exe, such as SSDT (System Service Descriptor Table) hooking and IDT (Interrupt Descriptor Table) hooking, to enable lower-level monitoring and countermeasures.
Hardware-Level Monitoring: Leveraging Trusted Execution Environments (TEE) like Intel SGX and AMD SEV, or utilizing the processor’s Performance Monitoring Unit (PMU), to implement control flow detection that cannot be bypassed by software.
Automated Analysis Tools: Developing automated tools for hook detection and anti-hooking. Through static analysis (PE file parsing) and dynamic instrumentation (memory snapshot comparison), these tools can quickly identify hook behaviors and generate countermeasure solutions.

Function hooking technology is the “cornerstone” of Windows security offense and defense. Only by deeply understanding its principles and boundaries can one gain the initiative in offense-defense confrontation. Whether building more secure protection systems or conducting compliant vulnerability research, mastery of this technology is indispensable.
