Possibility of APC injection and KernelCallbackTable injection

There are many process injection methods on Windows.
Two of the more general ones are APC injection and KernelCallbackTable injection. This article implements both, drawing on reference code from GitHub and a few optimized projects, in order to better understand the two injection methods and how practical they are.

## APC Code Injection

First, let's go over some details of APC injection and its implementation, such as enumerating threads and the difference between user-mode and kernel-mode APCs. This makes the implementation principle, and later improvements to APC injection, easier to understand.
In kernel mode, Windows uses APCs to complete asynchronously initiated I/O operations, thread suspension, and other behaviors.
APC stands for Asynchronous Procedure Call.
An APC allows user programs and system components to execute code in the context of a specific thread, so the code runs in the address space of a specific process. Two DLLs are mainly involved in APC injection: Kernel32.dll and Ntdll.dll.
The relevant functions mainly include the following:

CreateToolhelp32Snapshot, Process32First, Process32Next, Thread32First, Thread32Next, OpenProcess, OpenThread, DuplicateHandle, GetCurrentProcess, WriteProcessMemory, VirtualProtectEx, QueueUserAPC, ResumeThread, NtAllocateVirtualMemory

The functions above are all used in the APC injection process. Some of them are uncommon in other injection methods, so it is worth understanding what they do. For example, QueueUserAPC adds a user-mode asynchronous procedure call (APC) object to the APC queue of the specified thread, and ResumeThread decrements the thread's suspend count. When the suspend count reaches zero, the thread resumes execution.

Each thread has a queue that stores its APCs. Threads execute code within a process, and a thread's APC queue allows code to be executed asynchronously in that thread.

Interlude

APCs are divided into two types: user-mode APCs and kernel-mode APCs.
A user-mode APC is executed in user space in the process context of the target thread, which requires the target thread to be in an alertable wait state. A kernel-mode APC is executed in kernel space, and is further divided into normal APCs and special APCs.
Both kernel and user APCs have three functions:
● KernelRoutine: this function is executed in kernel space (IRQL = PASSIVE_LEVEL for a normal kernel APC or a user APC, IRQL = APC_LEVEL for a special kernel APC). When enumerating APCs, a thread can be created for each of the other CPUs on the system to hold them, with each thread raising the IRQL to DISPATCH_LEVEL, and the IRQL on the current processor is then raised to DISPATCH_LEVEL as well, so that the enumeration is not interrupted by the Windows kernel or any other driver. Since APCs are dispatched at APC_LEVEL or PASSIVE_LEVEL, the APC queues do not change during enumeration.
● RundownRoutine: if the thread terminates before the APC is delivered, this function is called in kernel space.
● NormalRoutine: for a kernel-mode APC this function is called in kernel space, and for a user-mode APC it is called in user space.
Each thread has two members of type _KAPC_STATE in the _KTHREAD data structure, named ApcState and SavedApcState:
● ApcState: the APC state currently in use, whether the thread is attached to its own process or to another process
● SavedApcState: stores the APCs that belong to a process context other than the current one and must wait (for example, APCs queued for the thread's own process while the thread is attached to another process)
The _KAPC_STATE structure has a member called ApcListHead, an array of two LIST_ENTRY structures that serve as the list heads for kernel APCs and user APCs and are used to queue APCs for the thread.
_KAPC_STATE can be inspected with WinDbg kernel debugging:

0: kd> dt nt!_KTHREAD
    +0x000 Header : _DISPATCHER_HEADER
    +0x018 SListFaultAddress : Ptr64 Void
    ...
    +0x098 ApcState : _KAPC_STATE
    +0x098 ApcStateFill : [43] UChar
    +0x0c3 Priority : Char
    +0x0c4 UserIdealProcessor : Uint4B
    +0x0c8 WaitStatus : Int8B
    +0x0d0 WaitBlockList : Ptr64 _KWAIT_BLOCK
    +0x0d8 WaitListEntry : _LIST_ENTRY
    ...
    +0x258 SavedApcState : _KAPC_STATE
    +0x258 SavedApcStateFill : [43] UChar
    +0x283 WaitReason : UChar
    +0x284 SuspendCount : Char
    +0x285 Saturation : Char

0: kd> dt nt!_KAPC_STATE
    +0x000 ApcListHead : [2] _LIST_ENTRY    <- queue heads for kernel-mode and user-mode APCs
    +0x020 Process : Ptr64 _KPROCESS
    +0x028 InProgressFlags : UChar
    +0x028 KernelApcInProgress : Pos 0, 1 Bit
    +0x028 SpecialApcInProgress : Pos 1, 1 Bit
    +0x029 KernelApcPending : UChar
    +0x02a UserApcPendingAll : UChar
    +0x02a SpecialUserApcPending : Pos 0, 1 Bit
    +0x02a UserApcPending : Pos 1, 1 Bit

(Applications can queue APCs to a given thread, depending on their privileges.)
Enumeration: Enumerate all thread IDs in the process
We now know that the APC queue belongs to a thread within the process. From kernel mode, we would therefore get the process's thread list from the _KPROCESS structure, walk to each thread's _KTHREAD structure, read the _KAPC_STATE structure from it, and then parse the kernel-mode or user-mode APCs. The problem is that the offsets change between Windows versions, and a wrong offset may cause a BSOD. This method therefore requires the offset values for each Windows version.

We can also enumerate thread IDs from a user-mode process.
For that we need to obtain all thread IDs in the target process.
There are two methods:
● ZwQuerySystemInformation with SystemProcessInformation as the SystemInformationClass parameter
● CreateToolhelp32Snapshot -> Thread32First -> Thread32Next
Code for enumerating thread IDs (the CreateToolhelp32Snapshot method is shown here):

hThreadSnap = CreateToolhelp32Snapshot(TH32CS_SNAPTHREAD, 0);
    if (hThreadSnap == INVALID_HANDLE_VALUE)
        return(FALSE);
    // dwSize must be set before calling Thread32First
    te32.dwSize = sizeof(THREADENTRY32);

    if (!Thread32First(hThreadSnap, &te32)) {
        // Error calling Thread32First
        CloseHandle(hThreadSnap);
        return(FALSE);
    }
    do
    {
        // The snapshot covers every thread on the system, so filter by owner PID
        if (te32.th32OwnerProcessID == dwOwnerPID)
        {
            printf("THREAD ID = 0x%08X\n", te32.th32ThreadID);
            printf("base priority = %d\n", te32.tpBasePri);
            printf("delta priority = %d\n", te32.tpDeltaPri);

            Threadarray[counter] = te32.th32ThreadID;
            counter++;
        }
    } while (Thread32Next(hThreadSnap, &te32));
    // Release the snapshot handle once enumeration is finished
    CloseHandle(hThreadSnap);

The interlude above covered some background on user-mode and kernel-mode APCs and explained the structures involved in enumerating threads. Now let's look at how APC injection is carried out.
Steps of APC injection (the process required for a standard APC injection):

1. Identify and find the process (PID) you want to inject into
2. Allocate memory in the address space of that process
3. Write the prepared shellcode into the allocated memory
4. Find and traverse all the threads in the process (the enumeration introduced in the interlude above is what is used here)
5. Queue the APC function to all threads
6. The APC function points to the shellcode that was written (when a thread resumes, it executes the shellcode)

Don't confuse processes with threads here.
When a thread in the process is scheduled and processes its APC queue, the APC function we queued is called as well, and the shellcode is executed at that point.
This method has a flaw: the injecting program cannot force the victim thread to execute the injected code, because a user-mode APC only runs once the thread enters an alertable wait state.
This flaw can be worked around with a variant called Early Bird APC Queue Code Injection.
Early Bird APC Queue Code Injection differs from traditional APC code injection in that it takes place during process initialization.
That is, a new legitimate process is created in the suspended state:

BOOL creationResult;
    creationResult = CreateProcess(
        NULL, // No module name (use command line)
        cmdLine, // Command line
        NULL, // Process handle not inheritable
        NULL, // Thread handle not inheritable
        FALSE, // Set handle inheritance to FALSE
        CREATE_SUSPENDED | NORMAL_PRIORITY_CLASS | CREATE_NEW_CONSOLE | CREATE_NEW_PROCESS_GROUP, // Creation flags; CREATE_SUSPENDED keeps the main thread suspended
        NULL, // Use parent's environment block
        NULL, // Use parent's starting directory
        &startupInfo, // Pointer to STARTUPINFO structure
        &processInformation); // Pointer to PROCESS_INFORMATION structure

In this way, the thread is still in the suspended state when we perform the APC injection.
Because APC injection involves modifying a memory region intended for data, the protection attribute of that region has to be changed, so here is a brief explanation of what the protection attributes mean.
The memory page protection attributes are PAGE_NOACCESS, PAGE_READONLY, PAGE_READWRITE, PAGE_EXECUTE, PAGE_EXECUTE_READ, PAGE_EXECUTE_READWRITE, PAGE_WRITECOPY, and PAGE_EXECUTE_WRITECOPY.
Some malware writes code into memory regions used for data (such as thread stacks) and in this way gets the application to execute malicious code. The Windows Data Execution Prevention (DEP) feature protects against such attacks. If DEP is enabled, the operating system applies the PAGE_EXECUTE_* protection attributes only to memory regions that actually need to execute code. Other protection attributes (most commonly PAGE_READWRITE) are used for regions that should only hold data.
Final core code implementation

CreateProcessA(NULL, (LPSTR)targetexe, NULL, NULL, FALSE, CREATE_SUSPENDED, NULL, NULL, startInfo, procInfo) // Create the target process in the suspended state
VirtualAllocEx(procInfo->hProcess, NULL, payloadSize, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE) // Allocate memory in the remote process with the PAGE_READWRITE protection attribute
VirtualProtectEx(procInfo->hProcess, baseAddress, payloadSize, PAGE_EXECUTE_READ, &oldProtect) // Change the protection of the allocated memory from PAGE_READWRITE to PAGE_EXECUTE_READ
Set up the APC routine
QueueUserAPC((PAPCFUNC)tRoutine, procInfo->hThread, 0) // Put the payload into the APC queue
ResumeThread(procInfo->hThread) // Resume the thread
In this way, the process that was created in the suspended state starts running and triggers the APC function.

Injection completed successfully
![APC injection result](https://img-blog.csdnimg.cn/bc9c0c6cbec740099097bfc20d59faae.png)

![](https://img-blog.csdnimg.cn/e4d6221ff3f64c5f9eb39e601083370a.png)

## KernelCallbackTable injection

KernelCallbackTable injection can be used to inject shellcode into a remote process. The KernelCallbackTable can be found in the PEB and is used by KeUserModeCallback: calling KeUserModeCallback in kernel mode executes the corresponding function from the KernelCallbackTable in user mode. Many privilege escalation vulnerabilities, such as CVE-2018-8453, also involve hooking the KernelCallbackTable, so it is a relatively important concept in Windows.
The general injection process is: use VirtualAllocEx and WriteProcessMemory to write data, use NtQueryInformationProcess to obtain the PEB address of the target process, read the PEB to find the location of the kernel callback table, write a new kernel callback table in which the address of fnCOPYDATA is changed to the shellcode entry, and trigger it by sending a WM_COPYDATA message to a window owned by the target process.
The relevant functions mainly include the following:

```c
CreateProcess, WaitForInputIdle, FindWindow, GetWindowThreadProcessId, ReadProcessMemory, VirtualAllocEx, WriteProcessMemory, SendMessage, NtQueryInformationProcess
```

KernelCallbackTable injection is used to run shellcode after it has been injected, sometimes in other processes, basically by way of KeUserModeCallback and the __fnCOPYDATA entry in the KERNELCALLBACKTABLE structure.

The layout of the KernelCallbackTable is central to this technique. The __fnCOPYDATA entry mentioned above is the first member of the following structure:

typedef struct _KERNELCALLBACKTABLE_T {
    ULONG_PTR __fnCOPYDATA;
    ULONG_PTR __fnCOPYGLOBALDATA;
    ULONG_PTR __fnDWORD;
    ULONG_PTR __fnNCDESTROY;
    ULONG_PTR __fnDWORDOPTINLPMSG;
    ULONG_PTR __fnINOUTDRAG;
    ULONG_PTR __fnGETTEXTLENGTHS;
    ULONG_PTR __fnINCNTOUTSTRING;
    ULONG_PTR __fnPOUTLPINT;
    ULONG_PTR __fnINLPCOMPAREITEMSTRUCT;
    ULONG_PTR __fnINLPCREATESTRUCT;
    ULONG_PTR __fnINLPDELETEITEMSTRUCT;
    ULONG_PTR __fnINLPDRAWITEMSTRUCT;
    ULONG_PTR __fnPOPTINLPUINT;
    ULONG_PTR __fnPOPTINLPUINT2;
    ULONG_PTR __fnINLPMDICREATESTRUCT;
    ULONG_PTR __fnINOUTLPMEASUREITEMSTRUCT;
    ULONG_PTR __fnINLPWINDOWPOS;
    ULONG_PTR __fnINOUTLPPOINT5;
    ULONG_PTR __fnINOUTLPSCROLLINFO;
    ULONG_PTR __fnINOUTLPRECT;
    ULONG_PTR __fnINOUTNCCALCSIZE;
    ULONG_PTR __fnINOUTLPPOINT5_;
    ULONG_PTR __fnINPAINTCLIPBRD;
    ULONG_PTR __fnINSIZECLIPBRD;
    ULONG_PTR __fnINDESTROYCLIPBRD;
    ULONG_PTR __fnINSTRING;
    ULONG_PTR __fnINSTRINGNULL;
    ULONG_PTR __fnINDEVICECHANGE;
    ULONG_PTR __fnPOWERBROADCAST;
    ULONG_PTR __fnINLPUAHDRAWMENU;
    ULONG_PTR __fnOPTOUTLPDWORDOPTOUTLPDWORD;
    ULONG_PTR __fnOPTOUTLPDWORDOPTOUTLPDWORD_;
    ULONG_PTR __fnOUTDWORDINDWORD;
    ULONG_PTR __fnOUTLPRECT;
    ULONG_PTR __fnOUTSTRING;
    ULONG_PTR __fnPOPTINLPUINT3;
    ULONG_PTR __fnPOUTLPINT2;
    ULONG_PTR __fnSENTDDEMSG;
    ULONG_PTR __fnINOUTSTYLECHANGE;
    ULONG_PTR __fnHkINDWORD;
    ULONG_PTR __fnHkINLPBBTACTIVATESTRUCT;
    ULONG_PTR __fnHkINLPBTCREATESTRUCT;
    ULONG_PTR __fnHkINLPDEBUGHOOKSTRUCT;
    ULONG_PTR __fnHkINLPMOUSEHOOKSTRUCTEX;
    ULONG_PTR __fnHkINLPKBDLLHOOKSTRUCT;
    ULONG_PTR __fnHkINLPMSLLHOOKSTRUCT;
    ULONG_PTR __fnHkINLPMSG;
    ULONG_PTR __fnHkINLPRECT;
    ULONG_PTR __fnHkOPTINLPEVENTMSG;
    ULONG_PTR __xxxClientCallDelegateThread;
    ULONG_PTR __ClientCallDummyCallback;
    ULONG_PTR __fnKEYBOARDCORRECTIONCALLOUT;
    ULONG_PTR __fnOUTLPCOMBOBOXINFO;
    ULONG_PTR __fnINLPCOMPAREITEMSTRUCT2;
    ULONG_PTR __xxxClientCallDevCallbackCapture;
    ULONG_PTR __xxxClientCallDitThread;
    ULONG_PTR __xxxClientEnableMMCSS;
    ULONG_PTR __xxxClientUpdateDpi;
    ULONG_PTR __xxxClientExpandStringW;
    ULONG_PTR __ClientCopyDDEIn1;
    ULONG_PTR __ClientCopyDDEIn2;
    ULONG_PTR __ClientCopyDDEOut1;
    ULONG_PTR __ClientCopyDDEOut2;
    ULONG_PTR __ClientCopyImage;
    ULONG_PTR __ClientEventCallback;
    ULONG_PTR __ClientFindMnemChar;
    ULONG_PTR __ClientFreeDDEHandle;
    ULONG_PTR __ClientFreeLibrary;
    ULONG_PTR __ClientGetCharsetInfo;
    ULONG_PTR __ClientGetDDEFlags;
    ULONG_PTR __ClientGetDDEHookData;
    ULONG_PTR __ClientGetListboxString;
    ULONG_PTR __ClientGetMessageMPH;
    ULONG_PTR __ClientLoadImage;
    ULONG_PTR __ClientLoadLibrary;
    ULONG_PTR __ClientLoadMenu;
    ULONG_PTR __ClientLoadLocalT1Fonts;
    ULONG_PTR __ClientPSMTextOut;
    ULONG_PTR __ClientLpkDrawTextEx;
    ULONG_PTR __ClientExtTextOutW;
    ULONG_PTR __ClientGetTextExtentPointW;
    ULONG_PTR __ClientCharToWchar;
    ULONG_PTR __ClientAddFontResourceW;
    ULONG_PTR __ClientThreadSetup;
    ULONG_PTR __ClientDeliverUserApc;
    ULONG_PTR __ClientNoMemoryPopup;
    ULONG_PTR __ClientMonitorEnumProc;
    ULONG_PTR __ClientCallWinEventProc;
    ULONG_PTR __ClientWaitMessageExMPH;
    ULONG_PTR __ClientWOWGetProcModule;
    ULONG_PTR __ClientWOWTask16SchedNotify;
    ULONG_PTR __ClientImmLoadLayout;
    ULONG_PTR __ClientImmProcessKey;
    ULONG_PTR __fnIMECONTROL;
    ULONG_PTR __fnINWPARAMDBCSCHAR;
    ULONG_PTR __fnGETTEXTLENGTHS2;
    ULONG_PTR __fnINLPKDRAWSWITCHWND;
    ULONG_PTR __ClientLoadStringW;
    ULONG_PTR __ClientLoadOLE;
    ULONG_PTR __ClientRegisterDragDrop;
    ULONG_PTR __ClientRevokeDragDrop;
    ULONG_PTR __fnINOUTMENUGETOBJECT;
    ULONG_PTR __ClientPrinterThunk;
    ULONG_PTR __fnOUTLPCOMBOBOXINFO2;
    ULONG_PTR __fnOUTLPSCROLLBARINFO;
    ULONG_PTR __fnINLPUAHDRAWMENU2;
    ULONG_PTR __fnINLPUAHDRAWMENUITEM;
    ULONG_PTR __fnINLPUAHDRAWMENU3;
    ULONG_PTR __fnINOUTLPUAHMEASUREMENUITEM;
    ULONG_PTR __fnINLPUAHDRAWMENU4;
    ULONG_PTR __fnOUTLPTITLEBARINFOEX;
    ULONG_PTR __fnTOUCH;
    ULONG_PTR __fnGESTURE;
    ULONG_PTR __fnPOPTINLPUINT4;
    ULONG_PTR __fnPOPTINLPUINT5;
    ULONG_PTR __xxxClientCallDefaultInputHandler;
    ULONG_PTR __fnEMPTY;
    ULONG_PTR __ClientRimDevCallback;
    ULONG_PTR __xxxClientCallMinTouchHitTestingCallback;
    ULONG_PTR __ClientCallLocalMouseHooks;
    ULONG_PTR __xxxClientBroadcastThemeChange;
    ULONG_PTR __xxxClientCallDevCallbackSimple;
    ULONG_PTR __xxxClientAllocWindowClassExtraBytes;
    ULONG_PTR __xxxClientFreeWindowClassExtraBytes;
    ULONG_PTR __fnGETWINDOWDATA;
    ULONG_PTR __fnINOUTSTYLECHANGE2;
    ULONG_PTR __fnHkINLPMOUSEHOOKSTRUCTEX2;
} KERNELCALLBACKTABLE;

Steps:
● Generate and store the payload
● Retrieve the handle of the target window: FindWindow returns the handle of the top-level window whose class name and window name match the given strings (it does not search child windows) // FindWindow(L"Shell_TrayWnd", NULL);
● Retrieve the identifier of the thread that created the specified window and, optionally, the identifier of the process that created it; the return value is the thread identifier // GetWindowThreadProcessId(hWindow, &pid)
● Read the addresses of the PEB and the KernelCallbackTable
● Write the new table to the remote process
● Update the PEB and trigger the payload
● Restore the original KernelCallbackTable
● Free the memory
● Close the handles
After completing the steps above and running the code, the operation turns out not to have worked: the calculator does not pop up.

The target process of the PoC above is explorer.exe, and the attempt fails.
We can then try another target process, such as Notepad.exe.

Simply switching the process is enough for it to work, so Notepad is a good choice of target when testing.
Why does the result differ between explorer.exe and other processes?
The KernelCallbackTable found in the PEB is only used by GUI processes; it is initialized when user32.dll is loaded into the memory of the process.
The problem in the code: explorer.exe crashes immediately when the PEB of the target process is updated. explorer.exe restarts after the crash, which invalidates the window handle obtained earlier and finally causes the SendMessage call to fail. So when injecting into explorer.exe, the shell disappears for a moment and then comes back. Because of the restart after the crash, the injected code is never cleaned up.
Why does this problem occur?
Because we first have to enumerate the window classes available on the system (this can be done with the EnumWindows() function).
This causes the target process to crash, and the crash is visible to the user.
How can this be solved?
One option is not to target explorer.exe and to load user32.dll into memory instead. However, if user32.dll is loaded into the current process, the payload is executed locally and cannot be injected into another process (that is, remote process injection is not possible).
Since a process crash cannot be avoided, we can instead create a process that is invisible to the user, so that a crash has no visible effect.

CreateProcess(L"C:\\Windows\\System32\\notepad.exe", NULL, NULL, NULL, FALSE, CREATE_SUSPENDED, NULL, NULL, &si, &pi);
Here the process creation flags (dwCreationFlags) are set to CREATE_SUSPENDED so that the process starts "hidden".

This line of code is the same as the one used for APC injection above, so creating the process in the suspended state is a good way to keep it hidden when implementing process injection.
However, a process created this way is suspended and has no window, so there is no window handle to retrieve, and the subsequent injection and payload execution cannot be completed. APC injection did not have this problem because it does not need a window handle: it only needs the process PID and the enumerated threads.
We therefore need another way to get a window handle.
We can use the STARTUPINFOA structure, which specifies the window station, desktop, standard handles, and appearance of the main window of a process at creation time.

typedef struct _STARTUPINFOA {
  DWORD cb;
  LPSTR lpReserved;
  LPSTR lpDesktop;
  LPSTR lpTitle;
  DWORD dwX;
  DWORD dwY;
  DWORD dwXSize;
  DWORD dwYSize;
  DWORD dwXCountChars;
  DWORD dwYCountChars;
  DWORD dwFillAttribute;
  DWORD dwFlags;
  WORD wShowWindow;
  WORD cbReserved2;
  LPBYTE lpReserved2;
  HANDLE hStdInput;
  HANDLE hStdOutput;
  HANDLE hStdError;
} STARTUPINFOA, *LPSTARTUPINFOA;

(STARTF_USESHOWWINDOW: the wShowWindow member contains additional information.)
Set the dwFlags and wShowWindow members:
First set dwFlags to STARTF_USESHOWWINDOW so that the value of wShowWindow is used.

Then set wShowWindow to SW_HIDE, which determines the visibility of the window.
Then change CREATE_SUSPENDED to CREATE_NEW_CONSOLE.

This way the process is invisible to the user and still has a window. However, running the code still does not return a handle.
The reason is that the newly created process has not yet had time to initialize its input: we created it with a hidden window and a new console, and the following code ran too early. So we need to wait for process initialization to complete before executing the subsequent code.
WaitForInputIdle(Process, 1000) handles this neatly. The function waits until the process has finished initializing (or the timeout expires) and then lets execution continue.

When executed this way, the injection completes without anything appearing in the UI.
Note that the small details used in the code above, and the problems overcome along the way, were all handled through other parameters of CreateProcess. It is worth reading more about what the various flags and parameters can do, since there may be better ways to implement this.