IMCAFS

safe customer, safe information platform

Posted by barello at 2020-04-02

1、 Foreword

Debug Blocker is a troublesome anti-debugging technique that is often used in PE protectors and frequently appears in CTF reverse-engineering challenges. This article explains the principle of the technique using a Debug Blocker sample program, and then takes a CTF reverse-engineering challenge as an example and solves it with two different methods.

The sample files for this article can be downloaded here: https://github.com/b1ngoo/crack

2、 Principle

(1) Definition and characteristics

Debug Blocker is a technique in which a process runs itself, or another executable, in debug mode. In a typical program that uses it, the parent process acts as the debugger and creates a child process in debug mode by calling the CreateProcess function. The child process is then the debuggee, and the debugger and the debuggee execute different code. See the following sample program (it comes from the source code that accompanies the book Core Principles of Reverse Engineering):

In the above example, the parent process prints "Parent Process" to the console and then creates a child process, which calls the MessageBox() API to pop up a window.

On Windows, a process cannot be debugged by more than one debugger at a time. If the key algorithm runs in the debugged child process, the parent process already occupies the debugger role, so the parent-child arrangement naturally acts as an anti-debugging measure and makes dynamic analysis difficult.

It is worth noting that, in a debugger-debuggee relationship, all exceptions that occur in the debuggee are handled by the debugger. If the debugged child process deliberately triggers an exception, the exception is delivered to the debugger, and the child process cannot continue until the exception has been handled. In addition, if the debugger process is terminated, the debuggee is terminated as well. These properties are what make Debug Blocker an effective anti-debugging technique.

Next, we walk through the source code to see exactly how the technique works.

(2) Code analysis

The source code of the sample program comes from chapter 57 of Core Principles of Reverse Engineering.

First, take a look at the code as a whole:

    void DoParentProcess();   // parent process routine
    void DoChildProcess();    // child process routine

    void _tmain(int argc, TCHAR *argv[])
    {
        HANDLE hMutex = NULL;

        if( !(hMutex = CreateMutex(NULL, FALSE, DEF_MUTEX_NAME)) )
        {
            printf("CreateMutex() failed! [%d]\n", GetLastError());   // failed to create the mutex
            return;
        }

        // check the mutex to decide whether this is the parent or the child
        if( ERROR_ALREADY_EXISTS != GetLastError() )
            DoParentProcess();
        else
            DoChildProcess();
    }

The overall structure is very simple (the program is only an example): a main function, the parent process function DoParentProcess(), and the child process function DoChildProcess(). The main function decides whether the current process is the parent or the child by creating a named mutex. When the program first runs as the parent process, it creates the DEF_MUTEX_NAME mutex and GetLastError() returns 0.

ERROR_ALREADY_EXISTS is defined in the winerror.h header file as 183L.

In the child process, the parent has already created the mutex object, so GetLastError() reports 183 and execution enters the child branch.
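The role check can be modelled without touching the Windows API. The sketch below is a toy simulation, not the sample's code: the `NamedObjects` class and the `role` helper are hypothetical stand-ins for the kernel's named-object namespace and for the branch in the main function.

```python
ERROR_SUCCESS = 0
ERROR_ALREADY_EXISTS = 183  # value from winerror.h


class NamedObjects:
    """Toy stand-in for the kernel's named-object namespace (hypothetical)."""

    def __init__(self):
        self._names = set()

    def create_mutex(self, name):
        """Return the last-error code that CreateMutex would leave behind."""
        if name in self._names:
            return ERROR_ALREADY_EXISTS
        self._names.add(name)
        return ERROR_SUCCESS


def role(last_error):
    """Mirror the branch in the main function: an existing mutex means child."""
    return "child" if last_error == ERROR_ALREADY_EXISTS else "parent"


namespace = NamedObjects()
first = role(namespace.create_mutex("DEF_MUTEX_NAME"))   # first instance
second = role(namespace.create_mutex("DEF_MUTEX_NAME"))  # relaunched instance
print(first, second)
```

The same executable therefore takes a different path on its second launch purely because of state left behind by the first one.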

Next, execution enters DoParentProcess(). The first step is to create the process to be debugged through the CreateProcess() API:

    // create the process to be debugged
    GetModuleFileName(GetModuleHandle(NULL), szPath, MAX_PATH);

    if( !CreateProcess(
            NULL, szPath, NULL, NULL, FALSE,
            DEBUG_PROCESS | DEBUG_ONLY_THIS_PROCESS,   // debug creation flags
            NULL, NULL, &si, &pi) )
    {
        printf("CreateProcess() failed! [%d]\n", GetLastError());
        return;
    }

    printf("Parent Process\n");

If the process is created successfully, the parent prints the string and enters an infinite loop. At the start of each iteration, it waits for a debug event from the child process with the WaitForDebugEvent() API. At this point we can determine the type of the debug event by watching how dwDebugEventCode changes.
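As a reading aid while watching dwDebugEventCode, the event codes defined in the Windows SDK headers can be kept in a small lookup table. The helper name `describe_event` is made up for illustration.

```python
# Debug event codes as defined in the Windows SDK headers.
DEBUG_EVENT_NAMES = {
    1: "EXCEPTION_DEBUG_EVENT",
    2: "CREATE_THREAD_DEBUG_EVENT",
    3: "CREATE_PROCESS_DEBUG_EVENT",
    4: "EXIT_THREAD_DEBUG_EVENT",
    5: "EXIT_PROCESS_DEBUG_EVENT",
    6: "LOAD_DLL_DEBUG_EVENT",
    7: "UNLOAD_DLL_DEBUG_EVENT",
    8: "OUTPUT_DEBUG_STRING_EVENT",
    9: "RIP_EVENT",
}


def describe_event(code):
    """Hypothetical helper: map a dwDebugEventCode value to its SDK name."""
    return DEBUG_EVENT_NAMES.get(code, "UNKNOWN(%d)" % code)


print(describe_event(1))
```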

On the first iteration dwDebugEventCode = 3, which identifies a CREATE_PROCESS_DEBUG_EVENT:

This is the initial event raised when the debugged process starts running. Now let's look at the child process function:

    void DoChildProcess()
    {
        // to be modified in OllyDbg after building
        __asm
        {
            nop
            nop
        }

        MessageBox(NULL, L"ChildProcess", L"TEST", MB_OK);
    }

In fact, to achieve the anti-debugging effect, the child process function has to be modified in OllyDbg after the executable has been compiled and linked. Three places need to be modified:

While the child process runs, the only debug event we care about is EXCEPTION_DEBUG_EVENT with the exception type EXCEPTION_ILLEGAL_INSTRUCTION. This exception is caused by the illegal instruction LEA EAX, EAX that we deliberately wrote at the beginning of the child process function. The instruction faults because its format is invalid: the second operand of LEA must be a memory operand.
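To see why the CPU rejects this instruction, it helps to split its ModRM byte into fields. The sketch below assumes the standard two-byte encoding 8D C0 for LEA EAX, EAX (the article does not list the bytes, although the handler shown later skips exactly two bytes); the function name is illustrative.

```python
def split_modrm(byte):
    """Split an x86 ModRM byte into its (mod, reg, rm) fields."""
    return (byte >> 6) & 0b11, (byte >> 3) & 0b111, byte & 0b111


# Assumed encoding of LEA EAX, EAX: opcode 0x8D followed by ModRM 0xC0.
opcode, modrm = 0x8D, 0xC0
mod, reg, rm = split_modrm(modrm)

# mod == 0b11 selects a register operand. LEA requires a memory operand,
# so this form is rejected by the CPU as an invalid instruction.
is_illegal_lea = (opcode == 0x8D and mod == 0b11)
print(mod, reg, rm, is_illegal_lea)
```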

In the above picture, the first LEA EAX, EAX is located at address 0x0040103F. This address is hard-coded in the parent's handler, so the program must not support base address relocation. You can check the base relocation directory with the FFI tool:

When the illegal instruction exception occurs in the child process at 0x0040103F, it is delivered to the parent process for handling. The handling code is as follows:

    if( dwExcpAddr == EXCP_ADDR_1 )   // first illegal instruction exception
    {
        // decoding
        ReadProcessMemory(            // read the child process memory
            pi.hProcess, (LPCVOID)(dwExcpAddr + 2),
            pBuf, DECODING_SIZE, NULL);

        for(DWORD i = 0; i < DECODING_SIZE; i++)   // XOR decode
            pBuf[i] ^= DECODING_KEY;

        WriteProcessMemory(           // write back to the child process memory
            pi.hProcess, (LPVOID)(dwExcpAddr + 2),
            pBuf, DECODING_SIZE, NULL);

        // adjust EIP
        ctx.ContextFlags = CONTEXT_FULL;
        GetThreadContext(pi.hThread, &ctx);
        ctx.Eip += 2;
        SetThreadContext(pi.hThread, &ctx);
    }

The logic is straightforward: read the child process memory, decode it, write it back, and finally adjust EIP to skip the illegal instruction. After handling the exception, the parent calls ContinueDebugEvent to let the child process resume:

    ContinueDebugEvent(de.dwProcessId, de.dwThreadId, DBG_CONTINUE);   // exception handled; resume the child process
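The decode-and-resume step can be modelled offline on a byte buffer. This is a simulation rather than the sample's code: the key 0x7F, the region size 8, and the buffer contents are placeholder values chosen for illustration, since the article does not give DECODING_KEY or DECODING_SIZE, and the two leading bytes 8D C0 are the assumed encoding of the illegal LEA.

```python
DECODING_KEY = 0x7F   # placeholder; the real value is not given in the article
DECODING_SIZE = 8     # placeholder


def handle_first_exception(memory, eip):
    """Mimic the parent: XOR-decode the bytes that follow the 2-byte illegal
    instruction, then return the EIP at which the child should resume."""
    start = eip + 2
    for i in range(start, start + DECODING_SIZE):
        memory[i] ^= DECODING_KEY
    return eip + 2


plain = bytes(range(0x90, 0x90 + DECODING_SIZE))      # stand-in for the real code
encoded = bytes(b ^ DECODING_KEY for b in plain)
memory = bytearray(b"\x8d\xc0" + encoded)              # assumed illegal LEA, then encoded body

new_eip = handle_first_exception(memory, eip=0)
print(new_eip, bytes(memory[2:]) == plain)
```

Because XOR is its own inverse, the same routine applied to plain bytes is also how the encoded region would be produced in the first place.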

After decoding, the child process resumes and hits a second illegal instruction exception, located at 0x00401048. The handling this time is simpler: the parent just patches the bytes directly.

To sum up, the Debug Blocker technique can be seen as a dynamic patching process. If you view the child process function statically in IDA, the code is still encoded because the patch has not been applied. If you try to debug the child process, a debugger cannot attach to it because of the existing debugger-debuggee relationship.

There are still ways around it, however. In the next part, I use two methods to reverse a CTF challenge that relies on Debug Blocker.

3、 Example

This challenge is a reverse-engineering problem from CFF 2016: software password cracking-2. It can currently be attempted on Jarvis OJ. The challenge program and my patched version are available here: https://github.com/b1ngoo/crack. The patched program can be decompiled in IDA 7.0.

There are many write-ups on the Internet, but most of them work out the program flow and then apply the patch by hand in order to read the child process code. That approach is convenient when the key algorithm is simple. When the algorithm is harder, not being able to debug the child process code dynamically becomes a real obstacle. Here are two methods.

(1) Method 1

This method lets us debug the child process directly, in the following order:

First, load the challenge program in OllyDbg. It is best to use the original OD, because modified builds such as the 52pojie (吾爱破解) OD, or plug-ins such as StrongOD, may cause some exceptions to be ignored. Set a breakpoint at 0x004013BF, enter the input, and when the breakpoint is hit, read the child process's hProcess:

Note the value hProcess = 0x2C, continue execution, then set a breakpoint at 0x004013FC and record ThreadId = 0xF6C and ProcessId = 0x224 (these values differ between environments):

At this point the parent process has already repaired the child's code. If we let ContinueDebugEvent() run, the child process simply continues executing normally, so this is our chance to take over debugging of the child process.

Using OD, modify the parent process's memory so that it calls the WriteProcessMemory() API once more, this time writing an infinite loop into the child process, as follows:

Then call ContinueDebugEvent to let the child process run again; it will spin in the infinite loop.
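A common choice for a two-byte infinite loop is EB FE, a short jump whose target is its own address. The article does not state which bytes were written, so treat this as an assumption; the address below is also a placeholder. The sketch only checks the jump-target arithmetic.

```python
def short_jmp_target(addr, rel8):
    """Target of a 2-byte `JMP rel8` instruction located at addr."""
    if rel8 >= 0x80:              # interpret rel8 as a signed byte
        rel8 -= 0x100
    return (addr + 2 + rel8) & 0xFFFFFFFF


LOOP_BYTES = bytes([0xEB, 0xFE])  # assumed loop encoding: JMP short -2
addr = 0x00401000                 # placeholder address, not from the challenge
target = short_jmp_target(addr, LOOP_BYTES[1])
print(hex(target))
```

Whatever bytes are used, the two original bytes at that location need to be saved first, because they have to be restored once the debugger is attached.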

For OD to attach to the child process, the debugger-debuggee relationship between the parent and the child must first be released. We can use the DebugActiveProcessStop() API to detach:

Executing these instructions in turn releases the debugger-debuggee relationship between the parent and the child, after which OD can attach to the child process. When OD stops at the system breakpoint, press F9 to run and then F12 to pause; OD stops at the code that is currently executing:

Then, based on the static analysis results, restore the two original bytes that were overwritten by the infinite loop:

Now you can debug the child process freely. For a description of the algorithm itself, see the static analysis in Method 2 below.

(2) Method 2

This method combines static analysis in IDA with dynamic debugging in OD to understand the sub-functions. The program logic turns out to be as follows:

When the program is run directly with an input argument, execution takes the first branch and enters a sub-function. That function relaunches the program itself through CreateProcess and passes the input on the command line, so the new instance takes the second branch. The parent then patches the child dynamically through WriteProcessMemory, restoring its real code so that it runs normally. In the second branch, the input is XORed byte by byte with a string, and 1 is added to each byte. The result is passed back to the parent process through OutputDebugString. The parent reads it into a buffer with ReadProcessMemory and compares it with a value hard-coded in the program. If they match, 1 is returned; otherwise -1 is returned.

So the solution is to repair the child process branch by hand. After the repair, the program can be decompiled, and the algorithm is very simple:

Then write the inverse algorithm from the forward algorithm. The Python script below gives the reverse code, with my imitation of the forward process kept in a comment block:

    # -*- coding:utf-8 -*-
    # reverse
    a = [0x65, 0x6C, 0x63, 0x6F, 0x6D, 0x65, 0x20, 0x74, 0x6F, 0x20,
         0x43, 0x46, 0x46, 0x20, 0x74, 0x65, 0x73, 0x74, 0x21]
    b = [0x25, 0x5c, 0x5c, 0x2b, 0x2f, 0x5d, 0x19, 0x36,
         0x2c, 0x64, 0x72, 0x76, 0x80, 0x66, 0x4e, 0x52]   # values the result is compared against
    flag = ""
    for i in range(16):
        b[i] = b[i] - 1              # undo the +1
        flag += chr(b[i] ^ a[i])     # undo the XOR
    print(flag)

    '''
    encrypt
    a = "AAAAAAAAAAAAAAAA"
    a = [ord(i) for i in a]
    print(a)
    b = []
    c = [0x65, 0x6C, 0x63, 0x6F, 0x6D, 0x65, 0x20, 0x74, 0x6F, 0x20,
         0x43, 0x46, 0x46, 0x20, 0x74, 0x65, 0x73, 0x74, 0x21]
    for i in range(len(a)):
        t = a[i] ^ c[i]
        t = t + 1
        b.append(hex(t))
    print(b)
    '''

Run it, and the result is correct:
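As an extra sanity check, the forward and reverse transforms can be written as a pair of functions and round-tripped on arbitrary input. The key bytes are the ones from the script above; the function names are illustrative, and inputs longer than the 19-byte key are not handled.

```python
KEY = [0x65, 0x6C, 0x63, 0x6F, 0x6D, 0x65, 0x20, 0x74, 0x6F, 0x20,
       0x43, 0x46, 0x46, 0x20, 0x74, 0x65, 0x73, 0x74, 0x21]


def encrypt(text):
    """Forward transform: XOR each byte with the key, then add 1."""
    return [(ord(ch) ^ KEY[i]) + 1 for i, ch in enumerate(text)]


def decrypt(values):
    """Reverse transform: subtract 1, then XOR with the key."""
    return "".join(chr((v - 1) ^ KEY[i]) for i, v in enumerate(values))


sample = "AAAAAAAAAAAAAAAA"
round_trip = decrypt(encrypt(sample))
print(round_trip == sample)
```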