A Preliminary Look at the Fuzzing Tool WinAFL

Posted by punzalan at 2020-02-29

Author: xd0ol1 (Knownsec 404 Lab)

0 Introduction

In the first two sections of this article, we briefly discuss the basic concepts of fuzzing and DynamoRIO, the instrumentation framework used by WinAFL. We then examine this fuzzing tool for the Windows platform from two angles: its source code and its practical use.

1 Fuzzing 101

Fuzzing is an automatic or semi-automatic software testing technique that feeds invalid, unexpected, or random data to the target program as input. Today it is mostly used for vulnerability discovery. Its most basic implementation scheme is shown in the figure below. The scheme is not complicated, but applying it well in practice is not easy:

Depending on how input test cases are obtained, fuzzers can be broadly divided into three types: mutation-based dumb fuzzers, generation-based smart fuzzers, and fuzzers based on evolutionary algorithms. The first two types are relatively mature, while the third remains the main direction of future development. An evolutionary fuzzer continually improves its test cases using feedback from the target program, which requires an evaluation strategy to be defined at design time. The most common choice is to use code coverage at run time as the metric.
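
The feedback idea can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: `toy_target` is a hypothetical stand-in for an instrumented program that reports which branches an input reached, and `mutate` is a deliberately naive one-byte mutation. It is not how AFL or WinAFL are implemented, only the shape of a coverage-guided loop.

```python
import random

def toy_target(data: bytes) -> set:
    """Hypothetical target: returns the set of branch ids the input reached."""
    covered = {0}
    if len(data) > 0 and data[0] == ord("F"):
        covered.add(1)
        if len(data) > 1 and data[1] == ord("U"):
            covered.add(2)
    return covered

def mutate(data: bytes, rng: random.Random) -> bytes:
    """Replace one randomly chosen byte with a random value."""
    buf = bytearray(data)
    pos = rng.randrange(len(buf))
    buf[pos] = rng.randrange(256)
    return bytes(buf)

def evolve(seed: bytes, rounds: int, rng: random.Random):
    """Keep a mutated input only if it reaches coverage not seen before."""
    queue = [seed]
    seen = set(toy_target(seed))
    for _ in range(rounds):
        parent = rng.choice(queue)
        child = mutate(parent, rng)
        coverage = toy_target(child)
        if not coverage <= seen:      # feedback: something new was reached
            seen |= coverage
            queue.append(child)
    return queue, seen

queue, seen = evolve(b"AAAA", rounds=5000, rng=random.Random(0))
print(len(queue), sorted(seen))
```

Inputs that reach nothing new are discarded, so the queue grows only when the evaluation strategy (here, branch coverage) reports progress.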

Of course, a fuzzer's design should not stop at a proof of concept for the underlying theory; what matters is that it proves effective in practice.

2 DynamoRIO Dynamic Binary Instrumentation

Let's look at instrumentation first. DBI (dynamic binary instrumentation) is a technique for dynamically analyzing binary programs by injecting probe code, which executes as ordinary instructions. Common frameworks of this kind include Pin, Valgrind, and DynamoRIO. Here we focus on DynamoRIO.

With DynamoRIO we can monitor a program's running code, and we can also modify it. More precisely, DynamoRIO behaves like a process virtual machine: all code of the monitored program is copied into a code cache that DynamoRIO manages and is executed from there. The architecture is as follows:

The basic block is an important concept here. If all instructions of the monitored process are split at control transfer instructions, they fall into many blocks. Each block starts at some instruction and ends with a control transfer instruction, as shown in the following figure:

These instruction blocks are what DynamoRIO defines as basic blocks, its basic unit of operation. DynamoRIO runs the instructions of one basic block at a time. When they finish, a context switch transfers execution to the next basic block, and so on until the monitored process exits.
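
As a rough illustration of this definition, the Python sketch below splits a linear instruction listing at control transfer instructions. The mnemonic set and the sample listing are hypothetical, and the sketch works on text rather than on decoded machine code, so it only mirrors the idea, not DynamoRIO's actual block builder.

```python
# Hypothetical set of mnemonics treated as control transfer instructions.
CONTROL_TRANSFER = {"jmp", "je", "jne", "call", "ret"}

def split_basic_blocks(instructions):
    """Split a linear instruction list into blocks that each end with a
    control transfer instruction (a trailing block may lack one)."""
    blocks, current = [], []
    for ins in instructions:
        current.append(ins)
        mnemonic = ins.split()[0]
        if mnemonic in CONTROL_TRANSFER:
            blocks.append(current)   # the block is closed by the transfer
            current = []
    if current:
        blocks.append(current)
    return blocks

program = [
    "mov eax, 1",
    "cmp eax, ebx",
    "je label_a",
    "add eax, 2",
    "call helper",
    "ret",
]

for index, block in enumerate(split_basic_blocks(program)):
    print(index, block)
```

Note that this dynamic notion differs from the compiler-style definition, which also starts a new block at every branch target.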

In addition, the framework provides a rich programming interface that makes developing plug-ins (clients) convenient. Client development mainly relies on handling various event callbacks, and filtering instructions carefully also helps performance.

3 WinAFL Fuzzer

Next, we turn to the main subject of this article, the fuzzer WinAFL. This section has three parts: an overview, an analysis of the tool's key source code, and finally a real fuzzing run against a deliberately vulnerable program.

3.1 Overview

Anyone who works with fuzzing is likely familiar with AFL (American Fuzzy Lop), but because of how its code is designed it does not support the Windows platform. The WinAFL project is a port of this fuzzer to Windows. AFL implements its functionality through compile-time instrumentation and a genetic algorithm. Because of platform constraints, WinAFL replaces the compile-time instrumentation with DynamoRIO dynamic instrumentation, and the related functions are rewritten on top of the Windows API.

When fuzzing with WinAFL, you need to specify the target program and the corresponding input test case files, and there must be a target function suitable for instrumentation. During its execution, this function must open the input file, parse it, and close it. Once it is instrumented, the target program can fuzz input files in a loop, which avoids creating a new target process for every fuzzing iteration. Meanwhile, the input file is mutated according to the corresponding algorithms, and the coverage of the target module determines whether it is kept for subsequent fuzzing.

3.2 Key Source Code Analysis

The WinAFL version analyzed here is 1.08, which can be obtained from GitHub. The afl_docs directory contains documents on design principles, technical details, and so on; the bin directory contains the compiled programs; the testcases directory holds various test case files; and most of the rest are source files. Overall there are not many source files, and the code amounts to roughly 10K+ lines. The key files are afl-fuzz.c and winafl.c, which are the focus of our analysis. The source also includes some auxiliary tools, such as afl-showmap.c, which displays trace bitmap information, and a tool for minimizing the set of test case files. However, afl-tmin, which minimizes a test case file, has not been ported to this platform. For more design notes, refer to the technical_details.txt file.

3.2.1 The fuzzing module

Let's first look at afl-fuzz.c, which implements the fuzzing logic. The input test files used for fuzzing are maintained in a linked list of queue_entry structures, and the corresponding queue folder can be found in the output directory. The code fragment for adding a test case is shown below:
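
As a rough illustration of the data structure (not the afl-fuzz.c source itself), the Python sketch below models a singly linked queue of test cases. The class names, the reduced set of fields, and the file names are assumptions chosen for the example; the real queue_entry structure carries considerably more bookkeeping.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class QueueEntry:
    """Simplified stand-in for one test case in the queue."""
    fname: str                            # path of the test case file
    length: int                           # size of the input in bytes
    was_fuzzed: bool = False              # whether this entry was already processed
    next: Optional["QueueEntry"] = None   # link to the following entry

class TestCaseQueue:
    """Singly linked list with head and tail pointers."""
    def __init__(self):
        self.head = None
        self.tail = None
        self.count = 0

    def add(self, fname: str, length: int) -> QueueEntry:
        """Append a new test case at the end of the list."""
        entry = QueueEntry(fname, length)
        if self.tail is None:
            self.head = entry
        else:
            self.tail.next = entry
        self.tail = entry
        self.count += 1
        return entry

    def __iter__(self):
        node = self.head
        while node is not None:
            yield node
            node = node.next

queue = TestCaseQueue()
queue.add("out/queue/case_000", 1024)   # hypothetical file names
queue.add("out/queue/case_001", 980)
print([entry.fname for entry in queue])
```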


The input file is fuzzed by the fuzz_one function. This process goes through many stages, including bit flipping, arithmetic operations, and integer insertion, as well as non-deterministic strategies. Moreover, the mutations applied during fuzzing have no particular relationship to the program state. On the surface, this step is completely blind:
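
To make the bit flipping stage concrete, here is a minimal Python sketch of a walking single-bit flip. It assumes the same bit addressing that AFL's bit-flip helper uses (most significant bit of each byte first); the function names are hypothetical, and the real stage also runs the target after every flip.

```python
def flip_bit(buf: bytearray, bit: int) -> None:
    """Flip one bit in place; bit 0 is the most significant bit of byte 0."""
    buf[bit >> 3] ^= 128 >> (bit & 7)

def walking_bit_flips(data: bytes):
    """Yield every variant of data that differs from it in exactly one bit."""
    buf = bytearray(data)
    for bit in range(len(buf) * 8):
        flip_bit(buf, bit)
        yield bytes(buf)
        flip_bit(buf, bit)   # restore the original before the next flip

variants = list(walking_bit_flips(b"\x00\xff"))
print(len(variants), variants[0], variants[8])
```

Each variant is produced without looking at what the program did with the previous one, which is why the stage looks blind when viewed on its own.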

For each of the strategies above, the program first modifies the test case accordingly, then runs the target program and processes the result:

Because the program follows the idea of a genetic algorithm, it evaluates the result of each execution: based on the code coverage of the target program, it decides whether to add the current test case to the fuzzing queue:
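
The decision can be sketched in Python as a comparison between the bitmap of the current run and a "virgin" map of everything not yet observed. This is a simplified model under stated assumptions: the function name and return values follow AFL's convention as I understand it, while the byte-by-byte loop and the absence of hit-count bucketing are simplifications for readability.

```python
MAP_SIZE = 65536  # 64 KB bitmap

def has_new_bits(trace: bytes, virgin: bytearray) -> int:
    """Compare one run's trace with the virgin map and update the map.

    Returns 2 if a map index was hit for the first time, 1 if only a new
    hit-count bit appeared at a known index, and 0 if nothing is new.
    """
    result = 0
    for index, hits in enumerate(trace):
        if hits and (hits & virgin[index]):
            if virgin[index] == 0xFF:
                result = 2                    # never-seen index
            elif result < 2:
                result = 1                    # known index, new count bits
            virgin[index] &= ~hits & 0xFF     # mark these bits as seen
    return result

virgin = bytearray(b"\xff" * MAP_SIZE)
trace = bytearray(MAP_SIZE)
trace[10] = 1
print(has_new_bits(trace, virgin))  # first run with this coverage
print(has_new_bits(trace, virgin))  # identical run again
```

A test case would be added to the queue only when the result is non-zero.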

Of course, the test file may need some necessary adjustments before it is fuzzed:

In addition, status information is updated continuously during fuzzing. This interface is rendered by the show_stats function:

3.2.2 The instrumentation module

Next, let's look at winafl.c. This file contains the DynamoRIO client (plug-in) code, which serves two purposes:


First, the program initializes itself and registers various event callbacks, the most important of which are the basic block event and the module load event:

In the module load event callback, if the current module is the fuzzing target module, the target function is instrumented:

That is, before the target function executes, the pre_fuzz_handler call records the current register context, and after it executes, the post_fuzz_handler call restores that context, so that the target function can be fuzzed over and over in a loop:
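
The save-and-restore idea can be modelled in a few lines of Python. This is a conceptual sketch only: registers are represented as a dictionary, `target_function` and `max_iterations` are hypothetical, and the real handlers operate on the machine context through DynamoRIO rather than on Python objects.

```python
class PersistentLoop:
    """Toy model of the handler pair wrapped around the target function."""
    def __init__(self, max_iterations: int):
        self.max_iterations = max_iterations   # hypothetical iteration limit
        self.iterations = 0
        self.saved = None

    def pre_fuzz_handler(self, regs: dict) -> None:
        """Record the register context at function entry."""
        self.saved = dict(regs)

    def post_fuzz_handler(self, regs: dict) -> bool:
        """Restore the saved context; return True to run the target again."""
        regs.clear()
        regs.update(self.saved)
        self.iterations += 1
        return self.iterations < self.max_iterations

def target_function(regs: dict, data: bytes) -> None:
    """Hypothetical target: clobbers registers while handling the input."""
    regs["xax"] = len(data)
    regs["xsp"] -= 32

def fuzz_in_process(inputs, loop: PersistentLoop):
    """Run the target repeatedly inside one simulated process."""
    regs = {"xax": 0, "xsp": 0x1000, "pc": 0x1090}
    executed = []
    for data in inputs:
        loop.pre_fuzz_handler(regs)
        target_function(regs, data)
        executed.append(data)
        if not loop.post_fuzz_handler(regs):
            break
    return regs, executed

regs, executed = fuzz_in_process([b"a", b"bb", b"ccc", b"dddd"], PersistentLoop(3))
print(regs, executed)
```

Because the context is restored after every call, each iteration starts from the same state without a new process being created.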

The other key issue is how the bitmap is handled. Coverage can be computed in two modes: basic block coverage and edge coverage. During fuzzing, a 64 KB bitmap is maintained to record coverage and hit counts. In edge coverage mode, each byte represents a particular pairing of source address and target address. This mode captures the program's execution flow better, because vulnerabilities are more often caused by unknown or abnormal state transitions than revealed by simple basic block coverage. The corresponding event functions are instrument_bb_coverage and instrument_edge_coverage, which are the registered basic block callbacks. The bitmap is updated by inserting new instructions. For edge coverage the code is as follows; basic block coverage is similar:
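
The edge update can be sketched in Python using the scheme described in AFL's technical notes: the current block location is XORed with the shifted previous location to pick a bitmap byte. The class and method names are hypothetical, and the sketch assumes the block location is simply its offset masked to the map size; the actual client emits equivalent machine instructions into each basic block.

```python
MAP_SIZE = 65536  # 64 KB bitmap

class EdgeCoverage:
    """Toy model of an edge-coverage bitmap."""
    def __init__(self):
        self.bitmap = bytearray(MAP_SIZE)
        self.prev = 0

    def on_basic_block(self, offset: int) -> int:
        """Record the edge (previous block -> this block); return its index."""
        cur = offset & (MAP_SIZE - 1)
        index = cur ^ self.prev
        self.bitmap[index] = (self.bitmap[index] + 1) & 0xFF  # 8-bit hit counter
        self.prev = cur >> 1   # the shift keeps A->B distinct from B->A
        return index

coverage = EdgeCoverage()
coverage.on_basic_block(0x1090)
edge_ab = coverage.on_basic_block(0x10F0)

reverse = EdgeCoverage()
reverse.on_basic_block(0x10F0)
edge_ba = reverse.on_basic_block(0x1090)
print(hex(edge_ab), hex(edge_ba))
```

In basic block mode the index would depend on the current block alone, so the order in which two blocks execute would not be distinguishable.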

3.3 Using WinAFL

Finally, we carry out a real fuzzing run. The target program is based on a modified gdiplus.cpp source in which a crash has been introduced manually. The code is as follows:

First, we need to determine the target function for fuzzing, that is, set the -target_offset or -target_method parameter. In this example, the main function is a suitable target function. To use -target_offset, you can simply look up the function's offset in IDA; in this example it is 0x1090:

If a symbol file is available, you can instead set -target_method directly to main. For the -coverage_module parameter, we can run the following command to obtain its value. Note that the DynamoRIO directory must be set according to your environment. The resulting log file lists the modules loaded while the target program runs. The run must also report "Everything appears to be running normally.":

Then, we can enter the following command to start fuzzing, where "@@" is a placeholder for the test case file being fuzzed, and the initial test cases are taken from the in directory:

However, the DynamoRIO client winafl.dll does not appear in the command parameters above. In fact, a new child process is created after the command is executed, as shown in the following figure:

We can get the command parameters of drrun.exe as follows:

If all goes well, we will see the following fuzzing interface. For compiling WinAFL and for other parameter settings, refer to the README file:


The results of each stage of fuzzing are saved in the out directory set by the -o option. The crashes and hangs directories hold the test case files that trigger the bugs. Whether the target program has an exploitable vulnerability needs further confirmation:


4 Conclusion

This article gives a general introduction to the fuzzing tool WinAFL, but many more aspects must be considered in practical use. The author is still a beginner, so corrections are welcome, as is further discussion :P


This article was published by Seebug Paper. If you reprint it, please credit the source. Address: