Mining industrial control vulnerabilities through firmware reverse analysis

Posted by deaguero at 2020-04-15

There are many methods for finding vulnerabilities in industrial control systems, including fuzz testing of industrial control protocols, firmware reverse analysis, analysis of the ActiveX controls used by industrial control software, and analysis of the VxWorks operating system.

This article focuses on the firmware reverse analysis method and uses a real case to explain the key problems encountered during reverse analysis and their solutions.

1. Firmware reverse analysis method

Firmware reverse analysis is a technique for finding possible vulnerabilities and backdoors in an embedded system without actually running it, by analyzing the firmware file together with the calling relationships and code of each module in the firmware.

Firmware reverse analysis involves techniques such as firmware identification and decompression and static analysis of the firmware.

1.1 Identification and decompression of firmware

To identify and decompress firmware, we can rely on mature tools such as binwalk and BAT (Binary Analysis Toolkit), both of which are popular firmware image extraction and analysis tools. Binwalk is published under the MIT license and BAT under the GPL. The firmware image formats that each tool can decompress are compared in Table 1.

Table 1. Comparison of firmware decompression formats supported by binwalk and bat

For common embedded device firmware, binwalk or BAT can be used to extract the firmware files. For firmware that cannot be decompressed automatically, you can try the following methods:

1) Use a file identification tool to determine the basic data type of the firmware image file.

2) Use a string extraction tool to pull out the plaintext strings contained in the file and check them for information about the bootloader and the operating system kernel.

3) Use a hex dump tool, such as hexdump, to look for runs of fill bytes that pad firmware sections to an alignment boundary; a file system identifier may follow such a run.

4) The file system may use a non-standard signature. If a suspicious signature field is found, replace it with the standard signature so that the firmware decompression tool can identify it.
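Steps 3) and 4) can be partly automated. The following is a minimal sketch, not a replacement for binwalk: it scans a firmware blob for a handful of well-known magic values and reports long runs of 0x00 or 0xFF fill bytes. The names `SIGNATURES`, `find_signatures` and `find_padding_runs` are illustrative, and the signature list is a small hand-picked subset.

```python
import re

# Small illustrative subset of well-known magic values (an assumption of this
# sketch; real tools ship a much larger signature database).
SIGNATURES = {
    b"hsqs": "SquashFS (little endian)",
    b"sqsh": "SquashFS (big endian)",
    b"\x45\x3d\xcd\x28": "CramFS (little endian)",
    b"UBI#": "UBI erase block header",
    b"\x27\x05\x19\x56": "U-Boot uImage header",
    b"\x1f\x8b\x08": "gzip stream",
}


def find_signatures(blob):
    """Return a sorted list of (offset, description) for every magic hit."""
    hits = []
    for magic, name in SIGNATURES.items():
        pos = blob.find(magic)
        while pos >= 0:
            hits.append((pos, name))
            pos = blob.find(magic, pos + 1)
    return sorted(hits)


def find_padding_runs(blob, min_len=64):
    """Return (offset, length, fill_byte) for long runs of 0x00 or 0xFF."""
    pattern = rb"\x00{%d,}|\xff{%d,}" % (min_len, min_len)
    return [(m.start(), m.end() - m.start(), blob[m.start()])
            for m in re.finditer(pattern, blob)]
```

A signature hit that sits immediately after a padding run is a good candidate for the start of a file system. If the bytes after a padding run look like a mangled variant of a known magic value, that is the place to try the signature replacement described in step 4).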

1.2 Static analysis of firmware

After decompression, static analysis concentrates on common vulnerability entry points, including passwords, services enabled by default, ports, and configuration files.

The analysis method is as follows:

1) Search the plaintext strings contained in the files for hard-coded passwords.

2) Explore information associated with the firmware, including the firmware author, library usage, directory structure, configuration file keywords, and custom default passwords.

3) To disassemble and analyze binary executables, we can rely on mature tools such as IDA Pro and Capstone. For example, disassembling the login module of a specific embedded system (such as VxWorks) reveals the hash algorithm used for login passwords.

IDA Pro is the most widely used static disassembly tool and supports reverse analysis of a large number of CPU architectures, including x86, MIPS, PowerPC and ARM.

Capstone is a multi-platform disassembly framework that runs on Windows, Mac OS X, Linux, FreeBSD, OpenBSD and Solaris. It can disassemble code for the ARM, ARM64 (ARMv8), MIPS, PPC and x86 architectures.

4) If you find a file that contains password hashes, consider using a tool such as John the Ripper or Hash Suite for brute-force cracking. The former supports GPU acceleration (CUDA and OpenCL). Feeding the cracking tool the keywords extracted in the steps above can significantly speed up the search.
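Steps 1) and 4) both start from the printable strings in the extracted files. As a rough sketch, the helper below mimics the behavior of a string extraction tool and keeps only strings that contain credential-related keywords. The keyword list and the function names are assumptions chosen for illustration and should be adapted to the target.

```python
import re

# Hypothetical keyword list; extend it with vendor names, service names, etc.
KEYWORDS = (b"passw", b"pwd", b"login", b"user", b"telnet", b"ftp")


def extract_strings(blob, min_len=4):
    """Yield (offset, text) for each run of printable ASCII characters."""
    for m in re.finditer(rb"[\x20-\x7e]{%d,}" % min_len, blob):
        yield m.start(), m.group().decode("ascii")


def credential_candidates(blob, keywords=KEYWORDS):
    """Return the extracted strings that mention any of the keywords."""
    hits = []
    for offset, text in extract_strings(blob):
        lowered = text.lower().encode("ascii")
        if any(keyword in lowered for keyword in keywords):
            hits.append((offset, text))
    return hits
```

The unique strings returned by `extract_strings` can also be written one per line to a text file and used as a wordlist for the cracking tool, which is one way to apply the keyword idea from step 4).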

2. Firmware reverse analysis case

This article reverse-analyzes the firmware of the Schneider NOE 771. The NOE 771 is the Ethernet module of the Schneider Quantum PLC. The Quantum is Schneider's high-end PLC and is used in China's core energy dispatching networks, for example the SCADA systems of regional sections of the West-East Gas Pipeline.

The analysis focuses on three key techniques: firmware identification and decompression, extraction of the firmware load address, and repair of function names in the disassembled firmware code.

2.1 Reverse analysis of Schneider NOE 771 firmware

2.1.1 Identification and decompression of firmware

1) Acquisition of firmware upgrade package

We can download the firmware upgrade package from Schneider's official website and extract the firmware file from it. The firmware file for the NOE 771 is named noe77101.bin.

2) Identification and decompression of firmware

First, use binwalk to identify the compression type of the file, which turns out to be zlib, as shown in Figure 1.

Figure 1. Firmware compression type analysis

Second, extract the zlib-compressed data using binwalk, as shown in Figure 2. The extracted file, 385, is stored in the directory "_noe77101.bin.extracted" and is named after the offset at which it starts in the firmware file.

Figure 2. Extract the noe77101.bin file
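When an extraction tool is not at hand, the same step can be approximated with Python's standard `zlib` module. The sketch below assumes, based on the extracted file name, that the compressed stream starts at offset 0x385; `carve_zlib` and `find_zlib_streams` are illustrative names.

```python
import zlib


def carve_zlib(blob, offset):
    """Decompress one zlib stream starting at `offset`.

    Returns (decompressed_bytes, compressed_length). Raises zlib.error for
    invalid data and ValueError if the stream is truncated.
    """
    decomp = zlib.decompressobj()
    data = decomp.decompress(blob[offset:]) + decomp.flush()
    if not decomp.eof:
        raise ValueError("zlib stream is truncated")
    consumed = len(blob) - offset - len(decomp.unused_data)
    return data, consumed


def find_zlib_streams(blob, min_out=16):
    """Try every offset that looks like a zlib header and keep valid streams."""
    found = []
    for offset in range(len(blob) - 1):
        # 0x78 followed by one of the common FLG bytes marks a zlib header.
        if blob[offset] == 0x78 and blob[offset + 1] in (0x01, 0x5E, 0x9C, 0xDA):
            try:
                data, consumed = carve_zlib(blob, offset)
            except (zlib.error, ValueError):
                continue
            if len(data) >= min_out:
                found.append((offset, consumed, len(data)))
    return found
```

For the firmware discussed here, one would read noe77101.bin into a bytes object and call `carve_zlib(blob, 0x385)`, then write the returned data to a file for further analysis.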

Then, run binwalk on the file 385. This reveals key information such as path names, the operating system version and the symbol table address. The operating system version of the firmware is VxWorks 2.5, so the reverse analysis can draw on VxWorks source code. The symbol table address is shown in Figure 4. The symbol table can be used to repair function names in the disassembly; see section 2.1.3 for details.

Figure 3. Decompression of file 385

Figure 4. The operating system version and symbol table address are found after decompression

2.1.2 Firmware load address extraction

Embedded firmware must be loaded at a specific location in memory in order to run. This location is called the firmware load address.

Function call addresses in embedded firmware are memory locations calculated relative to the firmware load address, not offsets within the firmware file.

Therefore, for a disassembler such as IDA Pro to resolve function call relationships correctly, we must first determine the firmware load address; otherwise every function call relationship will be wrong.

For firmware packaged as an ELF file, dedicated fields in the ELF headers record the load address, so it can be read directly from the file.
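As an illustration of the ELF case, the sketch below reads the entry point and the PT_LOAD program headers of a 32-bit ELF image using only the standard library. It handles 32-bit files only and performs minimal validation; `elf32_load_segments` is a name chosen for this example.

```python
import struct

PT_LOAD = 1  # program header type for loadable segments


def elf32_load_segments(blob):
    """Return (entry_point, [(file_offset, virtual_address, file_size), ...])."""
    if blob[:4] != b"\x7fELF":
        raise ValueError("not an ELF image")
    if blob[4] != 1:
        raise ValueError("only 32-bit ELF is handled in this sketch")
    endian = ">" if blob[5] == 2 else "<"  # EI_DATA: 2 means big endian

    # Fields that follow the 16-byte e_ident array, up to e_phnum.
    (_e_type, _e_machine, _e_version, e_entry, e_phoff, _e_shoff, _e_flags,
     _e_ehsize, e_phentsize, e_phnum) = struct.unpack_from(
        endian + "HHIIIIIHHH", blob, 16)

    segments = []
    for index in range(e_phnum):
        p_type, p_offset, p_vaddr, _p_paddr, p_filesz = struct.unpack_from(
            endian + "IIIII", blob, e_phoff + index * e_phentsize)
        if p_type == PT_LOAD:
            segments.append((p_offset, p_vaddr, p_filesz))
    return e_entry, segments
```

The virtual address of the first PT_LOAD segment, minus its file offset, gives the base at which the file contents are mapped.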

If the firmware is not wrapped in any container format, its code must be reverse engineered to determine the load address. This approach is relatively complex and differs between embedded systems and CPU architectures.

For the NOE 771 firmware, we analyze the calls made by the code at the start of the firmware to make an educated guess at the load address.

1) Get CPU architecture and choose the right disassembly engine

First, use the binwalk -A command to obtain the CPU architecture and related information for the target firmware, which helps in selecting the correct disassembly engine. As shown in Figure 5, the CPU architecture of the target firmware is big-endian PowerPC.

Figure 5. Obtaining the CPU architecture of the firmware

Second, load the firmware in IDA Pro using the disassembly engine for the big-endian PowerPC architecture.

Figure 6. Selecting the disassembly engine for IDA Pro

2) Analyze the loading address of firmware and disassemble it correctly

When the firmware load address is left unmodified, IDA Pro identifies only a few functions, as shown in Figure 7.

Figure 7. Functions identified by IDA Pro without modifying the load address

After analyzing the code at the start of the firmware (which is often time-consuming), we find a very suspicious function call at 0x09f8. The call target is formed from an offset of 0x339ab8 plus an absolute address of 0x10000, so 0x10000 is quite possibly the firmware load address we are looking for.

Figure 8. Firmware load address analysis and extraction
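The relationship between file offsets and memory addresses under a candidate load address is simple arithmetic, sketched below. `LOAD_BASE`, `to_memory` and `to_offset` are illustrative names, and the value 0x10000 is the candidate inferred above, not yet a confirmed fact at this point in the analysis.

```python
LOAD_BASE = 0x10000  # candidate load address inferred from the suspicious call


def to_memory(file_offset, base=LOAD_BASE):
    """Map an offset inside the firmware file to its address in memory."""
    return base + file_offset


def to_offset(address, base=LOAD_BASE):
    """Map a memory address back to an offset inside the firmware file."""
    if address < base:
        raise ValueError("address lies below the load address")
    return address - base


# The call target discussed above: offset 0x339ab8 rebased to 0x10000.
print(hex(to_memory(0x339ab8)))  # 0x349ab8
```

If the candidate is right, pointers found elsewhere in the image should map back to sensible file offsets, for example to the start of a function prologue or a printable string.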

Now we need to verify that 0x10000 really is the firmware load address. Load the firmware file in IDA Pro again and configure it as shown in Figure 9. With this configuration, IDA Pro resolves the function call relationships of the firmware correctly.

Figure 9. Reconfigure firmware load address

2.1.3 Function name repair in firmware disassembly code

In the previous section, IDA Pro successfully resolved the function call relationships, but it could not identify function names automatically, which hinders further analysis.

Therefore, we need to check whether the firmware contains a symbol table. If it does, the contents of the symbol table can be used to repair the function names displayed in IDA Pro.

1) Get the location of symbol table in firmware

The VxWorks symbol table records the mapping between function names and functions, so the first step is to find the location of the symbol table in the firmware. When binwalk was used to analyze the firmware, it reported the symbol table at 0x301e74.

Figure 10. Obtaining the location of symbol table in firmware

2) Determine the start and end address of the symbol table

After getting the location of the symbol table in the firmware, we can inspect the firmware in a hex editor to confirm that the address reported by binwalk is correct.

Entries in a VxWorks symbol table have a distinctive layout. Each entry is a 16-byte group: the first 4 bytes are the memory address of the function name, the next 4 bytes are the memory address of the function, and the group ends with 4 characteristic bytes followed by 4 bytes of 0x00.

Looking at the location reported by binwalk confirms that it is indeed part of a symbol table: 0x27655c is the memory address of a function name, and 0x1ff058 is the memory address of the function.

Because symbol table entries are so regular, the start and end addresses of the table can be located quickly by walking through the entries. The symbol table of the tested firmware starts at 0x301e64 + 0x10000 and ends at 0x3293a4 + 0x10000.

Figure 11. Determining the start and end addresses of the symbol table
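The traversal can be sketched in a few lines of Python. The code below assumes the 16-byte layout described above (name pointer, function pointer, 4 characteristic bytes, 4 zero bytes, all big endian) and uses a deliberately simple validity heuristic: the trailing word must be zero and the name pointer must fall inside the loaded image. The function names and the heuristic are assumptions of this sketch.

```python
import struct

ENTRY_SIZE = 16  # bytes per symbol table entry, per the layout described above


def parse_entry(blob, offset, base):
    """Return (name_ptr, func_ptr) if the bytes at `offset` look like an entry."""
    if offset < 0 or offset + ENTRY_SIZE > len(blob):
        return None
    name_ptr, func_ptr, _flags, tail = struct.unpack_from(">IIII", blob, offset)
    # Heuristic: trailing word is zero and the name lives inside the image.
    if tail != 0 or not (base <= name_ptr < base + len(blob)):
        return None
    return name_ptr, func_ptr


def table_bounds(blob, hint, base):
    """Expand from a known entry offset to the (start, end) file offsets."""
    start = hint
    while parse_entry(blob, start - ENTRY_SIZE, base):
        start -= ENTRY_SIZE
    end = hint
    while parse_entry(blob, end, base):
        end += ENTRY_SIZE
    return start, end  # `end` is exclusive


def read_symbols(blob, start, end, base):
    """Return [(name, func_ptr), ...] for the entries in [start, end)."""
    symbols = []
    for offset in range(start, end, ENTRY_SIZE):
        entry = parse_entry(blob, offset, base)
        if entry is None:
            continue
        name_ptr, func_ptr = entry
        name_off = name_ptr - base
        nul = blob.find(b"\x00", name_off)
        if nul < 0:
            continue
        symbols.append((blob[name_off:nul].decode("ascii", "replace"), func_ptr))
    return symbols
```

For the firmware discussed here, the hint would be the offset reported by binwalk (0x301e74) and the base would be 0x10000.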

3) Write a script to repair the function names

After locating the symbol table, we use the IDA Pro API to repair the function names. Here we use the Python script shown in Figure 12.

Figure 12. Script for repairing function names
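As a rough illustration of what such a script might look like (this is a sketch under stated assumptions, not the script from Figure 12), the loop below walks the table in 16-byte steps, reads the name and function pointers, and renames each target. It takes the IDAPython `idc` module as a parameter so that the logic can also be exercised outside IDA Pro with a stand-in object. Treating the end address as exclusive, and relying on `get_wide_dword` to honor the database's big-endian setting, are assumptions.

```python
ENTRY_SIZE = 16  # bytes per symbol table entry


def rename_from_table(api, start, end):
    """Name every address referenced by the symbol table in [start, end).

    `api` is expected to be the `idc` module when run inside IDA Pro. Any
    object that offers the same attribute names also works, which keeps the
    loop testable outside IDA.
    """
    renamed = 0
    for ea in range(start, end, ENTRY_SIZE):
        name_ptr = api.get_wide_dword(ea)       # address of the name string
        func_ptr = api.get_wide_dword(ea + 4)   # address of the function
        raw = api.get_strlit_contents(name_ptr, -1, api.STRTYPE_C)
        if not raw:
            continue
        name = raw.decode("ascii", "replace") if isinstance(raw, bytes) else raw
        api.add_func(func_ptr)                  # harmless if it already exists
        if api.set_name(func_ptr, name, api.SN_NOWARN):
            renamed += 1
    return renamed


# Inside IDA Pro, with the table bounds determined in the previous step:
#   import idc
#   rename_from_table(idc, 0x301e64 + 0x10000, 0x3293a4 + 0x10000)
```

Note that the table may also contain data symbols. A more careful script would inspect the characteristic bytes of each entry before calling `add_func`.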

4) Run the script to repair the function names in the disassembly

Run the Python script in IDA Pro, as shown in Figure 13.

Figure 13. Running the script to repair function names

After the script finishes, the function names appear in IDA Pro as shown in Figure 14.

Figure 14. IDA Pro disassembly interface after function name repair

2.2 Analysis of Schneider NOE 771 backdoor accounts

After reverse analysis of the firmware, the accounts added during initialization can be found by following the firmware's service loading process.

Looking at the usrAppInit function, you can see a large number of loginUserAdd calls, as shown in Figure 15. These calls reveal multiple backdoor accounts, as shown in Figure 16.

Figure 15. Multiple loginUserAdd calls found

Figure 16. Multiple backdoor accounts found

3. Summary

Vulnerability discovery based on firmware reverse analysis can uncover deeply buried software backdoors, which makes it a very practical method. Industrial control environments contain a large number of embedded systems. Most of them are updated through firmware upgrades, most firmware images are only compressed rather than encrypted, and most run VxWorks. The firmware reverse analysis method discussed in this article is therefore broadly applicable. Interested readers are encouraged to try it in order to discover more security vulnerabilities.

Note: the copyright of this article is owned by winu special agent control security. Please keep the source and original link for reprint.
