Tracing the source of an intrusion by analyzing simulated enterprise network traffic

Posted by lipsius at 2020-04-15

Author of this paper: 3S ﹣ nwgeek

Background: the company detected abnormal traffic, many employees received phishing emails, and a vulnerability was found in an email system through which the hacker obtained a large amount of information. The network traffic of that day was therefore extracted for analysis: about 5.3 GB, roughly 13 million packets.

Analysis objectives:

1. What forged email address did the attacker use?

2. From which IP address did the attacker send the phishing emails?

3. What is the malicious URL in the email?

4. What is the content of the phishing email the attacker sent successfully?

5. What vulnerability does the server have?

6. What attacks did the hacker carry out against the system?

Preliminary preparation:

Analysis process Outline:

1. Traffic cleaning

2. Traffic stratification

3. Intrusion traffic tracking

4. Vulnerability traceability analysis

0x01: traffic cleaning

The capture had been split into files of equal size; sampling shows that each file holds about 660,000 packets.

First, look at the protocol hierarchy of the packets.

The protocol hierarchy includes SMTP (the mail transfer protocol). See Appendix 4 for details of the SMTP protocol.

According to the background description, the company detected abnormal traffic and many employees received phishing emails. A preliminary guess is that the attacker's attack on the website caused the abnormal traffic, and that the phishing emails were sent over the mail protocol.

So let's first look at the SMTP and HTTP packets.

Cleaning the traffic

Tools that ship with Wireshark:

mergecap (merges capture files)

tshark (filters packets)

editcap (splits capture files)

(For usage, see Appendix 1: packet decomposition and merging (Wireshark's own).)

Cleaning approach: merge the capture files → filter for the SMTP and HTTP protocols → obtain the cleaned capture

The first step is to merge all the capture files:

mergecap -w total.pcap a.pcap b.pcap // (the file names are too long, so they are abbreviated to a and b)

This produces total.pcap, which is then cleaned. The goal is to keep only the SMTP and HTTP traffic (POP is kept as well, since it also carries mail).

Cleaning: tshark -r total.pcap -Y "pop || smtp || http" -w result.pcap (total.pcap is the capture merged in the previous step)

(For the filter parameters, see Appendix 2: details of the parameters used by tshark.)

Result: result.pcap contains about 20,000 packets, and all the following steps are based on this capture.
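The merge-then-filter cleaning step can also be driven from Python. The sketch below only builds the two command lines (using the -r / -Y form shown in Appendix 1) and runs them on request. The helper names build_merge_cmd, build_filter_cmd and clean are made up for illustration, the file names are the abbreviated ones used in this article, and it assumes mergecap and tshark are on PATH.

```python
import subprocess


def build_merge_cmd(output, sources):
    """Command line for: mergecap -w <output> <source 1> <source 2> ..."""
    return ["mergecap", "-w", output] + list(sources)


def build_filter_cmd(infile, outfile, protocols=("pop", "smtp", "http")):
    """Command line for: tshark -r <infile> -Y "<p1> || <p2> ..." -w <outfile>"""
    display_filter = " || ".join(protocols)
    return ["tshark", "-r", infile, "-Y", display_filter, "-w", outfile]


def clean(sources, merged="total.pcap", result="result.pcap"):
    """Run both steps; needs Wireshark's command-line tools installed."""
    subprocess.run(build_merge_cmd(merged, sources), check=True)
    subprocess.run(build_filter_cmd(merged, result), check=True)


if __name__ == "__main__":
    # Only print the commands, so the sketch is safe to try without any pcap.
    print(" ".join(build_merge_cmd("total.pcap", ["a.pcap", "b.pcap"])))
    print(" ".join(build_filter_cmd("total.pcap", "result.pcap")))
```

Passing the filter as a single list element avoids the shell interpreting the | characters as pipes, which is easy to get wrong when typing the command by hand.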

0x02: traffic stratification

Next, look at the SMTP packets. The Wireshark filter is: smtp

Select Statistics → Conversations to view the traffic flows.

Two IP addresses are found. Check the traffic of each one separately.

1) The filter is smtp and ip.addr ==

A single mailbox is found sending the same message many times.

Email content: A new system of IT is online, please click on the link below, to assist to test the running state of system,thank you! http://

The email tries to induce the reader to click a link and contains a URL, which is suspicious.

2) Next, filter by smtp and ip.addr ==

Nothing abnormal is found in the mail content for this IP.

By comparison, the first IP is suspicious and needs further confirmation.

Then go back to the packets filtered by smtp, traverse all of them, and count which mailbox appears most often (see the first figure under the appendix explanation).

Traversing all the SMTP emails shows that the it@ mailbox appears abnormally often, about 10 times more than any other mailbox. Combined with the above, the IP address of the suspicious emails is consistent with the first IP, so it is inferred that:

Attacker IP is

The forged email address used by the attacker is [email protected]

The suspicious phishing link is:

It was found that only was using email server

And the login email is
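The mailbox counting used in this section can be sketched without Wireshark once the mail text has been extracted. This is a minimal sketch, not the script used for the article: the addresses are hypothetical example.com ones, and the "10 times the median" threshold in outliers is an assumption chosen to mirror the observation above.

```python
import re
from collections import Counter

# Common email-matching pattern; good enough for counting, not for validation.
EMAIL_RE = re.compile(r"\b[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4}\b")


def count_mailboxes(texts):
    """Count how often each mailbox appears across the given mail texts."""
    counts = Counter()
    for text in texts:
        counts.update(addr.lower() for addr in EMAIL_RE.findall(text))
    return counts


def outliers(counts, factor=10):
    """Mailboxes seen at least `factor` times as often as the median mailbox."""
    if not counts:
        return []
    ordered = sorted(counts.values())
    median = ordered[len(ordered) // 2]
    return [addr for addr, n in counts.items() if n >= factor * median]


if __name__ == "__main__":
    # Hypothetical data: one sender mails ten different recipients.
    sample = ["From: it@example.com To: user%d@example.com" % i for i in range(10)]
    sample.append("From: bob@example.com To: carol@example.com")
    counts = count_mailboxes(sample)
    print(counts.most_common(3))
    print(outliers(counts))
```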

0x03: intrusion traffic tracking

Based on the attacker's IP address

filter the packets again, with the condition that the source IP is

Search for keywords associated with Trojans (webshells), injection, or XSS:

ip.addr == && http matches "upload|alert|script|eval|select"

Finally, restricting the filter to POST requests whose content contains eval isolates the Trojan traffic.
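The same keyword search can be applied to request text that has already been extracted from the capture. The sketch below reuses the keyword list from the filter above; the function names and the sample requests are hypothetical, and matching case-insensitively is a choice made here rather than something stated in the article.

```python
import re

# Same keyword list as in the display filter above.
SUSPICIOUS = re.compile(r"upload|alert|script|eval|select", re.IGNORECASE)


def suspicious_keywords(text):
    """Return the distinct suspicious keywords found in a request, sorted."""
    return sorted({m.group(0).lower() for m in SUSPICIOUS.finditer(text)})


def looks_like_webshell_call(method, body):
    """Final narrowing step: a POST request whose content contains eval."""
    return method.upper() == "POST" and re.search(r"eval", body, re.IGNORECASE) is not None


if __name__ == "__main__":
    # Hypothetical requests, for illustration only.
    print(suspicious_keywords("GET /index.php?id=1 UNION SELECT 1,2"))
    print(suspicious_keywords("<script>alert(1)</script>"))
    print(looks_like_webshell_call("POST", "z0=@eval(base64_decode($_POST[z1]));"))
```

Keyword matching like this produces false positives (ordinary pages also contain script or select), so it is a way to shortlist packets for manual review rather than a verdict.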

0x04: vulnerability traceability analysis

Based on the URL path and page name, the website's vulnerability is inferred to be an arbitrary file upload vulnerability.

(The POST packet contains the keyword upload → presumably an upload page.)

(The file accessed by the POST packet is hack.php → presumably a webshell.)

Overall inference: a webshell was uploaded through the upload page.

Looking at the source code of the upload page shows that it has no security protection at all, hence the arbitrary file upload vulnerability.

up.php does no filtering whatsoever, not even on the file suffix.

Verified: arbitrary file upload vulnerability.
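To make concrete what "filtering on the file suffix" means, here is a minimal sketch of the kind of check that was missing. It is written in Python for illustration only (up.php itself is PHP and its source is not reproduced here); the allowlist and the function name are assumptions.

```python
import os

# Hypothetical policy: only image uploads are accepted.
ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif"}


def is_allowed_upload(filename):
    """Accept a file only if its final extension is on the allowlist."""
    # Drop any directory part, whichever slash style the client sent.
    base = os.path.basename(filename.replace("\\", "/"))
    _, ext = os.path.splitext(base)
    return ext.lower() in ALLOWED_EXTENSIONS


if __name__ == "__main__":
    for name in ("photo.JPG", "hack.php", "../../hack.jpg.php"):
        print(name, is_allowed_upload(name))
```

A suffix allowlist is only one layer: a file named shell.php.jpg would pass this check, so real upload handlers usually also inspect the content and store files where they cannot be executed.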

After uploading the Trojan, the attacker executed commands.

The commands are Base64 encoded. The related commands are as follows:




After decoding, they are whoami, ifconfig, cat /etc/passwd, and so on.
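Base64 is an encoding rather than encryption, so the commands can be recovered with the standard library alone. In the sketch below the encoded strings are not the captured values (those are not reproduced in this article); they are generated by encoding the command names mentioned above.

```python
import base64


def decode_command(encoded):
    """Decode one Base64-encoded command string to text."""
    return base64.b64decode(encoded).decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Stand-ins for captured values: the commands named above, encoded here.
    commands = ("whoami", "ifconfig", "cat /etc/passwd")
    samples = [base64.b64encode(c.encode("utf-8")).decode("ascii") for c in commands]
    for encoded in samples:
        print(encoded, "->", decode_command(encoded))
```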

With that, the goals of this analysis have been achieved:

1. What forged email address did the attacker use?

[email protected]

2. From which IP address did the attacker send the phishing emails?

3. What is the malicious URL in the email?

4. What is the content of the phishing email the attacker sent successfully?

Hi all

A new system of IT is online, please click on the link below, to assist to test the running state of system,thank you!

5. What vulnerability does the server have?

Arbitrary file upload vulnerability

6. What attacks did the hacker carry out against the system?

Remote code execution on the server; APT attack (phishing) against the company

Readers who have taken part in the triathlon competition may find this familiar: the capture comes from that competition. I would like more people to get to know this kind of traffic analysis. There seem to be few articles online about traffic analysis, and even fewer that use Python to analyze captures, so here is a summary.

To make the principle of the analysis easy to follow, Wireshark was used to walk through the process above.

In practice, every step of the analysis can be done with scripts written in Python,

using Python's pyshark module.

If there are any mistakes in the analysis above, please point them out. Thank you.

Here is a simple introductory pyshark example:

(Python 3; the original was written for Python 2.7)

import pyshark  # third-party wrapper around tshark


def main():
    # Path of the pcap file to read (raw string so the backslashes are kept as-is)
    path = r'C:\Users\Desktop\result.pcap'
    # Open the capture and keep only HTTP packets
    cap = pyshark.FileCapture(path, display_filter='http')
    for p in cap:  # iterate over all packets
        try:
            print(p.http.file_data)  # HTTP content carried by the packet
        except Exception as e:  # show the error, e.g. when a packet has no content
            print(e)
    cap.close()


if __name__ == '__main__':
    main()

The intent is to show readers that, besides opening a pcap manually and reading packets one by one, the analysis can also be automated with Python.

The above only shows how to open a pcap file and read its packets. Simply posting code teaches a single point of knowledge, whereas showing how to explore lets you learn far more.

The key idea in this traffic analysis is simply to match mailboxes with a regular expression and then count them. It is fairly simple, and the key code is given below.

The key code is as follows:

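import re

import pyshark  # third-party wrapper around tshark

# Regular expression that matches email addresses
patten = re.compile(r'\b[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4}\b')
path = r'C:\Users\Desktop\result.pcap'  # path of the pcap file
# imf (Internet Message Format) selects the mail content carried over SMTP
cap = pyshark.FileCapture(path, display_filter='imf')
a = []  # every mailbox found, kept for counting
for p in cap:  # iterate over all packets
    p = str(p)  # convert the packet to a string
    try:
        a = a + patten.findall(p)  # add the mailboxes to the list for counting
    except Exception as e:
        print(e)
cap.close()
print(a)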

See Appendix 3 for the pyshark filter parameters.


Appendix 1: packet decomposition and merging (Wireshark's own)

Splitting: editcap.exe -c 100 D:\dump.pcap D:\test.pcap (splits dump.pcap into files of 100 packets each)

When the traffic of interest (SIP signaling, for example) has been filtered out in Wireshark but is spread over several files, mergecap can merge multiple pcap files into one.

Usage: mergecap -w <output file> <source file 1> <source file 2> ... Example: mergecap -w compare.pcap a.pcap b.pcap

Filtering: tshark -r target.pcap -Y pop -w pop.pcap

Appendix 2: details of the parameters used by tshark

Official document of tshark

Capture interface options:
-i: -i <interface> specifies the capture interface; the default is the first non-loopback interface;
-f: -f <capture filter> sets the capture filter expression, following libpcap filter syntax. This filter is applied while capturing; it is not used when analyzing local files;
-s: -s <snaplen> sets the snapshot length used to read complete packets. Because of the 65535 limit in network transmission, a value of 0 means a snapshot length of 65535, which is also the default;
-p: works in non-promiscuous mode, i.e. only traffic related to the local machine is captured;
-B: -B <buffer size> sets the buffer size, only valid on Windows; the default is 2 MB;
-y: -y <link type> sets the data link layer type for capturing. If not set, the default is the first type listed by -L; on a LAN this is generally EN10MB;
-D: prints the list of interfaces and exits;
-L: lists the data link layer types supported by the machine, for use with the -y parameter.

Capture stop options:
-c: -c <packet count> stops after capturing n packets; unlimited by default;
-a: -a <autostop cond.> ... duration:NUM stops capturing after NUM seconds; filesize:NUM stops capturing after NUM KB; files:NUM stops capturing after NUM files have been written;

Capture output options:
-b: -b <ring buffer opt.> ... The ring buffer file name is given by the -w parameter, and the -b parameter takes the form test:value. duration:NUM switches to the next file after NUM seconds; filesize:NUM switches to the next file after NUM KB; files:NUM forms a ring buffer that wraps around once NUM files have been written;

RPCAP options (remote packet capture protocol, used to capture packets remotely):
-A: -A <user>:<password> sets the user name and password for RPCAP authentication;

Input file: -r: -r <infile> sets the local file to read.

Processing options:
-2: performs a two-pass analysis;
-R: -R <read filter> sets the read filter, written in Wireshark's display filter syntax. In Wireshark's View → Filter view, click Expression in that bar to list the fields supported for all protocols;
-Y: -Y <display filter> uses the same syntax as the read filter and can replace the -R option in single-pass analysis;
-n: disables all address name resolution (all enabled by default);
-N: enables address name resolution for specific layers: "m" is the MAC layer, "n" the network layer, "t" the transport layer, and "C" concurrent (asynchronous) DNS lookups. If -n and -N are both given, -n is ignored. If neither is given, all address name resolution is enabled by default;
-d: decodes the specified traffic as the given protocol. To decode the traffic on TCP port 8888 as HTTP, write "-d tcp.port==8888,http". Running tshark -d . lists all the valid selectors.

Output options:
-w: -w <outfile|-> sets the output file for raw packet data. If this parameter is not set, tshark prints the decoded result to stdout; "-w -" writes the raw data to stdout. To save the decoded result to a file, use the shell redirect ">" instead of -w;
-F: -F <output file type> sets the output file format; the default is pcapng. Running tshark -F with no value lists all supported output file types;
-V: adds detailed output;
-O: -O <protocols> shows details only for the protocols specified;
-P: prints the packet summary even when the decoded result is written to a file;
-S: -S <separator> sets the line separator;
-x: adds a hex dump of the data after each packet in the decoded output;
-T: -T pdml|ps|psml|text|fields sets the format of the decoded output;
-E: -E <fieldsoption>=<value> sets attributes when -T fields is specified, such as header=y|n, separator=/t|/s|<char>, occurrence=f|l|a, aggregator=,|/s|<char>;
-t: -t a|ad|d|dd|e|r|u|ud sets the time format of the decoded output. "ad" is absolute time with date, "a" absolute time without date, "r" time relative to the first packet, and "d" the delta between two adjacent packets;
-u: -u s|hms sets the output format of seconds;
-l: flushes standard output after each packet;
-q: used together with the -z option for statistical analysis;
-X: -X <key>:<value> extension options such as lua_script and read_format; see the man page for details;
-z: statistics options, see the documentation for details; tshark -z help lists the statistics supported by -z.

Other options: -h: displays command line help; -v: displays tshark version information;
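As an illustration of how these options combine, tshark's field output (-T fields, with one -e <field> per column and -E separator=...) can be post-processed in Python to reproduce the conversation view used in section 0x02. This is a sketch under assumptions: the helper names are made up, and the sample output uses documentation-range addresses rather than the addresses from the capture.

```python
from collections import Counter


def build_fields_cmd(infile, display_filter, fields, separator=","):
    """tshark command that prints the chosen fields, one packet per line."""
    cmd = ["tshark", "-r", infile, "-Y", display_filter, "-T", "fields"]
    for field in fields:
        cmd += ["-e", field]
    cmd += ["-E", "separator=" + separator]
    return cmd


def count_conversations(output, separator=","):
    """Count packets per address pair, ignoring direction."""
    pairs = Counter()
    for line in output.splitlines():
        parts = line.strip().split(separator)
        if len(parts) == 2 and all(parts):
            pairs[tuple(sorted(parts))] += 1
    return pairs


if __name__ == "__main__":
    print(" ".join(build_fields_cmd("result.pcap", "smtp", ["ip.src", "ip.dst"])))
    # Hypothetical tshark output: source,destination per packet.
    sample = "192.0.2.10,192.0.2.1\n192.0.2.1,192.0.2.10\n192.0.2.20,192.0.2.1\n"
    for pair, n in count_conversations(sample).most_common():
        print(pair, n)
```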

Appendix 3: pyshark filter parameters

a= pyshark.FileCapture(path,display_filter='http')

a is the pcap capture object.

For a packet c read from it, the main attributes used this time are:

c.highest_layer: highest-layer protocol

c.http.request_method: access method

a. access method plus path

c.http.request_uri: access path

a. access host address

c.http.field_names: names of the HTTP packet's parameters

c.ip.src_host: source IP address

c.ip.dst_host: destination address

c.http.request_full_uri: URL address

str(c.http.file_data): returns the packet content
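A short sketch of how these packet attributes might be collected into one record per packet. summarize_packet and summarize_capture are hypothetical helper names; the attribute names (highest_layer, ip.src_host, ip.dst_host, http.request_method, http.request_full_uri) are the pyshark ones this appendix refers to. The demo at the bottom uses a stand-in object instead of a real capture, so it runs without pyshark or tshark installed.

```python
from types import SimpleNamespace


def summarize_packet(pkt):
    """Collect the commonly used attributes; missing layers or fields give None."""
    ip = getattr(pkt, "ip", None)
    http = getattr(pkt, "http", None)
    return {
        "highest_layer": getattr(pkt, "highest_layer", None),
        "src": getattr(ip, "src_host", None),
        "dst": getattr(ip, "dst_host", None),
        "method": getattr(http, "request_method", None),
        "uri": getattr(http, "request_full_uri", None),
    }


def summarize_capture(path, display_filter="http"):
    """Summarize every packet of a pcap file (needs pyshark and tshark)."""
    import pyshark  # third-party; imported here so the demo below runs without it
    cap = pyshark.FileCapture(path, display_filter=display_filter)
    try:
        return [summarize_packet(p) for p in cap]
    finally:
        cap.close()


if __name__ == "__main__":
    # Stand-in for a pyshark packet, with made-up values.
    fake = SimpleNamespace(
        highest_layer="HTTP",
        ip=SimpleNamespace(src_host="192.0.2.10", dst_host="192.0.2.1"),
        http=SimpleNamespace(request_method="POST",
                             request_full_uri="http://example.com/up.php"),
    )
    print(summarize_packet(fake))
```

Using getattr with a default keeps the loop going when a packet lacks a layer (for example an SMTP packet has no http layer), which is the same situation the try/except in the entry code guards against.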

Appendix 4: SMTP protocol interpretation

The well-known port number of an SMTP server is 25. As with the Telnet and FTP protocols, the SMTP client and server interact through commands and responses: the SMTP client sends requests to the SMTP server as commands, and the server answers each request with a three-digit code. SMTP defines 14 commands and 21 response messages. Each command consists of 4 letters, and each response generally occupies a single line that starts with a 3-digit code followed by a short description.

Mail delivery mainly includes three stages: connection establishment, mail delivery and connection termination.

Connection establishment phase:

1. The SMTP client scans the mail cache at regular intervals; if it finds mail, it uses SMTP's well-known port 25 to establish a TCP connection with the SMTP server on the recipient's mail server.

2. The receiving SMTP server sends "220 service ready" to tell the client that it is ready to receive mail. If the server is not ready, it sends code 421 (service not available).

3. The client sends a HELO message and identifies itself by its domain name, so that the server knows the client's domain name. Note that during TCP connection establishment, the sender and the receiver already know each other through their IP addresses. (HELO is the original command, with which neither the user name nor the password is encrypted. It has since been replaced by EHLO, with which the user name and password are sent Base64 encoded.)

4. The server responds with code 250 (requested command completed) or another code as appropriate.

Mail delivery phase:

After the SMTP client and server have established a connection, the sender can deliver a single message to one or more recipients. If there is more than one recipient, steps 3 and 4 below are repeated.

1. The client sends a MAIL FROM message to introduce the sender of the message. It includes the sender's email address (mailbox name and domain name, such as House @ QQ). This step is necessary: it gives the server the return address to use when reporting an error or returning a message.

2. The server responds with code 250 (requested command completed) or another appropriate code.

3. The client sends the RCPT command, whose purpose is to find out whether the receiving system is ready to accept the mail before it is sent. This avoids wasting communication resources: the client does not have to send a long mail only to learn afterwards that the address was wrong.

4. The server responds with code 250 or another appropriate code.

5. The client sends a DATA message to start the message transfer. The DATA command indicates that the content of the mail is about to be transmitted.

6. The server responds with code "354 start mail input: end with <CRLF>.<CRLF>" or another appropriate message (such as 421 service not available or 500 command not recognized).

7. The client sends the content of the message as consecutive lines, each ending with <CRLF>. The sequence <CRLF>.<CRLF>, i.e. a line containing only a period, marks the end of the mail content.

8. The server responds with code 250 (requested command completed) or another appropriate code.

Note that although SMTP uses a TCP connection to make mail delivery reliable, it does not guarantee that mail will not be lost. In other words, SMTP can only be said to deliver mail reliably as far as the recipient's mail server; what happens afterwards is unknown. The recipient's mail server may fail and lose all of the mail it has received before the recipient reads it.

Connection termination phase:

After the message has been transmitted successfully, the client terminates the connection in the following steps:

1. The client sends the QUIT command.

2. The server responds with 221 (service closing) or another code.

After the connection termination phase, the TCP connection must be closed.
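The three phases can be summarized as a sequence of client commands and the reply codes expected on success. The sketch below checks a list of server replies against that sequence. It is a simplification for illustration: it assumes exactly one recipient and single-line replies, and the names parse_reply, EXPECTED and session_ok as well as the sample reply texts are made up.

```python
def parse_reply(line):
    """Split a single-line SMTP reply into (code, text)."""
    code, _, text = line.strip().partition(" ")
    if len(code) != 3 or not code.isdigit():
        raise ValueError("not a single-line SMTP reply: %r" % line)
    return int(code), text


# (client command, reply code expected on success), in protocol order.
EXPECTED = [
    (None, 220),         # server greeting after the TCP connection is made
    ("HELO", 250),       # connection establishment
    ("MAIL FROM", 250),  # mail delivery: sender
    ("RCPT TO", 250),    # mail delivery: recipient
    ("DATA", 354),       # server asks for the content
    (".", 250),          # <CRLF>.<CRLF> ends the content
    ("QUIT", 221),       # connection termination
]


def session_ok(replies):
    """True if the replies match the success code expected at every step."""
    if len(replies) != len(EXPECTED):
        return False
    return all(parse_reply(r)[0] == code for r, (_, code) in zip(replies, EXPECTED))


if __name__ == "__main__":
    replies = ["220 service ready", "250 OK", "250 OK", "250 OK",
               "354 start mail input", "250 OK", "221 closing"]
    print(session_ok(replies))
```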

Note: This article is an original reward article of "hetianzhihui". It is forbidden to reprint in any form without permission!