Tencent Security Zero Distance: Big Eye, a large-scale network traffic analysis system

Posted by millikan at 2020-03-06

Introduction

The Big Eye system is a DDoS attack detection system that the Tencent Security Platform Department has developed in-house since 2008. After more than five years of development, it has grown into a network traffic analysis system that integrates DDoS attack detection, intrusion detection, and basic data analysis. Efficient multi-layer load sharing lets it analyze large traffic volumes as a cluster.

This article was written by the Tencent Aegis team to share some experience from developing and building a traffic analysis system. Given the limits of the authors' perspective and expertise, it may contain omissions or even mistakes, and corrections and suggestions from experts are welcome.

Overview

The traffic analysis system uses an optical splitter to mirror a copy of the traffic in front of the IDC core routers for analysis. The exact deployment location was explained in detail in a previous blog post and is not repeated here. Logically, the system can be divided into several modules: packet receiving, packet decoding, DDoS detection, application-layer preprocessing, and the application-layer detection libraries, as shown in the following figure:

High-speed packet capture and transmission

At the software level, the first problem to solve is capturing all of the traffic. Relying on the operating system's standard protocol stack is clearly unrealistic. Over time, the Aegis team successively adopted kernel hooks, libpcap, PF_RING (libpfring), dedicated hardware, and DPDK.

The first three approaches were all based on 1 Gbps network cards. To guarantee no packet loss, each server could handle only about 500 Mbps of traffic. As overall traffic kept growing, the analysis cluster would have had to expand rapidly, bringing cost and management problems. The team therefore began looking for solutions based on 10 Gbps network cards, and the Tilera and Cavium hardware platforms came into view.

After evaluating the hardware and the development difficulty, the team selected Tilera as the hardware platform for the new analysis system. On this platform, a single machine could easily perform traffic analysis and DDoS detection at 10 Gbps. With deeper use of the Tilera platform, however, its limited CPU computing performance gradually became apparent. It could not keep up with the growing demand for application-layer analysis, and the project team had to look for a new solution.

At around the same time, Intel introduced DPDK, which greatly improved the network I/O processing capacity of general-purpose Intel x86 CPUs. Because x86 CPUs also have significant advantages in stability and general-purpose computing power, and developers do not need to learn a new hardware platform, the entire analysis system was migrated back to the x86 platform. This is the platform on which the current version of the traffic analysis system mainly runs.

To borrow a diagram from the DPDK documentation: as the figure shows, DPDK acts as a middle layer between the user application and the Linux kernel, and wraps hardware initialization, configuration, memory allocation, packet sending and receiving, and other operations in interfaces for the user. With Intel CPUs and network cards, it is easy to build applications with high packet-receiving performance.

Note: this figure is from the official DPDK documentation.

The comparison of several methods is shown in the table below:

DDoS detection

The system keeps real-time statistics on the traffic sent to each destination address. At the end of every statistical cycle, it aggregates and evaluates the collected statistics and reports destinations with large traffic volumes as suspected attacks. It then asynchronously decides whether each one is a real attack, based on information about the destination address such as its business type and its normal traffic level. If the traffic is still judged to be an attack, the system further determines, from the monitoring data of the destination server, whether the attack has had an impact. If it has, the linked defense devices automatically scrub the traffic, and the defense effect then determines whether manual handling is needed. The specific process is as follows:
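The first two stages described above (per-destination counting within a cycle, then a comparison against normal traffic) can be sketched roughly as follows. This is only an illustration, not the Big Eye implementation: the class and function names, the cycle length, the threshold, and the multiplier are all hypothetical.

```python
from collections import defaultdict


class DestinationTrafficStats:
    """Accumulates bytes per destination IP within one statistical cycle."""

    def __init__(self, cycle_seconds, threshold_bps):
        self.cycle_seconds = cycle_seconds  # hypothetical cycle length
        self.threshold_bps = threshold_bps  # hypothetical reporting threshold
        self.bytes_by_dst = defaultdict(int)

    def add_packet(self, dst_ip, length_bytes):
        """Count one packet toward its destination address."""
        self.bytes_by_dst[dst_ip] += length_bytes

    def end_cycle(self):
        """Return suspected targets as {dst_ip: bits per second}, then reset."""
        suspects = {}
        for dst_ip, total_bytes in self.bytes_by_dst.items():
            bps = total_bytes * 8 / self.cycle_seconds
            if bps > self.threshold_bps:
                suspects[dst_ip] = bps
        self.bytes_by_dst.clear()
        return suspects


def confirm_attack(bps, normal_bps, multiplier=5.0):
    """Second-stage check against the destination's usual traffic level.

    The multiplier is an assumed policy knob, not a value from the article.
    """
    return bps > normal_bps * multiplier
```

In a real deployment the second stage would run asynchronously and would also consult the business type and server monitoring data, which this sketch leaves out.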

Application layer analysis

Each application-layer processing function is loaded as a shared object (.so) library. If every .so library implemented all of the functionality it needs by itself, the same work would be repeated at run time and system resources would be wasted. The system is therefore designed so that functions needed by two or more modules are handled by the main program, in the application-layer preprocessing module. At present this module mainly covers TCP stream reassembly, application-layer character transcoding, HTTP protocol parsing, and URL access statistics. The preprocessed data is then made available to each module.
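As a rough illustration of this design, the sketch below parses an HTTP request once in the main program and hands the same parsed result to every detection module. Plain Python callables stand in for the .so libraries, and all names (`preprocess_http_request`, `run_modules`, the module names) are hypothetical.

```python
from urllib.parse import urlsplit, parse_qs


def preprocess_http_request(raw: bytes) -> dict:
    """Parse the request line and headers once, for all detection modules."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("latin-1").split("\r\n")
    method, target, _version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    url = urlsplit(target)
    return {
        "method": method,
        "path": url.path,
        "params": parse_qs(url.query),
        "headers": headers,
        "body": body,
    }


def run_modules(raw: bytes, modules: dict) -> dict:
    """Preprocess once, then pass the shared result to each module."""
    parsed = preprocess_http_request(raw)
    return {name: check(parsed) for name, check in modules.items()}
```

Each module receives the already parsed request, so adding another module does not add another parsing pass.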

The specific application-layer detection functions cannot be described in detail here, so this article only introduces the implementation of the rule matching engine.

The rules in common use fall into two types: strings and regular expressions. Matching every rule one by one would consume a great deal of CPU. The optimization therefore focuses on improving the efficiency of each single rule match, reducing the number of rule matches, and narrowing the scope of each match.

The first consideration is the choice of regular expression engine. Popular regular expression engines fall broadly into two types: NFA (nondeterministic finite automaton) and DFA (deterministic finite automaton). At the cost of some features, a DFA engine offers roughly twice the performance of an NFA engine. Our regular expression engine is therefore DFA-based. Most regular expressions are supported directly, and a few need to be rewritten, which fully meets our functional needs in practice.

Secondly, each rule must specify the request type it applies to, such as GET, POST, or all types, and its scope, such as matching only the parameter fields or matching the full packet. Rules with the same request type and scope are then combined. String rules are matched in a single pass with the Aho-Corasick (AC) multi-pattern matching algorithm. For regular expression rules, the fixed string part is extracted from each rule and merged with the string rules for that single pass. Only data that hits the fixed part goes on to full regular expression matching. This reduces the number of regular expression matches and significantly improves rule matching efficiency. The matching process is as follows:
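A minimal sketch of this two-step matching, under several assumptions: the fixed fragment of each regular expression rule is supplied by hand rather than extracted automatically, Python's backtracking `re` module stands in for the DFA engine, and the class names and rule IDs are hypothetical.

```python
import re
from collections import deque


class AhoCorasick:
    """Multi-pattern literal matcher: one pass over the text for all patterns."""

    def __init__(self, patterns):
        self.goto = [{}]    # per-state transitions
        self.fail = [0]     # failure links
        self.out = [set()]  # patterns recognized at each state
        for pattern in patterns:
            self._add(pattern)
        self._build_failure_links()

    def _add(self, pattern):
        state = 0
        for ch in pattern:
            nxt = self.goto[state].get(ch)
            if nxt is None:
                nxt = len(self.goto)
                self.goto[state][ch] = nxt
                self.goto.append({})
                self.fail.append(0)
                self.out.append(set())
            state = nxt
        self.out[state].add(pattern)

    def _build_failure_links(self):
        queue = deque(self.goto[0].values())  # depth-1 states fail to the root
        while queue:
            state = queue.popleft()
            for ch, nxt in self.goto[state].items():
                queue.append(nxt)
                f = self.fail[state]
                while f and ch not in self.goto[f]:
                    f = self.fail[f]
                self.fail[nxt] = self.goto[f].get(ch, 0)
                self.out[nxt] |= self.out[self.fail[nxt]]

    def search(self, text):
        found, state = set(), 0
        for ch in text:
            while state and ch not in self.goto[state]:
                state = self.fail[state]
            state = self.goto[state].get(ch, 0)
            found |= self.out[state]
        return found


class RuleSet:
    """string_rules: {rule_id: literal}; regex_rules: {rule_id: (fragment, regex)}."""

    def __init__(self, string_rules, regex_rules):
        self.literal_to_rules = {}
        for rule_id, literal in string_rules.items():
            self.literal_to_rules.setdefault(literal, []).append((rule_id, None))
        for rule_id, (fragment, source) in regex_rules.items():
            compiled = re.compile(source)
            self.literal_to_rules.setdefault(fragment, []).append((rule_id, compiled))
        self.ac = AhoCorasick(self.literal_to_rules)

    def match(self, data):
        hits = []
        for literal in self.ac.search(data):  # single AC pass over the data
            for rule_id, regex in self.literal_to_rules[literal]:
                # Regex rules run only after their fixed fragment was seen.
                if regex is None or regex.search(data):
                    hits.append(rule_id)
        return sorted(hits)
```

With this structure, data that contains none of the fixed fragments never reaches the regular expression engine at all.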

Some lessons from building a high-performance network traffic analysis system

1. Split the logic reasonably and allocate resources as needed

Worker threads are separated by function. When allocating resources, priority goes to keeping the core functions running normally, and the allocation should reflect the differing costs of each piece of logic. Even when the system is fully loaded, the core functions should not be affected.

2. Filter layer by layer, light logic first and heavy logic later, to reduce the data volume reaching the heavy logic

Without affecting the overall analysis results, order the processing logic by importance and move shared, low-cost logic to the front. As data flows toward later stages, each layer discards data that needs no further processing. This reduces the amount of data the heavy logic must handle and lowers the overall resource cost.
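The idea can be illustrated with a small filter chain in which cheap checks run first and each stage drops what later stages do not need. The function name, the stage names, and the packet fields are hypothetical, chosen only for this sketch.

```python
def run_filter_chain(packets, stages):
    """Apply stages in order (cheapest first); a packet stops at the first
    stage that rejects it.

    Returns the surviving packets and, per stage, how many packets it saw.
    """
    processed = [0] * len(stages)
    survivors = []
    for pkt in packets:
        for i, (_name, keep) in enumerate(stages):
            processed[i] += 1
            if not keep(pkt):
                break  # dropped here, later (heavier) stages never see it
        else:
            survivors.append(pkt)
    return survivors, processed


# Hypothetical stages, ordered from light to heavy.
EXAMPLE_STAGES = [
    ("is_tcp", lambda p: p["proto"] == "tcp"),
    ("is_http_port", lambda p: p["dport"] == 80),
    ("has_payload", lambda p: len(p["payload"]) > 0),
]
```

The per-stage counters show how much work each layer removes before the expensive logic runs.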

3. Dynamic load balancing to make full use of resources

To avoid the barrel effect, in which the most heavily loaded thread limits overall throughput, load is shared dynamically according to the load on each thread, making the fullest use of system resources.
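One simple way to do this, shown here only as an assumed sketch and not as the scheme Big Eye uses, is to send each new flow to the currently least-loaded worker while keeping packets of an existing flow on the same worker. The class name and the notion of a per-packet cost are hypothetical.

```python
class DynamicBalancer:
    """Assigns new flows to the least-loaded worker; existing flows stay put."""

    def __init__(self, worker_count):
        self.load = [0] * worker_count  # outstanding work per worker
        self.flow_to_worker = {}        # keeps one flow on one worker

    def assign(self, flow_id, cost=1):
        """Return the worker index for this unit of work and record its cost."""
        worker = self.flow_to_worker.get(flow_id)
        if worker is None:
            # New flow: pick the worker with the smallest current load.
            worker = min(range(len(self.load)), key=self.load.__getitem__)
            self.flow_to_worker[flow_id] = worker
        self.load[worker] += cost
        return worker

    def complete(self, flow_id, cost=1):
        """Report that a unit of work for this flow has finished."""
        self.load[self.flow_to_worker[flow_id]] -= cost
```

Keeping a flow on one worker matters for stateful steps such as TCP stream reassembly, at the price of not rebalancing flows that are already assigned.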

4. Distributed analysis, centralized management

Distributed analysis and processing raise the overall processing capacity, while centralizing the decision logic and management keeps policies flexible without sacrificing processing performance.