Log Analysis Series (1): Introduction

Posted by tzul at 2020-03-17

This series of stories is pure fiction; any resemblance to reality is coincidental.

Little B is a security engineer at Q company. Q company has not been at peace lately: it has suffered several security attacks. After each incident, Little B could not find the real cause of the attack, so the leader ordered him to detect and trace the attacks, or else pack up and go home.

Hearing this, Little B broke into a sweat. He dug out the records of the recent attacks, carefully analyzed why the truth had not been found, and summarized the reasons as follows:

Having found the reasons, Little B knew that solving these problems requires a unified log analysis platform. With such a platform, he would not be left in the dark when the next attack comes.

Little B explains the purpose

To build a unified log analysis platform, Little B first has to report to the leader and win the leader's support. The first step is to explain the purpose of building the platform.

Log analysis is a basic, core technology in the enterprise. It is used not only by the security team but also by the IT R&D team and the business team, each with a different purpose:

Logs lying on a hard disk have no value. Log analysis is what realizes the value of log information. The more value a company extracts from its logs, the more it reflects the company's technical strength.

The evolution of log analysis

Log analysis is not a new technology, although it has been dressed up in many new clothes. A company with only a small security team should not chase large-scale, heavyweight solutions. What you need is an idea or a tool that solves the problem, even a command script of only 20 lines. From the earliest days to now, Little B divides log analysis experience into four eras:

Stone Age: in this era, log analysis relies mostly on Excel and terminal commands (awk, grep, sort, uniq, wc, etc.). Recommended reading: "Web log security analysis skills" and "Web log security analysis", which cover cases of using commands for security analysis.

Iron Age: in this era, log analysis relies more on scripts (self-written tools) and simple interactive tools (logwatch, logparser), etc.

Industrial era: log analysis in this era has made great progress compared with the previous two. All kinds of open-source, free, and paid software are available, such as the Elastic series, Splunk, ArcSight, and so on.

Future era: I don't know what we will use for log analysis in this era, but as things stand, machine learning and artificial intelligence should be among its core technologies.


Little B thinks that no matter what era we are in, the ideas and tools of the past remain irreplaceable, because they are the excellent products an era leaves behind after survival of the fittest. In essence, the products of a new era have also evolved over a long history. (Inner monologue: a small number of tools or systems are garbage, so please use them with care!)

How to build the unified log analysis platform

>>>>Unified log analysis architecture


The implementation of a unified log analysis platform differs from enterprise to enterprise, mainly in how well the platform fits the enterprise, the enterprise's own technical capability, and the technical team's choice of products. But the core architecture is basically as shown in the following figure:

>>>>Difficulties in unified log analysis


As a security engineer, Little B naturally thinks that security matters most (in fact, this is wrong), but he will also team up with the IT R&D team and the business team to push the project forward together: more people, more strength. Although everyone's analysis goals differ, the unified log analysis platform architecture is something everyone needs.

>>>>Analysis approaches in security scenarios

If security scenarios must be classified, Little B divides them into known scenarios and unknown scenarios. In known scenarios, the common analysis methods are regular-expression-based analysis, analysis based on statistics and aggregation, and association-based analysis. In unknown scenarios, we mainly use data mining to dig unknown things out of the data.

Regular expression analysis

This method mainly suits common attack scenarios that have recognizable signatures, such as:
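As a minimal sketch of signature matching with regular expressions, the snippet below checks a request URI against a few patterns. The signature names, the patterns, and the `match_signatures` helper are illustrative assumptions, not rules from the article; a real rule set would be much larger and more carefully tuned.

```python
import re
from urllib.parse import unquote_plus

# Hypothetical, deliberately small signature set (assumption for illustration).
SIGNATURES = {
    "sql_injection": re.compile(r"union\s+(all\s+)?select|\bor\s+1\s*=\s*1\b", re.IGNORECASE),
    "xss": re.compile(r"<script\b|javascript:", re.IGNORECASE),
    "path_traversal": re.compile(r"\.\./|\.\.\\"),
}


def match_signatures(request_uri):
    """Return the sorted names of all signatures that match the URL-decoded URI."""
    decoded = unquote_plus(request_uri)  # attackers often URL-encode payloads
    return sorted(name for name, pattern in SIGNATURES.items() if pattern.search(decoded))


if __name__ == "__main__":
    for uri in ("/item?id=1+UNION+SELECT+password+FROM+users", "/index.html"):
        print(uri, match_signatures(uri))
```

Decoding before matching is a design choice in this sketch: without it, a payload such as `%3Cscript%3E` would slip past the `xss` pattern.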

Analysis based on statistics and aggregation

This method computes statistics and aggregations across as many dimensions as possible and mines valuable information from the results. The most common scenario is analyzing what a client did to a server within a unit of time.
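The "client to server per unit time" scenario can be sketched as a simple bucketed count. The event tuple layout, the 60-second window, and the helper names `count_per_window` and `noisy_clients` are assumptions made for illustration; timestamps are taken as epoch seconds to keep the example independent of time zones.

```python
from collections import Counter


def count_per_window(events, window_seconds=60):
    """Count events per (client, server, window start) bucket.

    `events` is an iterable of (epoch_seconds, client_ip, server) tuples.
    """
    counts = Counter()
    for ts, client, server in events:
        epoch = int(ts)
        bucket = epoch - epoch % window_seconds  # start of the window this event falls in
        counts[(client, server, bucket)] += 1
    return counts


def noisy_clients(counts, threshold):
    """Return the sorted clients that exceed `threshold` events in any single window."""
    return sorted({client for (client, _server, _bucket), n in counts.items() if n > threshold})


if __name__ == "__main__":
    sample = [(1000, "10.0.0.1", "web1"), (1010, "10.0.0.1", "web1"), (1030, "10.0.0.2", "web1")]
    print(count_per_window(sample))
```

The same grouping key can be extended with further dimensions (URI, status code, user agent) when one dimension alone is not informative enough.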

Analysis based on association

This method needs a certain amount of baseline data, from which further findings can be inferred, such as
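One hypothetical illustration of the idea: take the source IPs already flagged by attack alerts as the baseline data, then join them against login records to see which accounts those IPs managed to log in to. The scenario, the record fields (`ip`, `user`, `success`), and the function name are assumptions for this sketch, not details from the article.

```python
from collections import defaultdict


def accounts_touched_by_attackers(alert_ips, login_events):
    """Map each alerting IP to the sorted accounts it logged in to successfully."""
    suspicious = set(alert_ips)
    hits = defaultdict(set)
    for event in login_events:
        if event["success"] and event["ip"] in suspicious:
            hits[event["ip"]].add(event["user"])
    return {ip: sorted(users) for ip, users in hits.items()}


if __name__ == "__main__":
    alerts = ["203.0.113.9"]
    logins = [{"ip": "203.0.113.9", "user": "alice", "success": True}]
    print(accounts_touched_by_attackers(alerts, logins))
```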

Data mining analysis

The mindset of log analysis: think through the various scenarios and make sensible use of the information you have and the information you can obtain, so that the information produces value.

>>>>Optimizing the log platform

The unified log analysis platform is not a simple system that is finished once a flashy big-screen dashboard is up. Building a log analysis platform is a long-term project that combines technology, communication, and operations. 50% of attempts die at the beginning, 30% die in the middle, 15% die one step short of success, and only 5% complete the platform.

Log analysis is full of pitfalls. If the leaders provide staff and budget, the ideal is to buy a product and assign someone to maintain it. If the leaders do not provide staff and budget, forget it; making use of what the operations team already has is also fine.

According to Little B, the work on the unified log analysis platform can be simply divided into two stages:

Log normalization --> log collection --> log storage --> log analysis --> log display --> alerting
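The first step of the chain above, log normalization, can be sketched as turning a raw access-log line into a structured record. This example assumes the widely used "combined" access log format (as written by nginx and Apache); the output field names and the `normalize` helper are assumptions chosen for illustration.

```python
import re
from datetime import datetime

# Combined format: client - user [time] "request" status size "referer" "agent"
COMBINED = re.compile(
    r'(?P<client>\S+) \S+ (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def normalize(line):
    """Parse one combined-format line into a dict, or return None if it does not match."""
    m = COMBINED.match(line)
    if m is None:
        return None
    record = m.groupdict()
    # Split "GET /path HTTP/1.1" into its three parts.
    method, _, rest = record.pop("request").partition(" ")
    uri, _, protocol = rest.rpartition(" ")
    record.update(method=method, uri=uri, protocol=protocol)
    record["status"] = int(record["status"])
    record["size"] = 0 if record["size"] == "-" else int(record["size"])
    # Store the timestamp as ISO 8601 so every source uses one time format.
    parsed = datetime.strptime(record["time"], "%d/%b/%Y:%H:%M:%S %z")
    record["time"] = parsed.isoformat()
    return record


if __name__ == "__main__":
    print(normalize('203.0.113.9 - - [17/Mar/2020:10:15:32 +0800] '
                    '"GET /index.html?id=1 HTTP/1.1" 200 512 "-" "curl/7.68.0"'))
```

Agreeing on one record shape early keeps the later stages (storage, analysis, alerting) from having to special-case each log source.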

Platform optimization can be roughly broken down as follows (this is as much as Little B can think of):

The key is continuous operation and the steady accumulation of capabilities and data.

The above are some key points of the business case for a unified log analysis platform that Little B reported to the leader.

The topic of log analysis is hard to write about at Little B's current level, so please make do with this article.

☑ Next: Little B sets out to build the unified log analysis platform!


