Tracking report on the global black-industry chain of database collision attacks in the big data era

Posted by tzul at 2020-03-02

Recently, CCTV exposed a bizarre case of telecommunication fraud. The victim received no unknown phone call or text message, and the phone was not infected with malware, yet all the money in the account was stolen without any apparent cause.

As the investigation deepened, the police found that criminals had taken the user's password leaked from other websites and used "database collision" to work out the login password of the user's online bank, then used unconventional means to change the mobile phone number bound to the online bank account.

Chinese people are used to seeing things in terms of Yin and Yang: the more prosperous something is on the surface, the more rampant its dark side often is. I remember seeing a report, "BOT traffic report 2016", at the beginning of the year, which said that in 2016 bot traffic accounted for 51.8% of total network traffic, exceeding human traffic, and that malicious bot traffic accounted for 28.9% of total network traffic.

How to capture useful data from this huge volume of malicious traffic has always fascinated us. The team has done long-term research on this and has deployed many data probes across the network, capturing a large amount of first-hand data. This gives us a chance to glimpse a corner of this dark field, and it is the basis for our series of reports on black-industry big data. Today's theme is: global database collision tracking.

1、 What is a database collision attack

To put it simply, taking someone's account and password from website A and trying to log in to website B with them is a database collision attack (commonly known in English as credential stuffing).

In the early years, account theft relied mainly on Trojans, with password dictionaries generated by software. With the frequent website database leaks of recent years, database collision attacks have gradually become the mainstream way of stealing accounts and an important part of account-type attacks. The following figure shows the whole account-type attack chain:


Drag library (database exfiltration): hackers steal user data from valuable websites.

Laundering: hackers cash out the assets or virtual assets in user accounts, or the account information itself.

Social engineering database: hackers correlate the various databases they obtain and build all-round profiles of users.

Targeted attack: hackers carry out targeted criminal activities, such as fraud, against specific people or groups according to user profiles.

2、 Where to & from

Having answered the first of the three philosophical questions, "what is it", we now turn to the other two: where it comes from and where it goes.

1. Source data for database collision

To carry out a database collision attack, hackers first need enough original account data. Analyzing the database collision attacks captured on the network, we find that the original data comes mainly from the following sources:

1) "Envelope number" industry chain

An "envelope number" is a stolen QQ number. The envelope number industry chain covers stealing QQ numbers, selling them and profiting from them. Tens of millions of stolen QQ numbers flow into this chain every day on the Internet black market. Originally, QQ account passwords were only valuable within Tencent. However, because QQ mailboxes are so widely used, many people register on other websites with their QQ mailbox and the same password as their QQ number, so large numbers of stolen QQ numbers are used directly for database collision against websites.

2) Leaked website databases

The landmark event in website database leaks was the CSDN leak of 2011. The leak of six million users' data set off a peak of data leaks that year. User data from dozens of websites was disclosed, and a large amount of leaked data that had previously circulated only underground was put on the table. This gave hackers who had not previously paid attention to this path enough data to move in this direction, and it ignited the upsurge of database collision attacks.

Similar events, such as the leak of hundreds of millions of accounts from an email service in 2015, have provided hackers with important ammunition. What's more, the leaked data that becomes public is only the tip of the iceberg.

3) Underground black market circulation

Data theft and trading is almost the deepest part of the underground industry chain. Many hackers build huge social engineering databases through data trading. We cannot know or objectively evaluate how much website data has been stolen, but some semi-open channels offer a glimpse. Below is a screenshot of an underground data trading market on the dark web:

2. Source data classification

According to recent database collision data, accounts in email-address form make up about 1/4 of the total, and accounts in mobile-phone-number form make up 5.8%.

3. Attack data by country

Based on billions of recent database collision attacks worldwide, we aggregated and analyzed the data and drew the following global attack maps:

1) Attack traffic direction

It can be seen that China and the United States account for the vast majority of the collision attacks.

2) Proportion of attacked companies in various countries

More than half of the attacked companies are from China.

3) Distribution of attack sources

At the same time, China is also the largest source of attacks, followed by hackers from the Russian sphere, which includes Russia, Ukraine, Belarus and other former Soviet countries.

4. Differences in attack data between China and the United States

On many questions, China's data differs markedly from overseas data, so we specifically compare the two largest Internet countries, China and the United States.

1) Type of company attacked

Chinese hackers clearly target game companies and show a strong tendency toward cashing out. Because the domestic black industry is highly industrialized and there are so many game players, the game industry bears the brunt among attacked companies.

The attacked industries in the United States are relatively balanced.

2) Source of attack

The vast majority of attacks on Chinese companies come from within China, mainly because it is difficult for overseas Internet companies to enter the domestic market. The market and the language form a double isolation, so even hacker attacks are self-contained: mostly produced and consumed domestically.

In contrast, the United States is the favorite target of hackers worldwide, and the Russians, nicknamed the "fighting nation" on the Chinese Internet, are once again in the thick of the fight.

5. Major affected industries

1) Game industry

The game industry can be said to be the most profitable on the whole Internet, so its underground market has naturally attracted a large number of practitioners and produced many cash-out schemes and interest chains. The game industry has always been the focus of hackers, from Trojans to cheat writing, from studios to private servers, from power-leveling to trading in gold and equipment. As a result, game companies bear the brunt of such attacks.

2) Copyright industry

With the promotion of book, film and audio resources and the growth of bandwidth, many such resources can now be viewed online for a fee. When users are unwilling to spend time hunting for downloads and would rather pay a much smaller sum for a premium member account, those accounts acquire cash value.

3) Social industry

The grey business of social networking sites mainly includes but is not limited to the following categories:

Inflating follower counts, likes, rankings and view counts

Private advertising

Erotic social interaction


As social platforms keep upgrading their risk control strategies, old accounts (those registered long ago) have become a hot resource in some circles. For example, an old account on one well-known stranger-social app sells for more than 30 yuan. Holding such accounts lowers the chance of being banned, which makes the businesses above relatively sustainable. Where there are people, there is business, so social accounts have always been an important target of the black industry.

3、 Attack method & mainstream prevention and control

By monitoring and analyzing massive numbers of attacks, we can see both the attack methods of hackers and the prevention and control measures of vendors.

1. How hackers attack

1) Judge whether the account exists

On many websites, the registration form uses AJAX to verify in real time whether the account name is available, and marks the field with a check if it is. Hackers widely use this interface to determine whether a user name is registered on the website.

Some websites return revealing information when a login fails. For example, returning distinct prompts such as "account does not exist" or "password error" lets a hacker judge whether the account exists. The recommended response here is a single message: "wrong account or password".

In the password retrieval process, some websites show a confirmation prompt after the mobile phone number or email is entered, which hackers often use to determine whether the account exists.
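The recommendation about login prompts can be illustrated with a minimal sketch. It assumes a hypothetical in-memory user store (`USERS`) and a hypothetical `login` function; neither comes from any real system, and the plain SHA-256 digest is only there to keep the example short. The point is that an unknown account and a wrong password take the same path and return the same message.

```python
import hashlib
import hmac
from typing import Tuple

# Hypothetical user store: account name -> SHA-256 hex digest of the password.
# A real system would use a salted, slow password hash instead.
USERS = {"alice": hashlib.sha256(b"correct horse").hexdigest()}

GENERIC_ERROR = "wrong account or password"

# Digest used for comparison when the account is unknown, so that both
# failure cases perform the same comparison step.
_DUMMY_DIGEST = hashlib.sha256(b"").hexdigest()


def login(username: str, password: str) -> Tuple[bool, str]:
    """Return (success, message) without revealing whether the account exists."""
    stored = USERS.get(username, _DUMMY_DIGEST)
    supplied = hashlib.sha256(password.encode("utf-8")).hexdigest()
    digests_match = hmac.compare_digest(stored, supplied)
    if digests_match and username in USERS:
        return True, "ok"
    return False, GENERIC_ERROR
```

With this shape, `login("alice", "bad")` and `login("nobody", "bad")` produce identical results, so the response alone no longer tells an attacker which accounts are registered. The registration check and password retrieval flows would need the same treatment to close the other two channels.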

2) Prominent problems in centralized management of business security

According to our statistics, the main login entry of many websites usually has relatively strict audit measures, which trigger a verification code or block the IP based on the login IP, frequency and so on. However, as a company's business grows and security management becomes much more complex, sub-sites each use their own login verification, and the lack of a unified login interface is exposed. For example, the login function of a sub-product, or a forum attached to the company's website, often goes through a separate login interface. When these edge business interfaces are not connected to the audit function, they become a hotbed for hacker attacks.

We see a lot of this in the attack data we have captured. After being countered many times, hackers can still find new collision interfaces that have no risk control logic. Some of these login interfaces are unknown even to the company's own security department. As the saying goes, a thousand-mile dike is destroyed by an ant nest: however many defensive measures the main business has, one neglected marginal business renders them all useless.
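One way to picture a unified login audit is a single registration point that every login entry must pass through. The following is only a sketch under assumptions: the decorator `login_endpoint`, the registry `LOGIN_ENDPOINTS`, the shared `risk_check` and the sample handlers are all hypothetical names invented for illustration.

```python
from typing import Callable, Dict

# (username, password, client_ip) -> login success
LoginHandler = Callable[[str, str, str], bool]

# Hypothetical shared state: a block list and a registry of every login entry.
BLOCKED_IPS = {"203.0.113.7"}
LOGIN_ENDPOINTS: Dict[str, LoginHandler] = {}


def risk_check(client_ip: str) -> bool:
    """Placeholder shared policy; returns False when the request must be refused."""
    return client_ip not in BLOCKED_IPS


def login_endpoint(name: str) -> Callable[[LoginHandler], LoginHandler]:
    """Register a login handler and wrap it with the shared risk check."""
    def wrap(handler: LoginHandler) -> LoginHandler:
        def guarded(username: str, password: str, client_ip: str) -> bool:
            if not risk_check(client_ip):
                return False
            return handler(username, password, client_ip)
        LOGIN_ENDPOINTS[name] = guarded
        return guarded
    return wrap


@login_endpoint("main_site")
def main_login(username: str, password: str, client_ip: str) -> bool:
    return (username, password) == ("alice", "secret")


@login_endpoint("forum")
def forum_login(username: str, password: str, client_ip: str) -> bool:
    return (username, password) == ("alice", "secret")
```

Because the forum handler is wrapped the same way as the main site, a blocked IP is refused on both, and `LOGIN_ENDPOINTS` doubles as an inventory of login interfaces that the security department can review.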

3) Attack effect

As we all know, database collision is a conventional risk. The core issue is not how to avoid it completely but the relative cost of attack and defense. From a large volume of attack monitoring, we can compile statistics on the efficiency and success rate of black-industry database collision. One fact emerges: long-term database collision causes serious damage. In the industry there is often news that a vendor's database has been dragged, but in most cases it is data exposed through database collision that finally catches the media's attention:

According to statistics on a large volume of black-industry database collision data, attacks that successfully bypass risk control strategies account for 83% of all attacks, and the success rate of database collision is about 0.4%.
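To get a feel for these two percentages, here is a small worked calculation. It rests on two assumptions that are not in the data above: that the 0.4% success rate is measured against all attempts (the denominator is not stated), and an illustrative volume of one million attempts. The function name is hypothetical.

```python
from typing import Tuple


def collision_outcomes(attempts: int,
                       bypass_pct: int = 83,
                       success_per_mille: int = 4) -> Tuple[int, int]:
    """Return (attempts that bypass risk control, accounts successfully hit).

    Integer arithmetic is used so the result is exact.
    Assumes the success rate applies to all attempts.
    """
    bypassing = attempts * bypass_pct // 100
    compromised = attempts * success_per_mille // 1000
    return bypassing, compromised


print(collision_outcomes(1_000_000))  # (830000, 4000)
```

Under these assumptions, one million attempts would leave about 830,000 requests untouched by risk control and about 4,000 accounts with a working password in the attacker's hands.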

2. Mainstream prevention and control

Having covered how hackers attack, let us see how vendors prevent and control it.

1) Main protective measures

IP policy: based on an IP blacklist, or on the number of requests and the password error rate from a single IP, the site decides whether to block that IP's requests for a period of time.

Verification codes: the most widely deployed scheme, with many types such as distorted letters, Chinese character recognition, moving sliders and image selection. Ordinary vendors attach a verification code to every login, while those with backend analysis capability trigger it only when the backend audit detects an anomaly, to improve the experience of ordinary users.

SMS verification: real-person authentication based on the cost of a mobile phone and a mobile phone number.

Behavior analysis: judgments based on user behavior during the login process, such as page dwell time, mouse focus, page access flow and CSRF token.

Device information: the client, especially the mobile client, reports a range of machine information that is used to identify forged devices.

In essence, all of the above solutions address one question: whether there is a real person on the other side of the computer.
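The IP-based measure (request count and password error rate per IP) can be sketched as a sliding-window monitor. This is a minimal illustration, not a description of any vendor's system: the class name `IpLoginMonitor` and all thresholds are assumptions, and timestamps are passed in explicitly so the behavior is easy to test.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Tuple


class IpLoginMonitor:
    """Block an IP for a while when it sends too many logins or fails too often."""

    def __init__(self, window_s: float = 60.0, max_requests: int = 20,
                 max_error_rate: float = 0.8, min_samples: int = 5,
                 ban_s: float = 600.0) -> None:
        self.window_s = window_s
        self.max_requests = max_requests
        self.max_error_rate = max_error_rate
        self.min_samples = min_samples
        self.ban_s = ban_s
        self._events: Dict[str, Deque[Tuple[float, bool]]] = defaultdict(deque)
        self._banned_until: Dict[str, float] = {}

    def is_blocked(self, ip: str, now: float) -> bool:
        return self._banned_until.get(ip, 0.0) > now

    def record(self, ip: str, success: bool, now: float) -> None:
        """Record one login attempt and update the ban state for this IP."""
        events = self._events[ip]
        events.append((now, success))
        # Drop attempts that have fallen out of the sliding window.
        while events and events[0][0] <= now - self.window_s:
            events.popleft()
        failures = sum(1 for _, ok in events if not ok)
        too_many = len(events) > self.max_requests
        too_wrong = (len(events) >= self.min_samples
                     and failures / len(events) > self.max_error_rate)
        if too_many or too_wrong:
            self._banned_until[ip] = now + self.ban_s
```

A login handler would call `is_blocked` before checking the password and `record` afterwards. As the following sections show, a check like this only raises the attacker's cost, since a large enough IP pool spreads the attempts below any per-IP threshold.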

3. Bypass of mainstream prevention and control

In the face of huge profits, no one is willing to sit and wait for death. Black-industry practitioners do not just respond actively to vendors' countermeasures. Like the vendors, they have organized into platforms and supply chains to carry out counter-countermeasures. From the attacks we have detected, hackers have well-developed schemes to fight vendors in every dimension, mainly in the following aspects:

1) Low security edge business or new business

Faced with strict protection logic, the fastest way through is to find its gaps. Once an edge business interface without strict auditing is found, all protective measures are bypassed, like entering a no man's land.

Vendors often lack effective monitoring in this dimension, precisely because these are the interfaces the security department has overlooked. However, when we analyze black-industry traffic big data from a third-party perspective, this trick has nowhere to hide. As soon as anyone starts attacking a new interface, it falls within our monitoring range, which can greatly improve vendors' response speed to this kind of vulnerability.

2) IP confrontation

As a scarce Internet resource, the IP address is one of the most important dimensions of vendors' risk control. How to obtain a large number of exit IPs is therefore the first problem the black industry needs to solve. In fact, it is not just the black industry: many crawlers, search engines and bots have similar needs. Through long-term reverse tracking of the sources of many database collision attacks, we find the following main ways of obtaining IP resources:

Free proxies: scanning the whole network for common proxy server ports, collecting available proxy IP addresses, and managing and maintaining them oneself. No fee, but high maintenance cost and low efficiency.

Paid proxies: proxy providers offer proxy servers around the world obtained by scanning, building or exchanging, which effectively cuts the management cost of collecting proxies oneself.

VPN: similar to paid proxies, but based on different technology.

In the past two years, we have observed a new type of IP acquisition scheme, called dial-up VPS or dynamic VPS, gradually being adopted by the black industry in China. This kind of VPS is also a virtual server, but it accesses the Internet by dialing through ADSL, so it can draw on a large pool of IPs across a whole city. That may not sound special, since any home ADSL line does the same, but what follows is remarkable. The suppliers have implemented dial-up across many provinces and cities nationwide, commonly known as mixed dial: a single account on a single VPS can quickly and randomly switch among ADSL lines in nearly 100 cities to reach the Internet, which puts great pressure on the risk control departments of many enterprises.

The effect in actual use is as follows:

3) Verification code confrontation

The verification code is one of the simplest and most widely used automated Turing tests. Over the past decade, a large number of companies and teams have tried to solve verification codes automatically, and in response they have been upgraded many times, to the point where they are hard even for humans. In China, however, black-industry entrepreneurs rely on low-cost labor and crack verification codes that technology cannot recognize in the most brute-force way possible: manual solving. The practice has spread to many third-world countries and now provides a living for nearly one million people worldwide. From this comes one of the black-industry supplier platforms: the verification code solving platform (often literally translated as "coding platform").

According to the data we have obtained, the average pay per code is 1-2 fen (0.01-0.02 yuan). Skilled workers can solve about 20 codes per minute, for an hourly income of 10-15 yuan.

4) SMS verification confrontation

Website developers are used to considering security countermeasures from their own perspective, and assume that receiving SMS is bounded by the cost of phones and SIM cards, but unexpected innovations keep arriving. Consider the following device, commonly known as a "cat pool" (a modem pool): it manages 256 SIM cards in a unified way, and software then exposes the received verification codes to outside queries through an API. From this comes another segment of black-industry suppliers: the code receiving platform.

Black-industry users only need to pay 1-3 yuan to receive a verification code from such platforms, on a different mobile phone number each time. For a sense of the business volume, see the "love code" platform case investigated and punished by the Public Security Bureau in November 2016. The following figure is a picture of the scene. That platform alone had more than 7 million black-market SIM cards used for receiving verification codes, most of them tied to the black industry. Interested readers can search for the details of the case.

5) Imitating real people's behavior

To evade the backend's behavior analysis models, hackers do much more than fill in a User-Agent when submitting requests. Analyzing the flow of hackers' database collision attacks, we find that, to evade backend analysis, many hackers' flows include but are not limited to:

Complete page opening process instead of just submitting requests to key interfaces

Parameters such as CSRF token are complete

Random page dwell time

HTTP header strictly follows browser characteristics

Randomize seemingly unimportant parameters

4. Common problems of mainstream risk control

From our analysis of the effects of various database collision attacks, vendors commonly face the following problems:

No risk control strategy on "check whether user exists" interfaces

Login interface returns sensitive information when login fails

The account system lacks a unified management mechanism, and new or marginal businesses often bypass risk control schemes

Verification based on IP, SMS and verification codes turns into a confrontation with professional platforms

4、 Summary and Outlook

1. New attack trends

1) Database collision is gradually replacing Trojan-based theft as the mainstream attack mode

From our continuous monitoring and mining of the underground black industry, we see that, with the exposure of various leaked databases, the share of database collision traffic has risen significantly in recent years. At the same time, as operating system and browser security keeps improving, stealing accounts through downloads or website registration is gradually giving way to database collision, which has become the main way of obtaining user accounts.

2) Platformization of black production resources

Compared with the lone fighters of the early days, the black industry now resembles an aircraft carrier battle group. Hackers hold the basic account database as their core resource, supplying firepower but little defense. Other confrontation resources are supplied directly by various auxiliary resource platforms, so vendors' confrontation shifts from hackers to those platforms. With so many "meat shields" in the way, risk control auditing becomes harder and harder. These resource platforms include but are not limited to the following categories:

Proxy provider

VPN provider

Dial-up VPS provider

SMS code receiving platform

Verification code solving platform


3) Rapid development of dial-up VPS

Compared with the mature proxy server scheme, dial-up VPS, especially mixed-dial VPS, provides a larger IP pool and more randomized geographic locations within China. This technology can be expected to see wider use in the black industry.

2. New risk control

1) Core account resource confrontation

According to attack and defense data on database collision, most current risk control countermeasures are in effect fought against the frigates of the black-industry battle group, that is, its supporting resource platforms. Because each of those platforms concentrates on one point, optimizes continuously and grows stronger, the risk control measures become less and less effective. By contrast, there is a lack of effective countermeasures against the hackers' core resource: the leaked account database.

Imagine that the account and password a hacker is using in a database collision attempt could be recognized as belonging to an account already leaked on the Internet, and that this directly triggered the risk control logic. All of the hacker's peripheral protection would then be bypassed, and the confrontation would take place at the one point the other side cannot avoid.
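The idea of matching a login attempt against known leaked credentials can be sketched as follows. Everything here is an assumption made for illustration: the `LeakedCredentialIndex` class, the `fingerprint` scheme (a plain SHA-256 over the normalized account and the password) and the action names returned by `assess_login`. It is a sketch of the principle, not a description of any existing product.

```python
import hashlib
from typing import Iterable, Set, Tuple


def fingerprint(account: str, password: str) -> str:
    """Hash an (account, password) pair so the index never stores plaintext."""
    material = (account.strip().lower().encode("utf-8")
                + b"\x00"
                + password.encode("utf-8"))
    return hashlib.sha256(material).hexdigest()


class LeakedCredentialIndex:
    """Hypothetical index of credential pairs known to have leaked."""

    def __init__(self, leaked_pairs: Iterable[Tuple[str, str]]) -> None:
        self._fingerprints: Set[str] = {
            fingerprint(account, password) for account, password in leaked_pairs
        }

    def is_leaked(self, account: str, password: str) -> bool:
        return fingerprint(account, password) in self._fingerprints


def assess_login(index: LeakedCredentialIndex, account: str, password: str) -> str:
    """Return a hypothetical risk action for a login attempt."""
    if index.is_leaked(account, password):
        return "step_up_verification"
    return "normal_flow"
```

Usage: build the index once from whatever leaked data is available, then call `assess_login` on each attempt. The check does not depend on the IP, device or verification code of the request, which is why a change of proxy or solving platform does not help the attacker here.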

This recalls the sword duel between Linghu Chong and the Taoist priest Chongxu in Xiaoao Jianghu. Facing the flawless Taiji sword technique, the only option is to strike straight through the center of the circle of sword light. The place that seems to have no flaw is the flaw.

The idea is simple, but this kind of confrontation depends on holding more leaked data than the black industry does. Beyond collecting leaked data from the Internet, other more real-time and effective ways of supplementing the data are needed.

2) Black production big data monitoring

Based on long-term research on malicious traffic, we continuously monitor a variety of black-industry traffic through different schemes, capturing hundreds of millions of original attack requests every day and, for database collision specifically, accumulating millions of original account records daily. This provides a new dimension for account-type risk control confrontation, and a continuous and powerful safeguard.

Why could the Dugu Nine Swords roam the world unmatched? The "broken sword style" is only one form, yet it encompasses the swordsmanship of every school in the world, and on that basis it can break every variation. This is typical big data thinking: in front of the data, there are no secrets! Continuous monitoring of the core account resources used in black-industry attacks may be the real inner method for breaking database collision in account risk control.

Who we are: the Threat Hunter team has focused on security for many years and is committed to solving business security risk control problems for Internet companies. The team's first product, "account treasure", the first risk account detection as a service platform (cadaas) in China, is dedicated to solving the risk of database collision and auditing high-risk accounts.

*Author: threat hunter, reprint from