Build a scanner to automatically scan the whole network for vulnerabilities

Posted by trammel at 2020-03-09

Author: langzi


In penetration testing, scanners are essential: there are many targets and many detection points, far too many to handle by hand, so most penetration testers have their own automation tools or scripts. Here I will share an automatic whole-network vulnerability scanner that I developed myself.

Scanning principle

The scanner is built with Python + MySQL and is mainly used to automatically collect websites and scan them for common vulnerabilities. The goal is to automatically discover sensitive information, website vulnerabilities, or hidden exploitable flaws while the scanner runs unattended.

The core engineering functions must meet the following requirements:

1. Crawl indefinitely to collect live URL links across the Internet

2. Verify by scanning whether each collected link is alive

3. Balance the load between the MySQL database and the server
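The first requirement, crawling indefinitely from already-collected pages, can be sketched roughly like this. This is a minimal illustration, not the scanner's actual code; the function name and regex are my own assumptions:

```python
import re
from urllib.parse import urljoin, urlparse

# Hypothetical helper: pull absolute http(s) links out of a fetched page,
# so they can be queued in the database for further crawling.
HREF_RE = re.compile(r'href=["\'](.*?)["\']', re.IGNORECASE)

def extract_links(html, base_url):
    """Return the set of absolute http(s) URLs found in `html`."""
    links = set()
    for href in HREF_RE.findall(html):
        url = urljoin(base_url, href)
        if urlparse(url).scheme in ("http", "https"):
            links.add(url)
    return links
```

Each extracted link would then be checked for survival (requirement 2) before being stored, so the crawl queue only ever holds reachable targets.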

Vulnerability scanning and verification functions:

1. Backup file scanning

2. SVN/Git/source-code leak scanning, including web info scanning

3. Editor vulnerability scanning

4. Automatic detection of SQL injection vulnerabilities

5. Verification of websites using the Struts2 framework (exploitable)

6. Website IP resolution and dangerous port scanning

7. CMS type identification (main function)

Software flow

Result display

As shown in the figure, everything found while the scanner runs unattended is displayed: backup files, sensitive information leaks, injection points, CMS type identification, Struts2 framework detection, open ports, and so on. As long as the vulnerability reports are written in detail and you stay diligent, they can pass review; there is no need to grind for vulnerabilities by hand to build up rank.

User interaction mode

Using a MySQL database means the question of database configuration cannot be avoided. First, the vulnerability information collected by the software must be stored somewhere, so I wrote a SQL file describing the database structure and let the user execute it to create the database.
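A minimal sketch of what such a database setup file might contain. The table and column names here are my own guesses for illustration, not the scanner's actual schema:

```sql
-- Hypothetical schema sketch; the real SQL file ships with the scanner.
CREATE DATABASE IF NOT EXISTS scanner DEFAULT CHARSET utf8mb4;
USE scanner;

-- Collected links, deduplicated, with survival and crawl state.
CREATE TABLE IF NOT EXISTS urls (
    id      INT AUTO_INCREMENT PRIMARY KEY,
    url     VARCHAR(512) NOT NULL UNIQUE,
    alive   TINYINT DEFAULT 0,   -- set by the HEAD survival check
    crawled TINYINT DEFAULT 0
);

-- One row per finding: backup / svn / git / editor / sqli / st2 / port ...
CREATE TABLE IF NOT EXISTS vulns (
    id        INT AUTO_INCREMENT PRIMARY KEY,
    url       VARCHAR(512) NOT NULL,
    vuln_type VARCHAR(64),
    detail    TEXT,
    found_at  TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
```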

Second, the database connection. Since everyone's environment differs in database address, account, password, and so on, this step requires user interaction. Should the user be asked for the database credentials every time the software runs, or should they go into a configuration file? I chose the latter and created a config.ini, which holds not only the database configuration but also switches for the software's core functions. For example, if you only want to collect websites indefinitely and scan for backup files, you can set that up in this file.

To avoid everyone crawling the same data, my idea is to let the user supply some URLs as seeds and then start crawling indefinitely from them. On the first run, the user should be prompted to import the seed URLs. What about the second run? Asking the user to import seeds and start over every time would be far too clumsy. My method is a flag in the config.ini mentioned above: on the first run it is set and saved, and before each subsequent run the scanner checks whether the seeds have already been imported. (You can extend this: with infinite crawling, some memory is not garbage-collected in time, so the whole scan slows down after a while; an automatic restart function is therefore needed.)
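The first-run flag can be handled with the standard-library configparser. This is a sketch under my own assumptions: the section name `scanner` is hypothetical, while the key name `new_start` follows the article's later description of config.ini:

```python
import configparser

CONFIG_FILE = "config.ini"

def is_first_run(path=CONFIG_FILE):
    """True when new_start = 1, or when the config file does not exist yet."""
    cp = configparser.ConfigParser()
    if not cp.read(path):
        return True
    return cp.get("scanner", "new_start", fallback="1") == "1"

def mark_started(path=CONFIG_FILE):
    """Flip new_start to 0 once the seed URLs have been imported."""
    cp = configparser.ConfigParser()
    cp.read(path)
    if not cp.has_section("scanner"):
        cp.add_section("scanner")
    cp.set("scanner", "new_start", "0")
    with open(path, "w") as f:
        cp.write(f)
```

On startup the scanner would call `is_first_run()`, prompt for the seed file if it returns True, then call `mark_started()` so later runs go straight to the database.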

The collected data must of course be viewed, saved, and used, and it all lives in the database. I considered writing a separate export tool for users unfamiliar with MySQL, but later decided it was unnecessary, as if it insulted everyone's intelligence. To summarize the idea: first write two files => config.ini (database and scanner configuration) & database.sql (database setup statements) => the user collects some websites themselves and saves them in a TXT file in the current directory => configure the files and environment, then start the scanner.

Engineering framework

There are many functions to implement, so for future maintainability each feature is wrapped in a function that takes a URL parameter, and the parameters passed in are validated. There are many details to watch in this validation. For example, in the infinite collection function, each collected website is visited once first to check whether it is alive, so that unreachable sites are not added to the database, wasting time and resources. Survival should be verified with a HEAD request, judging by the returned status code.
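A minimal sketch of the HEAD-based survival check, using only the standard library (the original likely uses a third-party HTTP client; the function names and the 2xx/3xx alive rule are my assumptions):

```python
import urllib.error
import urllib.request

def status_ok(code):
    """Treat 2xx/3xx status codes as alive, 4xx/5xx as dead."""
    return 200 <= code < 400

def is_alive(url, timeout=5):
    """Send a HEAD request and judge survival from the status code alone."""
    req = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return status_ok(resp.status)
    except urllib.error.HTTPError as e:
        return status_ok(e.code)       # got a response, just an error status
    except (urllib.error.URLError, OSError):
        return False                   # no response at all
```

HEAD avoids downloading the page body, which matters when millions of links must be triaged before being stored.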

To identify the CMS type, you should not rely on CMS fingerprint scanning alone; also search the page for keywords and check robots.txt. For SQL scanning, I initially crawled pages directly for suspicious injection points, then appended single quotes, brackets, backslashes, and so on, matching database error messages in the response. The approach was correct, but it did not fit well into the overall engineering, so in version 0.98 I created a new database table to store the crawled injection candidates. If you feel the scanner's injection detection methods are not comprehensive enough, you can also export these crawled links and run sqlmap -m on them to test the injection points in batch. There are many other details to note here.
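The three CMS checks, tried in order with a short-circuit, could look like the sketch below. The fingerprint paths and keyword lists are placeholders (a real scanner ships thousands of entries), and `fetch` is an injected callable so the sketch stays testable without network access:

```python
# Placeholder fingerprint data; the real scanner's lists are far larger.
PATH_FINGERPRINTS = {"/wp-login.php": "WordPress", "/administrator/": "Joomla"}
PAGE_KEYWORDS = {"wp-content": "WordPress", "Powered by Discuz!": "Discuz"}
ROBOTS_KEYWORDS = {"/wp-admin/": "WordPress", "Disallow: /administrator/": "Joomla"}

def cms_by_fingerprint(fetch, base_url):
    """Method 1: probe well-known CMS paths."""
    for path, cms in PATH_FINGERPRINTS.items():
        if fetch(base_url + path) is not None:
            return cms
    return None

def cms_by_page(html):
    """Method 2: search the homepage for CMS keywords."""
    for kw, cms in PAGE_KEYWORDS.items():
        if kw in html:
            return cms
    return None

def cms_by_robots(robots_txt):
    """Method 3: search robots.txt for CMS-specific rules."""
    for kw, cms in ROBOTS_KEYWORDS.items():
        if kw in robots_txt:
            return cms
    return None

def identify_cms(fetch, base_url, html, robots_txt):
    """Try the three methods in order; stop at the first hit."""
    return (cms_by_fingerprint(fetch, base_url)
            or cms_by_page(html)
            or cms_by_robots(robots_txt))
```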

Improve fault tolerance and optimization

There are bound to be false positives during scanning. For example, to verify websites using the Struts2 framework, my approach is to append common URL suffixes and then judge by the keywords and status codes in the returned page. False positives are inevitable here; the improvement is to collect more keywords from false-positive pages and filter them out. The same goes for editor vulnerabilities: I only implemented scanning and verification for eWebEditor and FCKeditor, so coverage is not comprehensive (limited personal energy).
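The keyword-plus-status-code judgement with a false-positive filter, as described above, might be sketched like this. Both keyword lists are illustrative guesses, not the scanner's actual data:

```python
# Keywords that suggest a Struts2 stack (illustrative, not exhaustive).
ST2_HINTS = [".action", ".do", "struts"]

# Keywords collected from pages known to trigger false positives.
FALSE_POSITIVE_HINTS = ["404 Not Found", "Welcome to nginx"]

def looks_like_st2(status_code, body, url):
    """Flag a page as Struts2 only if a hint matches and no FP keyword appears."""
    if status_code != 200:
        return False
    if any(fp in body for fp in FALSE_POSITIVE_HINTS):
        return False
    return any(h in url or h in body for h in ST2_HINTS)
```

Growing `FALSE_POSITIVE_HINTS` over time is exactly the improvement the text calls for: each confirmed false positive contributes a new filter keyword.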

Generally speaking, no one wants to leave their own precious computer running the scanner; it runs on a server. Mine is a Tencent Cloud host with 1 core, 1 Mbps bandwidth, and 2 GB of RAM. Running version 0.95 with three threads, CPU usage stays around 40%, which is fine. Running 0.98 with the same thread count, CPU usage stays around 90%, and the server load is too high to run other services. So I added thread synchronization and optimized a few places, and CPU usage dropped to 20-40%, but the whole scan slowed down. I then tried five threads, which keeps the CPU roughly stable at 80%. With automatic restarts at a user-defined interval, usage stays roughly between 40% and 70%, which still needs further optimization.

Code function supplement

For example, when scanning for backup files, the method is to load a dictionary of backup file names. But some websites name their backup files after the domain name itself.

For such backup files, the domain name needs to be cut up and spliced into the dictionary. Of course, the suffixes cannot be limited to rar, zip, bak, sql, tar.gz. The three CMS detection methods are tried one by one: if the first method successfully identifies the CMS, the next two are skipped. This saves resources and time and avoids wasted requests; you only need to know which CMS the website uses, not how many methods can detect it.
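Cutting up the domain and splicing it with the suffixes could look like the sketch below; the function name and the extra suffixes beyond those listed in the text are my own additions:

```python
from urllib.parse import urlparse

# Suffixes from the text plus a few extras, since the list should not be limited.
SUFFIXES = ["rar", "zip", "bak", "sql", "tar.gz", "7z", "tgz"]

def backup_candidates(url):
    """Build domain-derived backup file names, e.g. example.zip, www.example.com.rar."""
    host = urlparse(url).netloc.split(":")[0]   # strip any port
    parts = host.split(".")
    stems = {host}                               # www.example.com
    if len(parts) >= 2:
        stems.add(".".join(parts[-2:]))          # example.com
        stems.add(parts[-2])                     # example
    if parts and parts[0] == "www":
        stems.add(".".join(parts[1:]))           # example.com (www stripped)
    return sorted(f"{stem}.{suf}" for stem in stems for suf in SUFFIXES)
```

Each candidate is then appended to the site root and probed just like the regular dictionary entries.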

Then there is SQL injection. The order in which sqlmap tests injection techniques is B (boolean-based blind), E (error-based), U (UNION-based), S (stacked queries), T (time-based). Generally speaking, whatever a database injection supports through error-based, UNION, or other methods, blind injection can also find, which makes it the most comprehensive and fault-tolerant detection method. However, the detection method this scanner currently uses is E (error-based); the others will be added in later updates. As for the other editor vulnerabilities, their exploitation conditions are limited and the ones reported publicly are mostly years old, so I have not added them.
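Error-based (E) detection, appending payloads and matching database error messages as described earlier, might be sketched like this. The payload list and error signatures are common examples, not the scanner's actual data, and `fetch` is injected so the sketch needs no network:

```python
import re

# Common database error signatures for error-based ("E") detection.
DB_ERROR_PATTERNS = [
    r"You have an error in your SQL syntax",            # MySQL
    r"Warning: mysql_",                                 # PHP + MySQL
    r"ORA-\d{5}",                                       # Oracle
    r"Microsoft OLE DB Provider for SQL Server",        # MSSQL
]
ERROR_RE = re.compile("|".join(DB_ERROR_PATTERNS), re.IGNORECASE)

# Single quote, bracket, backslash, etc., as mentioned in the text.
PAYLOADS = ["'", '"', "')", "\\"]

def error_based_probe(fetch, url):
    """Append each payload to the URL; return the first payload whose
    response contains a database error message, else None."""
    for payload in PAYLOADS:
        body = fetch(url + payload)
        if body and ERROR_RE.search(body):
            return payload
    return None
```

Links that trigger an error here would be written to the injection-candidates table for later confirmation, e.g. with sqlmap -m in batch.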

Yoland Liu sensitive intelligence scanner

One day I happened to chat with Peiyao about this topic. Seeing her interest, I explained the scanner's core functions and engineering design in detail, but the old architecture code was too ugly (which is why I dare not open-source it / covers face) and I no longer wanted to maintain it, so I handed the whole project to her. Based on the core idea, she wrote a new version and added new functions, which gave the 0.98 version I later kept updating a more complete framework. To thank Ms. Liu for her outstanding contribution, the scanner was named Yoland Liu.

Advantages and disadvantages

Advantages: deduplication of database data; three new CMS verification methods, including page keyword identification and robots.txt keyword matching; detection of websites using the Struts2 framework; database optimization; the original code rewritten to be more stable and easier to maintain.

Disadvantages: although the overall framework was optimized, memory garbage collection was not handled and threading is not well controlled, so the scanner gets slower and slower when left running. For practical reasons, some vulnerability scanning functions were not added, so the work of building those wheels fell to me. Later updates are based on version 0.95 plus the other vulnerability scanning functions.

Additional explanation

Although version 0.95 is not memory-optimized, the automatic restart function works around the problem. Compared with the 0.98 version I wrote later, 0.95 is faster (it has fewer scanning functions) and more stable. Version 0.98 adds many functions, but the server load is heavy: even though I added memory optimization and automatic restart in 0.98, it can open at most 5 threads (CPU around 80%) and at least 3 threads (CPU around 40%). Note that multithreading requires at least 3 threads.

Latest version 0.98 download address:

Password: frwp (default unzip password: langzi); this version was first released on Xinan Road.

Configuration file

After decompressing, pay attention to the following files, which relate to database configuration, personalized scan settings, thread settings, and importing the initial websites to scan.

Usage method

1. First install MySQL database

2. Execute the mysql.sql file, then refresh to find that a new database has been created

3. Configure config.ini. The upper part holds the database configuration settings; the lower part holds the software's scanning function settings.

Note that config.ini is a configuration file. The database configuration at the top is easy to understand; you can set the database address to localhost, or to the address of the host where MySQL runs.

For more flexible use, you can selectively enable functions via parameters set to 0 or 1, where 0 means off and 1 means on. For example, if you only want CMS detection, set cmsscan = 1. In addition, thread_s corresponds to the number of threads. new_start controls whether this is the first scan; the default is 1. While it is 1, you will be prompted at every start to import some websites as seeds; once the program detects that the seed websites are loaded, it automatically changes this parameter to 0.
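Putting the options described above together, a config.ini might look like the fragment below. The keys cmsscan, thread_s, and new_start come from the text; the section names and database keys are my own guesses for illustration:

```ini
[database]
host = localhost
port = 3306
user = root
password = your_password
dbname = scanner

[scanner]
; 1 = on, 0 = off: set cmsscan = 1 (and other scans to 0) to only detect CMS
cmsscan = 1
; number of threads (minimum 3 for multithreading)
thread_s = 3
; 1 = first run: prompt to import the seed-URL text file; auto-reset to 0
new_start = 1
```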

About importing the initial websites: collect some URLs and save them in a text file in the main program folder; when prompted to import, enter the file name. On subsequent runs no configuration is needed; the scanner automatically pulls data from the database and crawls indefinitely. If you have any questions about this, you can contact me on QQ.


A year before this scanner was born, in February 2017, I wrote a failed scanner (iosmosis scan), so the latest 0.98 version still vaguely shows the shadow of ios scan. Undeniably, ios scan was a failure, but most of its framework blueprint, such as the backup file and source-leak functions, carried over into the current scanner. ios scan also integrated database, FTP, Telnet, and other brute-forcing functions... it was still a bit clumsy.

Future updates will continue to add new functions, following this scanner's core idea: infinite, perpetual, automatic crawling. Infinite automatic detection is the soul of this scanner, like a tireless spider weaving an ever-larger web. The scanner will always be updated free of charge; please stay tuned.