Exploration and Practice of Mobile Security in Internet Companies

Posted by fierce at 2020-03-14


With the spread of the mobile Internet, mobile applications have replaced the traditional PC and web and now cover almost all Internet services. At the same time, competing products in the industry and the attention of the underground economy (the "black industry") have gradually shifted to mobile business, which has directly intensified attack and defense on mobile in recent years. The attack techniques that grow out of this are endless. Attackers' technical schemes become more sophisticated through continuous confrontation, and their attacks become better concealed.

Mobile application security is an important layer of the complex mobile security ecosystem, and it is the layer most closely tied to enterprise business security. System-level vulnerabilities can cause serious harm that is hard to defend against, but on the one hand such threats occur less often, and on the other hand, when they do occur, an enterprise can usually only arrange temporary emergency measures to protect the affected business. Therefore, for the time being, we set system-level security aside and focus on the mobile application and business security that Internet enterprises should care about most.

In today's mobile ecosystem, the security issues that, as I understand them, have a great impact on the business fall into three main categories:

Products derived from reverse analysis, such as promotion-abuse ("wool pulling") tools, data crawlers, and cheat plug-ins

Data leakage caused by lax authentication on front-end and back-end interfaces

Security risks of varying severity caused by component permissions and unclear input boundaries

Drawing on my own experience in the security industry, I have done some thinking and exploration on the issues above, trying to gradually break down the complex question of how to secure mobile services. Over the past year we have put some specific schemes into practice, and here we share some still-immature thoughts and experience from that exploration.


Compared with traditional IT companies, the share of mobile business in Internet enterprises has grown greatly, and the business is more complex. To secure different mobile services in a timely and comprehensive way, I will analyze the causes of security threats and their solutions from three directions.

Besides using technology to build products that counter risk, what we tend to overlook is standardized mobile security requirements and broad mobile security training. At present, except for a few companies such as Google and Amazon, the ratio of internally discovered to externally discovered security risks at most companies is not encouraging. Companies also fall into two types: those that are unaware of the risks when the product is released, and those that have fixed the known vulnerabilities by the time of release. Even in the second case, a version that keeps running online will expose new problems over time. If such problems go unnoticed, they eventually become security risks that threaten the business itself.

The security development lifecycle (SDL) aims to minimize security-related vulnerabilities in a product's design, code, and documentation throughout its life cycle, and to eliminate vulnerabilities as early as possible before a product version ships. Awareness training and written standards, however, have limited effect on business R&D staff in their daily work. They can only serve as norms that guide and constrain day-to-day code iteration. Because people are unreliable, we must have corresponding means of supervision to make sure the rules are actually followed and to turn risk into controllable risk as far as possible. We need the ability to "shut the door and beat the dog"; otherwise every rule is empty talk.

Where is the best place to put the "shut the door and beat the dog" audit node? This question deserves thought. If the product is a white-box audit scanner, the git commit stage seems the more reasonable place, but consider an extreme situation: if the company has no unified coding standard and the business code is chaotic, is it really appropriate to set the audit node there?

I prefer to audit the compiled artifacts. A black-box audit can be placed at several points. Even once we settle on the CI (continuous integration) platform, we still need to decide whether the initial audit granularity is the whole app or the SDK component. A security team that serves the business with an operations mindset must first avoid interfering excessively with normal development. Judged by whether the SLA of the audit node can keep disruption to the business within expectations, I do not think auditing the whole app is feasible. Following a canary-style rollout, I recommend setting the granularity at the SDK component level. This has several advantages: first, it does not interfere excessively with the business side; second, modular development is already the norm at companies above a certain scale, and it is foreseeable that the audit node on the CI platform will be formally switched to blocking mode only after the existing backlog of issues has been dealt with.

Automated security audit

In my opinion, auditing mobile app security purely by piling on manpower is not cost-effective. When the business ships a major version, a manual audit and a manual review of the automated results are necessary, but for daily version iterations manual audit is of little value in most cases, so automated monitoring is expected for that situation. At least five parts are needed to quickly build an automated audit platform:

For applications on the Android platform, we parse the relevant structures in the DEX file format and complete part of the audit work by matching against a set of rules. At present there are two different approaches to static analysis of Android programs: one audits automatically over Java pseudo-code, the other audits automatically over smali assembly. Both have obvious advantages and disadvantages:

The first approach decompiles the DEX file into Java pseudo-code with tools such as dex2jar (a self-developed tool is also possible, and the implementation is not very difficult), then applies the configured rules as a "pseudo white-box audit" through regular-expression matching to find the files that carry security risks. The rules are relatively simple to write. It sounds like an elegant implementation, but some functions cannot be restored to Java pseudo-code by dex2jar, so some findings slip through the net, and the running time of a single job is greatly prolonged by the conversion to Java pseudo-code.
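The regular-expression matching over Java pseudo-code described above can be sketched roughly as follows. This is a minimal illustration rather than the platform's actual rule engine: the rule names, the `scan_java_source` helper, and the sample source are all hypothetical, and only the Android API names being matched (`setJavaScriptEnabled`, `MODE_WORLD_READABLE`, `Log.d`) are real.

```python
import re

# Hypothetical rule set: rule name -> regex applied to each line of decompiled Java.
JAVA_RULES = {
    "webview_js_enabled": re.compile(r"\.setJavaScriptEnabled\(\s*true\s*\)"),
    "world_readable_file": re.compile(r"MODE_WORLD_READABLE"),
    "log_output": re.compile(r"\bLog\.(?:d|v|i)\s*\("),
}


def scan_java_source(path, text):
    """Return a list of (path, line_number, rule_name) findings."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in JAVA_RULES.items():
            if pattern.search(line):
                findings.append((path, lineno, name))
    return findings


# Stand-in for one decompiled file (assumption: the decompiler output is plain text).
JAVA_SAMPLE = """\
public class MainActivity extends Activity {
    void init(WebView view) {
        view.getSettings().setJavaScriptEnabled(true);
        Log.d("auth", token);
    }
}
"""

for finding in scan_java_source("MainActivity.java", JAVA_SAMPLE):
    print(finding)
```

Because the rules only see text, a method that the decompiler failed to restore never reaches the matcher at all, which is the source of the missed findings mentioned above.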

The second approach analyzes smali assembly, which avoids these problems: dex2smali-type tools decompile the DEX file into smali. The overall idea is similar to designing a disassembly engine. Before rule matching or anything else can be implemented, the most basic task is to determine function boundaries. Fortunately, we do not have to work as hard at this as when designing a disassembly engine: the .method and .end method directives make it easy to separate function code blocks. A "pseudo black-box audit" by rule matching then yields the .smali file name and line number that carry security risks. Compared with the first approach, this detection has finer granularity and can quickly locate a specific method in a specific class. Because there is no conversion to Java pseudo-code, the running time of a single job is greatly shortened, but the rules are relatively complex to write.
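A rough sketch of the smali approach, assuming the smali text is already available: method boundaries come from the `.method` and `.end method` directives, and each finding is reported with its file name, method, and line number. The rules and the `scan_smali` helper are hypothetical examples, not the rules used in production.

```python
import re

# Hypothetical rules: name -> regex applied to each smali instruction line.
SMALI_RULES = {
    "weak_cipher_ecb": re.compile(r'const-string [vp]\d+, "AES/ECB/'),
    "runtime_exec": re.compile(r"Ljava/lang/Runtime;->exec\("),
}


def scan_smali(file_name, text):
    """Return (file_name, method_signature, line_number, rule_name) findings."""
    findings = []
    current_method = None
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if line.startswith(".method "):
            # The last token of the directive is the method name plus descriptor.
            current_method = line.split()[-1]
        elif line == ".end method":
            current_method = None
        elif current_method is not None:
            for name, pattern in SMALI_RULES.items():
                if pattern.search(line):
                    findings.append((file_name, current_method, lineno, name))
    return findings


SMALI_SAMPLE = """\
.class public Lcom/example/Crypto;
.super Ljava/lang/Object;

.method public static encrypt([B)[B
    .locals 2
    const-string v0, "AES/ECB/PKCS5Padding"
    invoke-static {v0}, Ljavax/crypto/Cipher;->getInstance(Ljava/lang/String;)Ljavax/crypto/Cipher;
    move-result-object v1
    return-object p0
.end method
"""

for finding in scan_smali("Crypto.smali", SMALI_SAMPLE):
    print(finding)
```

The rules have to be written against register-level instructions rather than readable Java, which is where the extra rule complexity comes from.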

Applications on the iOS platform cannot be audited in pseudo-code form the way Android applications can. Fortunately, most iOS projects on the market are developed in Objective-C, and because of how the runtime is implemented, class names, method names, and property names are stored as strings in the Mach-O file. In most cases, regular-expression matching on key strings is enough to detect unsafe functions, frameworks, and so on.
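The string-based detection above can be approximated with nothing more than a scan for printable runs in the binary, in the spirit of the Unix `strings` tool. The sketch below does not parse the Mach-O structure at all, and the blocklist, the `scan_binary` helper, and the fake input bytes are assumptions made for illustration.

```python
import re

# Hypothetical blocklist of names worth flagging in an Objective-C binary.
UNSAFE_PATTERNS = {
    "unsafe_c_function": re.compile(rb"_?(?:strcpy|sprintf|gets)"),
    "deprecated_webview": re.compile(rb"UIWebView"),
    "verbose_logging": re.compile(rb"_?NSLog"),
}

# Runs of at least 4 printable ASCII bytes.
PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{4,}")


def scan_binary(data):
    """Return sorted (rule_name, matched_string) pairs found in raw bytes."""
    hits = set()
    for match in PRINTABLE_RUN.finditer(data):
        candidate = match.group()
        for name, pattern in UNSAFE_PATTERNS.items():
            # fullmatch avoids flagging longer names that merely contain a keyword.
            if pattern.fullmatch(candidate):
                hits.add((name, candidate.decode("ascii")))
    return sorted(hits)


# Stand-in for bytes read from a Mach-O file: a magic number, then NUL-separated names.
FAKE_MACHO = (
    b"\xcf\xfa\xed\xfe\x00_strcpy\x00UIWebView\x00viewDidLoad\x00\x01\x02_NSLog\x00"
)

for hit in scan_binary(FAKE_MACHO):
    print(hit)
```

A real implementation would read the symbol and Objective-C metadata sections instead of scanning the whole file, which reduces false positives from unrelated data.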

Relying only on information extracted from the Mach-O file and the other files in the IPA package does not meet the needs of static audit. A considerable part of the matching work is based on fixed signatures in ARM assembly. If we want to go deeper than signature matching in this part, introducing a disassembly engine to implement more interesting and practical functions is a good choice. For the disassembly engine, Capstone is recommended. This engine was ported and split out from components of the LLVM project, and the range of instruction sets it supports is the most comprehensive among the known disassembly engines. Unfortunately, because it was ported from LLVM (whose code is C++), the project is "bloated" and its memory consumption is much higher than that of a typical engine, which is a blemish on an otherwise fine tool.

There is not much to say about dynamic audit here. Generally speaking, it can be divided into several categories: log output, data storage, network requests, use of sensitive data, iOS background snapshots, and so on. Everyone's approach is much the same: hook a set of similar points, such as the APIs that send and receive network traffic and those of some open-source frameworks.

Two points may be worth discussing. First, the data obtained from dynamic audit can be reused in other directions. For example, the requests and responses of network calls can be fed to a web scanner to scan for back-end problems at the same time. Second, to do automated audit well, the question is not how to widen the scope of detection but how to increase the depth of automation and reach more logical paths in the business code. The common practice is to start with traversal of UI controls. But like hooking, this method has an unavoidable problem: "the deeper the instrumentation point, the more messages and events have to be processed". The visible result is that a job cycle becomes very slow. At present, we can only pick an acceptable middle value between depth and usability. Finally, data visualization and accumulation are very important and can effectively shorten emergency response time.
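The depth trade-off in UI traversal can be illustrated with a toy model. The sketch below assumes the app's screens form a small graph (the `SCREENS` dictionary is invented for the example) and walks it breadth-first with a depth cap; a real traversal would drive actual UI controls through an automation framework.

```python
from collections import deque

# Hypothetical screen graph: screen -> screens reachable by tapping a control.
SCREENS = {
    "home": ["search", "profile"],
    "search": ["results"],
    "profile": ["settings", "orders"],
    "results": ["detail"],
    "settings": [],
    "orders": ["detail"],
    "detail": ["pay"],
    "pay": [],
}


def traverse(start, max_depth):
    """Breadth-first walk; screens at max_depth are recorded but not expanded."""
    visited = [start]
    queue = deque([(start, 0)])
    while queue:
        screen, depth = queue.popleft()
        if depth == max_depth:
            continue
        for nxt in SCREENS.get(screen, []):
            if nxt not in visited:
                visited.append(nxt)
                queue.append((nxt, depth + 1))
    return visited


for depth in (1, 2, 4):
    print(depth, traverse("home", depth))
```

Raising `max_depth` reaches deeper business logic such as the payment screen, but every extra level multiplies the number of screens and events a job has to process, so the cap is the knob that trades coverage against job time.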

Code layer reinforcement

Automation can remove the common risks before a product goes online, but some deeply hidden security problems survive the security audit. These hidden dangers run on users' devices as soon as the version ships. What we can do is raise the cost of exploiting them. After the app goes online it faces many risks: besides the problems not found during the release life cycle mentioned above, it also risks being analyzed, hijacked, and exploited by third parties. How to protect the app's code therefore becomes the key question.

The evolution of Android reinforcement can be roughly divided into the four generations described in the figure above. There were also some intermediate steps along the way, such as the first generation's move from decrypting to disk before loading to dynamic loading without touching disk, and the hybrid use of first- and second-generation reinforcement before mobile virtualization protection took shape. These finer details are not shown in the figure above.

From the attacker's perspective, first-generation reinforcement encrypts the DEX file as a whole and loads it through a custom DexClassLoader implementation. Vendors later upgraded this so that the DEX file is no longer decrypted to disk before loading; it is decrypted directly in memory and then loaded with the custom DexClassLoader. This matches the early idea of loading a DLL from memory on the Windows platform, so I prefer to call this improvement memory loading. Either way, unpacking first-generation reinforcement is simple: hooking the file read/write functions or dumping memory is enough.

With second-generation reinforcement, the DEX file in the protected app is actually "incomplete". At run time, the method bodies that were extracted from the DEX are mapped back and filled in within dynamically allocated memory so that function calls can proceed. Unpacking this kind of reinforcement requires hooking certain functions in the virtual machine's class and method handling to obtain the specific classes and methods, then dumping memory and repairing the DEX file structure.

We started with virtualization when designing the first phase of our reinforcement, which is the third generation mentioned above. This type of product needs its own instruction set interpreter, so the implementation cost is relatively high. Building on the previous generation's code extraction idea, the methods to be protected are extracted from the DEX file, translated into vopcode (virtual operation code) according to the instruction set rules of the self-developed virtual machine interpreter, moved to the native layer, and invoked from Java through the JNI interface. Unprotected methods continue to run on the stock Android virtual machine.
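To make the vopcode idea concrete, here is a toy sketch of translating a method into a private instruction encoding and running it on a matching interpreter. It assumes a four-instruction stack machine invented for this example; the real product translates Dalvik instructions and runs the interpreter in native code, neither of which is modeled here.

```python
import random

# Plain instruction names for a toy stack machine (stand-ins for real instructions).
MNEMONICS = ["PUSH", "ADD", "MUL", "RET"]


def make_opcode_table(seed):
    """Assign each mnemonic a private opcode number, shuffled per build."""
    codes = list(range(len(MNEMONICS)))
    random.Random(seed).shuffle(codes)
    return dict(zip(MNEMONICS, codes))


def translate(program, table):
    """Translate (mnemonic, operand) pairs into a flat vopcode list."""
    vopcode = []
    for mnemonic, operand in program:
        vopcode.append(table[mnemonic])
        if mnemonic == "PUSH":
            vopcode.append(operand)
    return vopcode


def interpret(vopcode, table):
    """Run the vopcode on an interpreter that knows the private table."""
    names = {code: name for name, code in table.items()}
    stack, pc = [], 0
    while pc < len(vopcode):
        op = names[vopcode[pc]]
        pc += 1
        if op == "PUSH":
            stack.append(vopcode[pc])
            pc += 1
        elif op == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "MUL":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif op == "RET":
            return stack.pop()
    raise RuntimeError("program ended without RET")


# The protected "method": (2 + 3) * 4
PROGRAM = [("PUSH", 2), ("PUSH", 3), ("ADD", None),
           ("PUSH", 4), ("MUL", None), ("RET", None)]

table = make_opcode_table(seed=7)
print(interpret(translate(PROGRAM, table), table))
```

An analyst who dumps the translated bytes sees only numbers whose meaning depends on the interpreter's table, so the interpreter itself has to be reversed first. That is where the extra analysis cost comes from, and also the extra run-time cost.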

After the first phase of R&D, we ran a compatibility comparison between our self-developed reinforcement and several well-known reinforcement products. The test randomly selected 50 popular device models on a third-party test platform. The specific data are as follows:

Our final test data show that both the self-developed and the commercial reinforcement products are incompatible with some device models running Android 4.x (during development we found that on some models even the original, unreinforced APK fails to install or run). To solve this problem and improve the usability of our security products, our team will take reinforcement in a more stable direction, such as java2c, which converts Java logic into C code through instruction set translation, compiles it into an .so file with a specific compiler and a modified NDK (Native Development Kit), and calls it through the JNI interface at run time. This not only improves compatibility but also addresses the weak protection strength of traditional java2c. According to QA data, the performance overhead of our self-developed reinforcement is about one tenth of that of commercial products of comparable strength in the industry.

As for iOS, we can build a toolchain on the LLVM framework. The front end performs lexical and syntactic analysis on source files in different languages to form an abstract syntax tree (AST) and then converts the analyzed code into LLVM IR (intermediate representation). The pass/optimizer module then optimizes and obfuscates this middle-layer IR through a series of passes. The back end is responsible for translating the optimized IR into machine code for the target platform. The advantage of LLVM is that different front-end languages are all eventually converted into the same IR, and any part of the project structure can be used independently.

Because of App Store review, we cannot perform dynamic decryption at run time the way reinforcement and packers do on other platforms. There is an open-source project called ollvm (Obfuscator-LLVM), which comes from a university laboratory in Switzerland. Built on the LLVM compiler framework, it provides a set of compilers that obfuscate code at compile time to make reverse engineering harder. We also chose to build our iOS reinforcement on the LLVM framework, but because ollvm is so well known, many reverse engineers study how its features work through its code and documentation (wiki, comments). We therefore had to re-implement bogus control flow (BogusControlFlow), control-flow flattening (Flattening), and instruction substitution (Substitution), and add string encryption and some detection-assisting passes (this article was written in the middle of last year; since then we have also completed modifications to the LLVM back end, so the work is no longer limited to the front end).
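Control-flow flattening is easier to picture with a small before-and-after pair. The sketch below is written by hand in Python purely to show the shape of the transformation; ollvm and the author's passes perform it automatically on LLVM IR at compile time, and the function names and state numbering here are invented for the example.

```python
def accumulate(n):
    """Original control flow: a loop containing a branch."""
    total = 0
    i = 0
    while i < n:
        if i % 2 == 0:
            total += i
        else:
            total -= 1
        i += 1
    return total


def accumulate_flattened(n):
    """Same logic with every basic block placed under one dispatcher loop."""
    state = 0
    total = 0
    i = 0
    while True:
        if state == 0:      # loop condition
            state = 1 if i < n else 4
        elif state == 1:    # branch condition
            state = 2 if i % 2 == 0 else 3
        elif state == 2:    # even branch, then loop increment
            total += i
            i += 1
            state = 0
        elif state == 3:    # odd branch, then loop increment
            total -= 1
            i += 1
            state = 0
        elif state == 4:    # exit block
            return total


print(accumulate(5), accumulate_flattened(5))
```

In the flattened version the order of blocks is decided only by the value of `state` at run time, so a static view of the control-flow graph shows a single loop with sibling blocks instead of the original nesting.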

Besides code obfuscation, we also need to pay attention to symbol obfuscation. Our team developed symbol obfuscation before the mt-ollvm project, and the feature went through several iterations. At first we patched the Mach-O file after the linker had run, which was inelegant and made compatibility fixes time-consuming, so that approach was eventually abandoned. As we were preparing the upgrade, my team and I were developing the mt-ollvm project, so we integrated the function into it. However, this scheme has two usability problems. A company inevitably has a large number of internal pod libraries shared across teams, and compile-time obfuscation cannot support pod libraries, nor can it cleanly modify xib and storyboard files at compile time. To solve these two problems and improve usability, we combined the two earlier designs: extract the symbols from the compiled Mach-O file, then hook into the precompile stage and recompile to complete the obfuscation, which supports pod projects and dynamically modifies the xib and storyboard files. While promoting product coverage last year, we reflected on some shortcomings that appeared after the project went into use. In large projects, the doubled compilation time troubled the business side. Finally, we settled on a new scheme that extracts symbols through syntax analysis of the source files, replacing the original extraction from the Mach-O file, which greatly improved usability.
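The core of symbol obfuscation, once the symbols have been extracted, is a consistent rename applied to every place a name appears, including xib and storyboard files. The sketch below assumes the symbol list is already available and treats all inputs as plain text; the hashing scheme, helper names, and sample xib fragment are illustrative assumptions and not the mt-ollvm implementation.

```python
import hashlib
import re


def obfuscated_name(symbol, salt):
    """Derive a stable, identifier-safe replacement for one symbol."""
    digest = hashlib.sha256((salt + symbol).encode()).hexdigest()
    return "z" + digest[:12]


def build_symbol_map(symbols, salt, keep=()):
    """Map each symbol to its replacement, skipping names that must stay as-is."""
    return {s: obfuscated_name(s, salt) for s in symbols if s not in keep}


def rewrite_text(text, symbol_map):
    """Replace whole-word occurrences of mapped symbols in source or xib text."""
    if not symbol_map:
        return text
    # Longest names first so a short name never shadows a longer alternative.
    names = sorted(symbol_map, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, names)) + r")\b")
    return pattern.sub(lambda m: symbol_map[m.group(1)], text)


symbols = ["PayViewController", "submitOrder", "viewDidLoad"]
# System callbacks such as viewDidLoad must keep their names.
mapping = build_symbol_map(symbols, salt="build-42", keep=("viewDidLoad",))

xib = '<viewController customClass="PayViewController"/>'
print(rewrite_text(xib, mapping))
```

Using the same map for source files and interface files is what keeps a renamed class reachable from a storyboard, which is the problem the compile-time-only scheme could not solve cleanly.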

Communication layer reinforcement

Code obfuscation is one of the most direct and effective ways to counter reverse analysis, but people often overlook that when only the binary is reinforced, the communication between client and server remains a weak point and gives attackers an entry point. In the cases I have seen, teams tend to feel at ease once the binary is reinforced, but this is a "cognitive blind spot". For example, if an analyst only cares about crawling data through the app, and the app's requests and responses contain no fields that need to be encrypted or decrypted, the analyst can easily get what he wants. Fortunately, most important or commonly used interfaces of enterprise mobile apps today carry some fields that must be computed by an algorithm, and the protection of that part depends on whether the anti-analysis capability of the reinforcement product used by the app is strong enough to stop junior and intermediate analysts.

When the client sends a request to the server, a verification value is computed over the request and attached to it for transmission to the back end. A corresponding verification script at the nginx layer checks each request sent from the client. If verification fails, the request is terminated; if it succeeds, the request is passed through to the business server.
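The article does not say how the verification value is computed, so the following is only one plausible shape for the flow above: an HMAC-SHA256 over the method, path, timestamp, and body, recomputed on the gateway side. The secret, the field layout, and the five-minute freshness window are all assumptions made for illustration, and the gateway check is shown as a Python function rather than an nginx script.

```python
import hashlib
import hmac

# Hypothetical shared secret; on a real client it would sit behind the reinforcement.
SECRET = b"demo-secret"
MAX_SKEW_SECONDS = 300


def sign_request(method, path, body, timestamp, secret=SECRET):
    """Client side: compute the verification value for one request."""
    header = "\n".join([method.upper(), path, str(timestamp)]).encode()
    return hmac.new(secret, header + b"\n" + body, hashlib.sha256).hexdigest()


def verify_request(method, path, body, timestamp, signature, now, secret=SECRET):
    """Gateway side: recompute the value and reject stale or tampered requests."""
    if abs(now - timestamp) > MAX_SKEW_SECONDS:
        return False
    expected = sign_request(method, path, body, timestamp, secret)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected, signature)


body = b'{"id":1}'
sig = sign_request("POST", "/api/order", body, timestamp=1000)
print(verify_request("POST", "/api/order", body, 1000, sig, now=1010))
print(verify_request("POST", "/api/order", b'{"id":2}', 1000, sig, now=1010))
```

The scheme is only as strong as the secrecy of the signing logic on the client, which is why the communication layer and the code layer have to be protected together.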

When designing a product, we usually need one or more fallback schemes so that it can cope with emergencies. I think communication-layer reinforcement and code-layer reinforcement relate in the same way: each is the other's fallback. However good the communication-layer reinforcement is, it cannot remove the weakness that the client is easily analyzed through hooks, and however good the code-layer reinforcement is, it cannot completely remove the hidden danger of external protocol analysis. If your company happens to have excellent products for both, the best choice is to have them support each other, which greatly improves the security of mobile products. Of course, we can also add anti-analysis countermeasures, environment detection, and self-verification functions to add more defensive layers and better guarantee data integrity and confidential transmission.


Nothing is absolute in attack and defense. Across the life cycle of a mobile product, even if we have client-side attack and defense countermeasures, security detection before release, integrity verification at the network communication layer during operation, and standardized SDL checks, this series of defenses can only help us block 80% to 90% of attacks. In the end, some seasoned attackers will still manage to do things we would rather not describe. Once an Internet enterprise's mobile business reaches a certain scale, the solutions produced by mobile security also need to connect more closely to protecting the company's business, rather than confining oneself and the team to the traditional attack and defense dilemma. Therefore, now that the basic mobile security components described above are online, we will continue to explore a closer fit with business security while deepening our work in the existing fields.

At an appropriate time I will share our business security practice built on mobile security. See the next installment~