1. Foreword
The openness and convenience of the web have driven extremely rapid development, but they have also introduced many hidden dangers, especially for the protection of core code. Since the author began working in web front-end development, he has rarely come across solutions to this problem. "Front-end code has no secrets" seems to be the industry consensus. In day-to-day development, however, we do need reasonably strong protection for core front-end code, especially in data communication with the back end (HTTP and HTTPS requests as well as WebSocket data exchange).
Consider a scenario. In video products we usually need security logic to prevent unauthorized linking or rebroadcasting. For live video in particular, the stream is usually divided into segments; the URL parameters for each segment are generated by an agreed algorithm, and the segments are requested one by one. A segment usually lasts 5 to 10 seconds. If segment URLs were obtained entirely through a back-end interface, this would put heavy pressure on the back end and add latency to live-stream requests. We therefore usually place part of the implementation on the front end to reduce back-end load and improve the experience. On iOS or Android we can write the algorithm in C/C++, compile it into a .dylib or .so, and obfuscate it to make cracking harder, but the front end has no comparable technique. Since asm.js and WebAssembly became widely available, we can use them to further strengthen the security of our core code. However, because the asm.js and WebAssembly standards are open, the protection they provide is weaker than one might expect.
This article first reviews the currently popular approaches to protecting core front-end code, with brief implementations, and then describes a more secure and reliable approach (SecurityWorker) for reference and improvement. The author is not a professional front-end security practitioner, so his understanding of some security techniques may be one-sided or incomplete. Comments and discussion are welcome.
2. Using a JavaScript obfuscator
JavaScript obfuscators are familiar from day-to-day development. We often use them to compress and obfuscate code, which reduces its size and makes it harder for humans to read. Commonly used tools include:
- UglifyJS
- Google Closure Compiler
- YUI Compressor
- ...
The principle behind a JavaScript obfuscator is not complicated. At its core it performs AST (abstract syntax tree) transformations on the target code. By relying on an existing JavaScript AST parser library, we can easily implement our own obfuscator. Below, we use acorn to rewrite a snippet that contains an if statement.
Suppose we have the following snippet:
for(var i = 0; i < 100; i++){
    if(i % 2 == 0){
        console.log("foo");
    }else{
        console.log("bar");
    }
}
Obfuscating it with UglifyJS gives the following result:
for(var i=0;i<100;i++)i%2==0?console.log("foo"):console.log("bar");
Now let's write our own obfuscator that achieves the same effect as UglifyJS on this snippet:
const {Parser} = require("acorn");
const MyUglify = Parser.extend();
const codeStr = `
for(var i = 0; i < 100; i++){
    if(i % 2 == 0){
        console.log("foo");
    }else{
        console.log("bar");
    }
}
`;
// Turn an AST node back into compact source text.
function transform(node){
    const { type } = node;
    switch(type){
        case 'Program':
        case 'BlockStatement':{
            const { body } = node;
            return body.map(transform).join('');
        }
        case 'ForStatement':{
            const results = ['for', '('];
            const { init, test, update, body } = node;
            results.push(transform(init), ';');
            results.push(transform(test), ';');
            results.push(transform(update), ')');
            results.push(transform(body));
            return results.join('');
        }
        case 'VariableDeclaration': {
            const results = [];
            const { kind, declarations } = node;
            results.push(kind, ' ', declarations.map(transform).join(','));
            return results.join('');
        }
        case 'VariableDeclarator':{
            const {id, init} = node;
            return transform(id) + '=' + transform(init);
        }
        case 'UpdateExpression': {
            // Only the postfix form (i++) is handled in this example.
            const {argument, operator} = node;
            return argument.name + operator;
        }
        case 'BinaryExpression': {
            const {left, operator, right} = node;
            return transform(left) + operator + transform(right);
        }
        case 'IfStatement': {
            // Rewrite if/else as a conditional (ternary) expression.
            const results = [];
            const { test, consequent, alternate } = node;
            results.push(transform(test), '?');
            results.push(transform(consequent), ":");
            results.push(transform(alternate));
            return results.join('');
        }
        case 'MemberExpression':{
            const {object, property} = node;
            return object.name + '.' + property.name;
        }
        case 'CallExpression': {
            const results = [];
            // Renamed on destructuring: "arguments" cannot be declared in strict mode.
            const { callee, arguments: args } = node;
            results.push(transform(callee), '(');
            results.push(args.map(transform).join(','), ')');
            return results.join('');
        }
        case 'ExpressionStatement':{
            return transform(node.expression);
        }
        case 'Literal':
            return node.raw;
        case 'Identifier':
            return node.name;
        default:
            throw new Error('unimplemented operations');
    }
}
const ast = MyUglify.parse(codeStr, { ecmaVersion: 2020 });
console.log(transform(ast)); // same as the UglifyJS output above, except for the trailing semicolon
Of course, the implementation above is only a simple example. A real obfuscator is far more complex and has to handle many syntactic details. It is provided here only as a learning reference.
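As a further illustration of what an AST pass looks like, here is a minimal sketch of identifier renaming, another common obfuscation step. The tree is written by hand in the ESTree shape that acorn produces, so the snippet runs without dependencies. The helper names `renameIdentifiers` and `generate` and the `_0x` naming scheme are assumptions made for this example, not part of any library.

```javascript
// Hand-written ESTree-style AST for: total = price + tax
// (acorn produces nodes of the same shape, plus position information).
const ast = {
  type: 'AssignmentExpression',
  operator: '=',
  left: { type: 'Identifier', name: 'total' },
  right: {
    type: 'BinaryExpression',
    operator: '+',
    left: { type: 'Identifier', name: 'price' },
    right: { type: 'Identifier', name: 'tax' },
  },
};

// Walk the tree and replace every identifier with a short generated name.
// The same source name always maps to the same generated name.
function renameIdentifiers(node, names = new Map()) {
  if (node.type === 'Identifier') {
    if (!names.has(node.name)) {
      names.set(node.name, '_0x' + names.size.toString(16));
    }
    return { ...node, name: names.get(node.name) };
  }
  const copy = { ...node };
  for (const key of Object.keys(node)) {
    const child = node[key];
    if (child && typeof child === 'object' && typeof child.type === 'string') {
      copy[key] = renameIdentifiers(child, names);
    }
  }
  return copy;
}

// Minimal code generator for the three node types used above.
function generate(node) {
  switch (node.type) {
    case 'Identifier':
      return node.name;
    case 'AssignmentExpression':
    case 'BinaryExpression':
      return generate(node.left) + node.operator + generate(node.right);
    default:
      throw new Error('unimplemented node type: ' + node.type);
  }
}

const obfuscated = generate(renameIdentifiers(ast));
console.log(obfuscated); // _0x0=_0x1+_0x2
```

A real tool also has to respect scoping: globals such as `console` and exported names must keep their original spelling, which this sketch does not attempt.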
As the implementation above shows, a JavaScript obfuscator only turns JavaScript code into another, less readable form, which makes human analysis harder and thereby improves security. This worked well long ago, but as developer tools have become more powerful, it is now easy to recover the original core algorithm through single-step debugging. Many tools have since improved on this. javascript-obfuscator is a representative project; it adds features such as anti-debugging, identifier prefixes, and variable obfuscation to improve security. Even so, the obfuscated code is still plain text. With enough patience and the help of developer tools it can still be restored, so the protection remains limited.
3. C/C++ extensions with Flash
While Flash was still popular, Adobe open-sourced CrossBridge to make it easier for engine developers to use C/C++ to improve the performance of Flash game engines. With CrossBridge, C/C++ code is compiled via LLVM IR into the target code required by the Flash runtime, which greatly improves both efficiency and security. Current open-source decompilers have difficulty decompiling C/C++ code compiled by CrossBridge, and because debugging is disabled in the production Flash runtime, single-step debugging is also difficult.
Using Flash's C/C++ extensions looks like an ideal way to protect core front-end code, but Flash has no foothold on mobile. Adobe has also announced that it will stop maintaining Flash in 2020, so there is little reason to adopt this as the main way to protect core front-end code.
That said, Flash still has a large share on the PC, and browsers older than IE10 still have a considerable share, so it can still be considered as a compatibility solution for the PC.
4. Using asm.js or WebAssembly
To address JavaScript's performance problems, Mozilla proposed asm.js, a subset of JavaScript that improves overall runtime performance from the perspective of JIT friendliness. Later, Mozilla worked with other vendors on standardization, which produced the WebAssembly standard.
Both asm.js and WebAssembly can be viewed as a brand-new VM: other languages produce executable code for this VM through the corresponding toolchains. From a security perspective, the protection is far stronger than that of a simple JavaScript obfuscator. Compared with Flash's C/C++ extensions, it represents the future direction and is already implemented by mainstream browsers.
Many languages and toolchains can generate WebAssembly. As an example, we use C/C++ with Emscripten to write a simple signing module.
#include <string>
#include <emscripten.h>
#include <emscripten/bind.h>
#include "md5.h"
#define SALTKEY "md5 salt key"
// Sign the input by hashing it together with the salt.
std::string sign(std::string str){
    return md5(str + std::string(SALTKEY));
}
// Export the sign function so that the JavaScript environment can call it.
EMSCRIPTEN_BINDINGS(my_module){
    emscripten::function("sign", &sign);
}
Next, we compile the C++ code with Emscripten to get the generated files.
em++ -std=c++11 -Oz --bind \
-I ./md5 ./md5/md5.cpp ./sign.cpp \
-o ./sign.js
Finally, we include the generated sign.js file and call it.
<body>
<script src="./sign.js"></script>
<script>
// output: 0b57e921e8f28593d1c8290abed09ab2
Module.sign("This is a test string");
</script>
</body>
At present, WebAssembly appears to be the most promising way to protect core front-end code. We can write the code in C/C++, compile it to asm.js and wasm with the Emscripten toolchain, and choose between asm.js and wasm depending on browser support. For PC browsers older than IE10, we can reuse the same C/C++ code through CrossBridge to produce Flash target code, which gives very good browser compatibility.
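Choosing between the two builds can be done with a small feature check. The following is a sketch under assumptions: the function name `pickBuild` and the file names `sign.wasm.js` and `sign.asm.js` are hypothetical, and the build step would have to emit both variants. `WebAssembly.validate` is part of the standard WebAssembly JavaScript API.

```javascript
// Returns 'wasm' when the runtime can validate a WebAssembly module,
// otherwise 'asmjs' (asm.js is plain JavaScript, so it runs everywhere).
function pickBuild() {
  // The smallest valid module: the "\0asm" magic number plus version 1.
  const emptyModule = new Uint8Array([0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00]);
  const hasWasm =
    typeof WebAssembly === 'object' &&
    typeof WebAssembly.validate === 'function' &&
    WebAssembly.validate(emptyModule);
  return hasWasm ? 'wasm' : 'asmjs';
}

// Hypothetical file names for the two build variants.
const scriptUrl = pickBuild() === 'wasm' ? './sign.wasm.js' : './sign.asm.js';
console.log(scriptUrl);
```

In a page, the chosen URL would then be loaded with a dynamically created script element.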
But once we use asm.js/wasm, can we stop worrying about protecting core front-end code? Because the asm.js and wasm specifications are completely open, a well-implemented asm.js/wasm decompiler could produce reasonably readable code from which the core algorithm can be analyzed. Fortunately, the author has not yet found a good asm.js/wasm decompiler, so for now this method is still a reasonable way to protect core front-end code.
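To make the openness concrete, the sketch below inspects a tiny hand-assembled wasm module using only the standard JavaScript API and the public opcode table. The module bytes and the `disassemble` helper are written for this example; the helper covers only the three opcodes that appear here and is in no way a real decompiler.

```javascript
// A hand-assembled module that exports add(i32, i32) -> i32.
const addModuleBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,                   // magic + version
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f,             // type section
  0x03, 0x02, 0x01, 0x00,                                           // function section
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00,             // export section: "add"
  0x0a, 0x09, 0x01, 0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b, // code section
]);

// The standard API lists every export without running any code.
const wasmModule = new WebAssembly.Module(addModuleBytes);
const exportsList = WebAssembly.Module.exports(wasmModule);
console.log(exportsList.map((e) => e.kind + ' ' + e.name)); // [ 'function add' ]

// The opcodes are documented in the public specification, so the function
// body can be read directly. Only the opcodes used above are handled.
function disassemble(code) {
  const out = [];
  for (let i = 0; i < code.length; i++) {
    const op = code[i];
    if (op === 0x20) out.push('local.get ' + code[++i]);
    else if (op === 0x6a) out.push('i32.add');
    else if (op === 0x0b) out.push('end');
    else throw new Error('opcode not covered by this sketch: 0x' + op.toString(16));
  }
  return out;
}

// The instructions of the single function start at byte offset 35.
const listing = disassemble(addModuleBytes.subarray(35));
console.log(listing.join('; ')); // local.get 0; local.get 1; i32.add; end

const instance = new WebAssembly.Instance(wasmModule);
console.log(instance.exports.add(1, 2)); // 3
```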
5. SecurityWorker: a better idea and its implementation
The author often writes core front-end code at work, most of it related to communication, such as encrypting and decrypting Ajax request data or WebSocket protocol data. For this work the author usually uses the asm.js/wasm plus CrossBridge solution described above. The solution works fairly well, but it still has several major problems:
- It is unfriendly to front-end developers. Most front-end engineers are not familiar with C/C++, Rust, and related technology stacks
- The huge NPM ecosystem cannot be used, which adds a lot of work
- In the long run the cost of cracking will not remain high, so security needs to be improved further
So we spent two weeks writing a better protection scheme for core front-end code based on asm.js/wasm: SecurityWorker.
5.1 Goals
The goal of SecurityWorker is simple: to let us write core algorithm modules with strong protection as comfortably as possible. Broken down, it has to satisfy the following 8 points:
- Code is written in JavaScript, avoiding technology stacks such as C/C++ and Rust
- NPM libraries can be used smoothly, connecting with the front-end ecosystem
- The final code is as small as possible
- Protection is strong enough that the execution logic and core algorithm of the target code are completely hidden
- Multiple environments are supported: browser, mini program, and Node.js
- Compatibility is good, covering all mainstream browsers
- It is easy to use and reuses concepts from existing standards
- It is easy to debug: source code is not obfuscated, and error messages are accurate and specific
Next, we explain step by step how SecurityWorker achieves these goals and describe its principles in detail for reference and improvement.
5.2 Implementation principle
How can we improve security on top of WebAssembly? As noted earlier, one of WebAssembly's weaker points in terms of security is that its specification is public. Could we solve this by building a private, independent VM on top of WebAssembly? The answer is yes, so the first problem is how to build an independent JavaScript VM on top of WebAssembly. This is not hard, and many projects can serve as references, such as js.js, which is compiled from SpiderMonkey. We did not consider SpiderMonkey, however, because the wasm code it generates reaches 50 MB, which has little practical value in a size-sensitive environment such as the web. Fortunately, there are many embedded ECMAScript engines:
- JerryScript
- V7
- Duktape
- Espruino
- ...
After comparing them, we chose Duktape as our base VM. The resulting execution process is shown in the figure.
The figure also reveals a major risk in this process. Because our code is embedded into the C/C++ source as an encrypted string before compilation, an attacker can wait until the code has been decrypted during execution and then extract the core code from memory at that moment, as the next figure shows.
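The risk can be reproduced in a few lines of plain JavaScript. This is only an analogy under stated assumptions: a single-byte XOR stands in for whatever cipher is actually used, `new Function` stands in for the embedded engine's evaluation step, and all names (`KEY`, `encrypt`, `decrypt`, `runProtected`) are hypothetical.

```javascript
// Hypothetical key; XOR is a stand-in for a real cipher.
const KEY = 0x5a;

function xorBytes(bytes) {
  return bytes.map((b) => b ^ KEY);
}

function encrypt(source) {
  return xorBytes(Array.from(source, (ch) => ch.charCodeAt(0)));
}

function decrypt(bytes) {
  return String.fromCharCode(...xorBytes(bytes));
}

// Build step: only the encrypted bytes are shipped.
const shipped = encrypt('return 6 * 7;');

// Run time: the loader has to rebuild the plain source before evaluating it.
function runProtected(bytes, evaluate) {
  const source = decrypt(bytes);
  return evaluate(source);
}

// Anyone who can observe the evaluation step sees the original code.
const captured = [];
const result = runProtected(shipped, (source) => {
  captured.push(source); // what a debugger or memory dump would reveal
  return new Function(source)();
});

console.log(result);      // 42
console.log(captured[0]); // return 6 * 7;
```

However strong the cipher, the plain source has to exist somewhere at the moment of evaluation, which is the weakness described above.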
How do we solve this? Our solution is to convert the JavaScript into another form: opcodes. For example, suppose we have this code:
1 + 2;
We transform it into an assembly-like sequence of instructions:
SWVM_PUSH_L 1 # push the value 1 onto the stack
SWVM_PUSH_L 2 # push the value 2 onto the stack
SWVM_ADD # add the two values and push the result onto the stack
Finally, we embed the compiled opcode bytes into the C/C++ source as a uint8 array and compile everything together, as shown in the figure.
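The idea of a byte-encoded opcode stream can be sketched with a tiny stack machine. The two instruction names come from the listing above, but the numeric opcode values, the one-byte literal encoding, and the `runProgram` interpreter are assumptions made for this example; the real SecurityWorker encoding is private and certainly differs.

```javascript
// Hypothetical opcode numbering; a private VM picks its own and keeps it secret.
const SWVM_PUSH_L = 0x01; // push the next byte as a literal
const SWVM_ADD = 0x02;    // pop two values, push their sum

// "1 + 2;" compiled to bytes, the form that would be embedded as a uint8 array.
const program = new Uint8Array([SWVM_PUSH_L, 1, SWVM_PUSH_L, 2, SWVM_ADD]);

// Fetch-decode-execute loop over the byte stream.
function runProgram(code) {
  const stack = [];
  let pc = 0;
  while (pc < code.length) {
    const op = code[pc++];
    switch (op) {
      case SWVM_PUSH_L:
        stack.push(code[pc++]);
        break;
      case SWVM_ADD: {
        const right = stack.pop();
        const left = stack.pop();
        stack.push(left + right);
        break;
      }
      default:
        throw new Error('unknown opcode 0x' + op.toString(16) + ' at offset ' + (pc - 1));
    }
  }
  return stack.pop();
}

console.log(runProgram(program)); // 3
```

Without knowledge of the opcode table, the five bytes say nothing about the original expression, which is what makes a private encoding harder to analyze than a public one.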
Throughout this process, our opcode design is private and no plain JavaScript code is present, so security is greatly improved. This addresses the first, second, and fourth goals above (writing in JavaScript, using NPM libraries, and strong protection). But now that the JavaScript has been turned into opcodes, how do we meet the eighth goal, ease of debugging? The solution is simple. We attach the relevant source information at the key steps of compiling JavaScript into opcodes, so that when execution fails we can report the error accurately. At the same time, we keep the opcode design simple so that the generated opcodes are smaller than the original JavaScript code.
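One way to attach source information is a side table that maps bytecode offsets to source positions. The sketch below assumes such a table; the opcodes, the table layout, and the name `runWithDebugInfo` are hypothetical, and the VM deliberately treats division by zero as an error (unlike JavaScript itself) only to have something to report.

```javascript
// Hypothetical opcodes for this sketch.
const OP_PUSH = 0x01; // push the next byte as a literal
const OP_DIV = 0x02;  // pop two values, push left / right; fails on a zero divisor

// Bytecode for the source text "10 / 0" plus a side table:
// bytecode offset -> position in the original source.
const bytecode = new Uint8Array([OP_PUSH, 10, OP_PUSH, 0, OP_DIV]);
const debugInfo = new Map([
  [0, { line: 1, column: 1 }], // literal 10
  [2, { line: 1, column: 6 }], // literal 0
  [4, { line: 1, column: 4 }], // the "/" operator
]);

function runWithDebugInfo(code, positions) {
  const stack = [];
  let pc = 0;
  while (pc < code.length) {
    const at = pc; // offset of the instruction being executed
    const op = code[pc++];
    if (op === OP_PUSH) {
      stack.push(code[pc++]);
    } else if (op === OP_DIV) {
      const right = stack.pop();
      const left = stack.pop();
      if (right === 0) {
        // Translate the failing offset back to a source position.
        const pos = positions.get(at);
        throw new Error(`division by zero at line ${pos.line}, column ${pos.column}`);
      }
      stack.push(left / right);
    } else {
      throw new Error('unknown opcode at offset ' + at);
    }
  }
  return stack.pop();
}

let message = '';
try {
  runWithDebugInfo(bytecode, debugInfo);
} catch (err) {
  message = err.message;
}
console.log(message); // division by zero at line 1, column 4
```

The table adds a few bytes per instruction, so a production design would presumably store it compactly or only for instructions that can fail.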
Apart from the language implementation and some standard libraries, Duktape does not provide peripheral APIs such as Ajax or WebSocket. To make SecurityWorker convenient to use and easier for front-end developers to adopt, we implemented part of the Web Worker environment API for Duktape, including WebSocket, console, and Ajax, together with facilities such as the fetch and WebSocket support provided by Emscripten.
The final question is how to reduce the size of the generated asm.js/wasm code. Without any processing, the generated code contains the Duktape implementation and many peripheral APIs; even a Hello World program is about 340 KB after gzip. To solve this, we wrote the SecurityWorker loader and compiled the generated code together with the loader implementation to obtain the final file. At run time, the loader unpacks the code that needs to run and then executes it dynamically. In this way, the gzipped size drops from about 340 KB to about 180 KB.
5.3 Limitations
SecurityWorker solves many problems of the previous solution, but it is not perfect either. Because we build another VM on top of WebAssembly, SecurityWorker will not meet your requirements if your application is sensitive to size or demands high execution efficiency. SecurityWorker could use a variety of optimizations to greatly improve its size and efficiency, but since it already meets our current needs and goals, there are no plans for such improvements at present.
6. Conclusion
Having reviewed the current mainstream schemes for protecting core front-end code and described in detail the SecurityWorker scheme that builds on them, we hope you now have a clear picture of the overall technical approach to protecting core front-end algorithms. Of course, the pursuit of security never ends, and SecurityWorker is not the ultimate solution. I hope this article encourages more people to get involved in WebAssembly and front-end security, and helps make the web better.