Implementing a device fingerprint identification system

Posted by trammel at 2020-02-27

Outline

This series of articles describes how to implement a device fingerprint identification system. It first introduces the background, working principle, and main application scenarios of device fingerprints. It then focuses on client-side information collection and the related algorithms, and analyzes how some mainstream websites collect device fingerprints and the logic behind their implementations. Next, it shows how device fingerprints are used in specific business scenarios. Finally, it builds a simple device fingerprint identification system.

This article covers the background, principles, and application prospects of device fingerprints.

Device fingerprint introduction

Human fingerprints are stable and unique, so they can be used to identify people. A person's name, ID card number, and facial features can also serve as identifiers. Devices likewise have characteristics that can identify them, such as a unique serial number or a production ID.

What is device fingerprinting?

In short, a device fingerprint is a set of device characteristics, or a unique device identifier, that can be used to uniquely identify a device.

Device fingerprints include identifiers that are inherent to the device, difficult to tamper with, and unique. For example, a mobile phone is assigned a unique IMEI (International Mobile Equipment Identity) number during production, and a computer's network card is likewise assigned a unique MAC address. Such unique identifiers can be regarded as device fingerprints.

A set of device features can also serve as a device fingerprint: the name, model, shape, color, functions, and other features of the device are combined to identify it. This is similar to how we remember people by their appearance and facial features.

Background of device fingerprints

When users visit a website, the website needs to track what they have done on its pages: for example, which keywords they searched for, which products they browsed, how many times the user of a device has tried to log in today, and how many accounts have been registered.

1. Cookie-based user tracking

HTTP is a stateless protocol. Cookies let the relevant state be stored locally on the client, so that the session persists and the server knows the user's current state. Typically, when a user visits a website, the website generates a random unique ID and returns it to the client for storage. The client then sends this identifier with every subsequent request. When the user searches, purchases, reviews, or performs other actions on the site, the server records those actions against the identifier.
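The cookie flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the cookie name `visitor_id`, the function `handle_request`, and the in-memory `actions_by_visitor` store are hypothetical names, not part of any particular website's implementation.

```python
import secrets
from http.cookies import SimpleCookie

# Hypothetical in-memory store: visitor ID -> list of recorded actions.
actions_by_visitor = {}


def handle_request(cookie_header, action):
    """Process one request; return (visitor_id, Set-Cookie value or None)."""
    cookie = SimpleCookie()
    cookie.load(cookie_header or "")
    morsel = cookie.get("visitor_id")

    set_cookie = None
    if morsel is None:
        # First visit: generate a random unique ID and return it to the client.
        visitor_id = secrets.token_hex(16)
        out = SimpleCookie()
        out["visitor_id"] = visitor_id
        out["visitor_id"]["path"] = "/"
        set_cookie = out["visitor_id"].OutputString()
    else:
        # Later visits: the client sent the ID back in its Cookie header.
        visitor_id = morsel.value

    # Record the action against the identifier.
    actions_by_visitor.setdefault(visitor_id, []).append(action)
    return visitor_id, set_cookie
```

Calling `handle_request(None, "search: phone")` simulates a first visit and returns a new ID plus a `Set-Cookie` value; passing `"visitor_id=<that ID>"` on the next call attaches the new action to the same visitor.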

2. Persistent cookies

Because cookies are often cleared, techniques such as evercookie were developed to make cookies more durable for user tracking. Evercookie not only sets a regular cookie but also stores the cookie value in Flash Local Shared Objects, HTML5 storage, IndexedDB, IE's userData storage, and other locations. It even exploits Java security issues such as CVE-2013-0422 to bypass the applet sandbox and store the cookie in a local file on the user's machine. The implementation details can be found in its [project documentation](https://github.com/samyk/evercookie). This technique has been around for a long time, and today's cleanup tools can generally remove these cookies completely.

3. User tracking without cookies

Because cookies are stored on the client, they can be cleared. Browsers also provide a private (incognito) browsing mode, in which cookies and related data are deleted automatically when the browser window is closed. Cookie-based user tracking raises privacy concerns as well. Against this background, cookieless tracking techniques emerged.

In 2009, Mayer (see his paper "'Any person... a pamphleteer': Internet Anonymity in the Age of Web 2.0") and, in 2010, Eckersley (see his paper "How Unique Is Your Web Browser?") proposed identifying devices from browser features and plug-in information rather than relying on cookies.

This gave rise to cookieless device identification, which is what we usually mean by using device fingerprints for device identification and user tracking. A device fingerprint identification system collects a large amount of device fingerprint information and uses it to locate a device, so that the device's behavior on a website can be tracked.

Implementation principle of device fingerprint

In daily life we distinguish people by their names, and we recognize or remember them by their appearance and facial features. Device fingerprinting works similarly: it gathers various pieces of information about a device and then combines these features, using some algorithm, into an identifier (the device fingerprint) that distinguishes and identifies the device.

Implementation process

Flow chart:

Explanation:

Device fingerprinting usually works through the user's client, such as a web browser, a mobile app, or PC software. The client extracts the device information and computes a unique ID from it using a hash algorithm. If the service also carries business data, the device information is usually associated with that data and sent together with it to the device fingerprint server for processing and storage. The server receives the data and stores and correlates it according to its own storage scheme and algorithms. Because some device information, such as the IP address, can change, the information is typically collected and the device ID computed multiple times.

In subsequent business requests, the algorithm of the device fingerprint identification system is used to identify the device and associate it with earlier records.
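The hashing step can be illustrated with a minimal Python sketch. It assumes the collected features arrive as a dictionary, uses SHA-256 as the hash, and leaves the IP address out of the hash because it changes often. These are illustrative choices, not the algorithm of any specific product, and the names `device_id` and `VOLATILE_KEYS` are hypothetical.

```python
import hashlib
import json

# Assumed set of attributes that change too often to be hashed (illustrative).
VOLATILE_KEYS = {"ip"}


def device_id(features):
    """Hash the stable features of a device into a hex identifier."""
    stable = {k: v for k, v in features.items() if k not in VOLATILE_KEYS}
    # Canonical JSON (sorted keys, fixed separators) so that the same
    # features always serialize, and therefore hash, identically.
    canonical = json.dumps(stable, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

With this sketch, two collections from the same device that differ only in IP address produce the same ID, while a change in any hashed feature (for example the user agent) produces a different one. A strict hash like this is brittle when features drift, which is one reason a real system also stores and correlates the raw attributes on the server.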

Information collection

Implementing a device fingerprint identification system inevitably involves collecting information. Collection falls into two main types: active and passive.

Passive information collection

When a network request is made, it naturally carries some basic attributes, for example the user's IP address, port, request headers, and User-Agent.
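As a rough sketch of passive collection, the function below gathers the attributes a server can see without running any code on the client. The function name `passive_attributes`, its parameters, and the particular headers chosen are assumptions made for illustration.

```python
def passive_attributes(remote_addr, remote_port, headers):
    """Collect attributes visible to the server from the request itself."""
    # HTTP header names are case-insensitive, so normalize them first.
    normalized = {name.lower(): value for name, value in headers.items()}
    return {
        "ip": remote_addr,
        "port": remote_port,
        "user_agent": normalized.get("user-agent", ""),
        # Two further request headers, picked here only as examples.
        "accept_language": normalized.get("accept-language", ""),
        "accept_encoding": normalized.get("accept-encoding", ""),
    }
```

In a real web framework the address, port, and headers would come from the framework's request object; here they are passed in directly so the sketch stays self-contained.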

Active information collection

Active collection means executing specific code on the device to gather the required information, for example the device's MAC address, operating system version, and CPU model.

Of these, browser-based device fingerprint collection is the more common. A browser fingerprint is usually built from information obtained by executing JavaScript or Flash code in the browser, for example the user agent, time zone, screen size, browser plug-ins, installed system fonts, canvas characteristics, and WebGL characteristics. The specific implementation will be described in later articles.

Application scenarios of device fingerprints

1. Behavior tracking

User behavior tracking is mainly driven by business needs. For example, a shopping website collects the user's device information and recommends products to the user based on the device fingerprint. Recommending similar products is only one aspect, though. More importantly, the device fingerprint built from the collected information lets the site provide users with better business services and security. For example, if a risky login is detected because the user has switched to a different device, the user is required to complete strong secondary authentication.
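The changed-device check can be sketched as follows. The function `login_decision`, the `known_devices` mapping, and the two return values are hypothetical; a real risk engine would weigh many more signals than device identity alone.

```python
def login_decision(known_devices, user, device_id):
    """Return "allow" for a device seen before, "second_factor" otherwise."""
    # known_devices maps each user to the set of device IDs they have used.
    if device_id in known_devices.get(user, set()):
        return "allow"
    # Unknown device: require strong secondary authentication.
    return "second_factor"
```

For instance, with `known_devices = {"alice": {"dev-a"}}`, a login by `alice` from `dev-a` is allowed, while a login from `dev-b` triggers secondary authentication.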

2. Advertising

Here, a device's search history, browsing history, and similar records are used to profile the device and push targeted advertisements to it. Even when the user clears cookies, enables an ad blocker, or disables tracking, device fingerprinting still allows advertisements to be targeted just as accurately.

3. Anti-fraud

Device fingerprints play an important role in anti-fraud risk control. A device fingerprint system can help secure the related business. For example, today's high-risk abnormal behaviors, such as spam registration, account theft, credential stuffing, and logins from unusual locations, can be effectively controlled with a device fingerprint identification system.

1) Click fraud

Click fraud mainly occurs in advertising. Advertisers pay commissions based on the number of times users click an ad. Fingerprint-based identification can count how many times the same device has clicked an advertisement and flag fraudulent requests, thereby reducing click fraud.
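Counting clicks per device can be sketched as below. The record layout (`device_id`, `ad_id`), the function name `suspicious_clicks`, and the threshold of 3 are assumptions for illustration; a production rule would also consider time windows and other signals.

```python
from collections import Counter


def suspicious_clicks(clicks, max_clicks_per_device=3):
    """Return {(device_id, ad_id): count} for pairs above the threshold."""
    # Count how many times each device clicked each advertisement.
    counts = Counter((c["device_id"], c["ad_id"]) for c in clicks)
    return {pair: n for pair, n in counts.items() if n > max_clicks_per_device}
```

Because the count is keyed on the device fingerprint rather than on a cookie or IP address, clearing cookies or switching IPs between clicks does not reset it.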

2) Financial anti-fraud

In the article "Financial anti-fraud: the black industry chain of overseas credit cards" (drops.wooyun.org/news/14382), we briefly mentioned that device fingerprints play an important role in credit card anti-fraud and are an important dimension of the risk control model. In financial transactions, we also need to ensure that the current transaction takes place on a trusted device, which depends on the device fingerprint identification system.

3) Preventing promotion abuse ("wool pulling")

A typical promotion-abuse ("wool pulling") scenario looks like this: abusers register accounts in bulk and then claim physical goods, coupons, cashback vouchers, and so on. Bulk registration usually involves switching proxy IPs, but the device itself may not change. We can therefore identify the device by its fingerprint, count the accounts registered and the promotions joined from the same device, and set up corresponding prevention and control rules.
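A simple rule of this kind might look like the following sketch. The record fields (`device_id`, `account`, `ip`), the function name `flag_bulk_registration`, and the limit of 2 accounts per device are hypothetical values chosen for illustration.

```python
from collections import defaultdict


def flag_bulk_registration(registrations, max_accounts_per_device=2):
    """Flag devices that registered more accounts than the allowed limit."""
    accounts = defaultdict(set)
    ips = defaultdict(set)
    for reg in registrations:
        accounts[reg["device_id"]].add(reg["account"])
        ips[reg["device_id"]].add(reg["ip"])
    # Report the account count and the number of distinct IPs per device;
    # many IPs on one device suggests proxy switching.
    return {
        device: {"accounts": len(accs), "ips": len(ips[device])}
        for device, accs in accounts.items()
        if len(accs) > max_accounts_per_device
    }
```

Grouping by device fingerprint instead of by IP is the point of the rule: three accounts registered from three proxy IPs still collapse onto a single device entry.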

Summary

Device fingerprints have broad application prospects. A classic saying dates from the early days of the Internet: "On the Internet, nobody knows you're a dog." However, as the Internet gradually becomes a second space for human life, the fact that Internet users cannot prove their identity or vouch for their own credit has greatly hindered the expansion of Internet business.

A mature credit system exists in the offline world. In scenarios with ordinary security requirements, users may only need to show their ID card to gain access; in high-risk scenarios (such as loans), they also need a credit record. So can a mature credit system be built for the Internet?

There are two ways to build an Internet credit system. One is to carry offline credit directly over to the online world. The other, now that the Internet has become a mature ecosystem, is to score credit directly from users' activity records on the Internet. Device fingerprints are the cornerstone of such Internet reputation scoring.

At present, risk prevention and control services based on device fingerprints have gradually matured. Overseas, ThreatMetrix is a relatively mature commercial company that holds a large amount of device information and has its own device identification algorithm and device reputation scoring. In China, anti-fraud services such as Alibaba Cloud's reputation service build on Alibaba's big data and provide mature device reputation and risk control services through their own behavior collection and machine learning models.