Preface
At this year's security conferences, speakers began promoting a model that had so far received little attention in China: the zero trust architecture. Globally, Google's BeyondCorp project is at the forefront of putting this concept into practice. My team has been studying and benchmarking the security architectures of Google, Amazon, and other major companies across various sub-fields, and I have had the honor of taking part in the design and implementation of my company's zero trust architecture project. I will continue to share some of my views here, as far as confidentiality allows.
The difference between BeyondCorp and traditional enterprise security protection strategies
At a security vendor's conference this year attended by John Kindervag, who proposed the zero trust model, I noticed that many practitioners around me completely conflated the BeyondCorp project with Kindervag's zero trust model. The two share the same core concepts, but their directions of practice currently differ. The five papers of Google's BeyondCorp project refer repeatedly to the concept of managed devices, while John Kindervag focuses on promoting the idea that all traffic is untrusted. From my point of view, the two have a common origin but follow different development paths.
The enterprise security construction I saw before encountering this concept focused on every link from the outside in, using a layered approach to weaken attacks step by step, while control of internal threats was relatively weak. Colleagues at the time joked that it "guards against gentlemen but not against villains". Learning about Google's BeyondCorp project gave me a new perspective. Its goal is borderless trusted access without trusting the intranet. Unlike traditional enterprise security protection strategies, which divide the network by trusted boundaries, BeyondCorp treats every request source as originating from an untrusted boundary by default, and replaces the traditional security system based on the network perimeter with a practical "zero trust security" architecture.
After abandoning the concept of a security boundary, control over users focuses on fine-grained elements such as managed devices, network access, and application access, and centralized, fine-grained access control and authentication become the new core of enterprise security. BeyondCorp manages identities, devices, and resources in a unified way, and controls access to all resources through a centralized authentication and authorization mechanism. Based on credibility information in the fingerprint database that is updated in real time (users, devices, status, resources, and historical user behavior), a dynamic multi-round scoring mechanism assigns a trust level to each request source. The decision engine then dynamically adjusts the access control rules according to that trust level and defines the accessible area, so that each level holds only the minimum privileges it needs. The significant benefit is that, whether internal systems and services are accessed from the office network or from an external network, the security department can turn otherwise uncontrollable risks into controllable ones and further improve the security of employees and their devices when they access the internal network. This helps the enterprise make flexible and precise trust decisions and build a truly effective information security architecture.
Identifying device security based on the device fingerprint service
Figure 1: BeyondCorp infrastructure
Figure 1 shows the structure of the BeyondCorp system. The papers published by Google on BeyondCorp repeatedly mention the concept of the "managed device": only managed devices can access systems, applications, or other resources within the company. Any change to a device's information during its life cycle is reported by the agent to the device inventory service for storage and made available to other parts of BeyondCorp for analysis. Google stores device information of different sources and types in multiple data warehouses, consolidates and normalizes it through a meta-inventory database, and then provides the aggregated information to BeyondCorp's downstream components.
To pass identity authentication, a user must use a device that is registered in the company's systems and continuously managed, so identifying the device and the user becomes the key problem. Since the PC era, device identification has been an important means of tracking users on the Internet. Traditional device identification mainly relies on dimensions such as browser and sensor characteristics, from which an algorithm generates a unique identifier that distinguishes one device from another. In recent years, from an adversarial point of view, the approach I prefer is to run multiple rounds of scoring on the homology (common origin) of device information, with as many data dimensions as possible taking part in the matching. After analysis and identification, each group of device information collected from the front end is combined and given a unique device ID that identifies the terminal device.
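The matching idea described above can be sketched in a few lines. This is only a single-round illustration under stated assumptions, not the author's or Google's algorithm: the attribute names, the integer weights in `WEIGHTS`, and the 0.8 threshold are all hypothetical.

```python
# Hypothetical integer weights per attribute; the article does not specify any.
WEIGHTS = {"serial": 4, "mac": 3, "disk_serial": 2, "os_version": 1}


def similarity(observed: dict, known: dict) -> float:
    """Weighted share of the attributes present in both records that agree."""
    shared = [name for name in WEIGHTS if name in observed and name in known]
    total = sum(WEIGHTS[name] for name in shared)
    if total == 0:
        return 0.0
    agreeing = sum(WEIGHTS[name] for name in shared if observed[name] == known[name])
    return agreeing / total


def match_device(observed: dict, inventory: dict, threshold: float = 0.8):
    """Return the ID of the best-matching known device, or None if nothing is close enough."""
    best_id, best_score = None, 0.0
    for device_id, known in inventory.items():
        score = similarity(observed, known)
        if score > best_score:
            best_id, best_score = device_id, score
    return best_id if best_score >= threshold else None


inventory = {
    "dev-001": {"serial": "SN-A", "mac": "aa:aa", "disk_serial": "D-1", "os_version": "10.1"},
    "dev-002": {"serial": "SN-B", "mac": "bb:bb", "disk_serial": "D-2", "os_version": "10.1"},
}
# Same hardware as dev-001 after an OS upgrade: only the lowest-weight attribute differs.
upgraded = {"serial": "SN-A", "mac": "aa:aa", "disk_serial": "D-1", "os_version": "10.2"}
print(match_device(upgraded, inventory))
```

Weighting stable hardware identifiers above volatile ones such as the OS version is one way to keep a device's ID steady across routine changes; a multi-round scheme could rerun the same comparison with different attribute sets.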
The device fingerprint is the first prerequisite for access control based on the device fingerprint service, which distinguishes devices according to the current device environment and security policy. Once the host's basic information source is established, other components of the agent can be enabled as needed to improve security, coverage, granularity, latency, and flexibility. To define the area a request source may access, BeyondCorp dynamically pushes multi-dimensional data stored in the device fingerprint service to its credibility inference system: the host's basic information, the baseline inspection results, and the time elapsed since the host's last baseline inspection. The credibility inference system infers the trust level of the device behind the request source and outputs the score to the access control engine, which applies the corresponding access policies through RADIUS and synchronizes them to gateways, VLANs, code repositories, and other internal information systems for enforcement.
Figure 2: BeyondCorp device fingerprint service
In BeyondCorp's design, all managed devices must be registered in the device fingerprint database. After registration is complete and authentication succeeds, BeyondCorp issues each device its own X.509 certificate as a persistent device identifier. The certificate provides a cryptographic GUID that the encryption gateway uses for all communication with company services. Although the certificate uniquely identifies the device, holding it does not by itself grant access; rather, the certificate information is a key part of the device fingerprint. For example, if the certificate changes, the device is treated as a different device even if all other identifiers remain unchanged. If the certificate is installed on a different device, the agent's correlation logic reports both the certificate conflict and the mismatch in auxiliary identifiers, and notifies BeyondCorp to lower the device's trust level. Having a certificate therefore does not mean that a device is trusted, nor does it exempt the device from the logic that decides access rights. The validity of the certificate store is verified during device qualification, and only devices judged secure enough in routine baseline inspections are classified as managed devices. A baseline inspection is also triggered whenever the certificate is periodically renewed.
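The two certificate rules in this paragraph can be written down as a small classification function. This is a minimal sketch, assuming the inventory is a plain mapping from certificate fingerprint to one auxiliary identifier (a serial number); the function name and return labels are hypothetical.

```python
def classify_report(cert_fingerprint: str, serial: str, inventory: dict) -> str:
    """Classify an agent report against the registered certificates.

    inventory maps a certificate fingerprint to the serial number (an auxiliary
    identifier) of the device the certificate was issued to.
    """
    registered_serial = inventory.get(cert_fingerprint)
    if registered_serial is None:
        # Unknown certificate: treated as a different device,
        # even if every other identifier is unchanged.
        return "different_device"
    if registered_serial != serial:
        # Known certificate on other hardware: conflict, trust should be lowered.
        return "certificate_conflict"
    return "known_device"


inventory = {"fp-1234": "SN-A"}
print(classify_report("fp-1234", "SN-A", inventory))  # same certificate, same hardware
print(classify_report("fp-9999", "SN-A", inventory))  # certificate changed
print(classify_report("fp-1234", "SN-B", inventory))  # certificate copied elsewhere
```

Note that none of the three outcomes grants access: the result is only one input to the trust evaluation.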
As Figure 2 shows, the device fingerprint service must continuously extract data from each data source to stay up to date. These sources should include at least the enterprise IT asset management system, the human resources system, the host basic information source, historical baseline inspection results, current and recent device status, and network infrastructure elements (vulnerability scanning, certificate authorities, and ARP tables). Each data source is responsible for sending complete or incremental updates. BeyondCorp likewise tracks and manages users in its databases, which are updated as employees join, change roles (positions), or leave the company. As a result, when a user needs to access an internal company system, BeyondCorp has all the information required to make a judgment. It is worth noting that Google's public documents mention many times that retaining historical data is essential for understanding the end-to-end life cycle of a given device, analyzing and tracking trends across the fleet, and performing security audits and forensics.
Data processing flow of device fingerprint service
Figure 3: BeyondCorp data processing flow
The device fingerprint service analyzes data from different sources to identify conflicts instead of blindly trusting one or a few systems, and keeps its internal data up to date. Google's paper divides the data into two categories: observed data and prescribed data. Observed data is collected by the agent and other components and reported to the back-end systems, for example baseline inspection results, the timestamp of the latest policy synchronization, OS version and patch level, and the list of installed applications. Prescribed data is entered through systems such as the enterprise IT asset management system and the human resources system, for example the registered owner of a device asset and DNS and DHCP assignment records. Observed data is dynamic, while prescribed data is usually static.
Over a device's whole life cycle, hardware such as hard disks and motherboards may be replaced or even swapped among the enterprise's devices, which makes it harder to say what constitutes a device. Associating the data is not expensive in itself, but Google's paper describes the difficulties met at this stage of practice, mainly because the identifiers used by the various data sources are inconsistent. For example, the enterprise IT asset management system stores the asset ID and device serial number, the disk encryption escrow service stores the hard disk serial number, the certificate authority stores the certificate fingerprint, and the ARP data records the MAC address. The same device thus has several identifiers. As more fields from these systems flow into the device fingerprint database, it becomes hard to determine whether two pieces of device information describe the same device, and integrating these possibly conflicting fields into a single device fingerprint record takes a lot of effort. Once multiple records are merged into one, the credibility inference service is triggered to rescore the device, evaluating and cross-referencing the different fields to determine the trust level. The credibility inference service refers to both platform-specific and platform-independent fields from various data sources. To be assigned a high trust level, a device must meet at least the following requirements: it complies with the encryption requirements, passes all baseline checks, has the latest OS security patches installed, and has consistent data across all back-end input sources.
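One way to picture the merging problem is to treat every record as a bag of identifiers and join records that share any identifier. The sketch below uses a union-find structure for that; it is an illustration under assumptions (the field names and sample values are hypothetical), not the correlation logic Google describes.

```python
def merge_records(records: list) -> list:
    """Group records that share any (identifier type, value) pair.

    Each record maps an identifier type to a value. The result holds one dict
    per inferred device, mapping identifier type to the set of values seen.
    """
    parent = list(range(len(records)))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    first_seen = {}  # (identifier type, value) -> index of first record carrying it
    for index, record in enumerate(records):
        for pair in record.items():
            if pair in first_seen:
                parent[find(index)] = find(first_seen[pair])
            else:
                first_seen[pair] = index

    merged = {}
    for index, record in enumerate(records):
        device = merged.setdefault(find(index), {})
        for id_type, value in record.items():
            device.setdefault(id_type, set()).add(value)
    return list(merged.values())


records = [
    {"asset_id": "A1", "serial": "SN1"},         # IT asset management
    {"serial": "SN1", "disk_serial": "D9"},      # disk encryption escrow
    {"mac": "aa:bb", "cert_fingerprint": "F1"},  # certificate authority
    {"disk_serial": "D9", "mac": "aa:bb"},       # agent report linking the two
    {"serial": "SN2"},                           # unrelated device
]
for device in merge_records(records):
    print(device)
```

In a merged record, a set holding more than one value for the same identifier type is a natural signal of conflicting sources, which is the kind of inconsistency that should keep a device out of the highest trust level.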
Precomputing trust levels effectively reduces the amount of data pushed to the gateways and the computation needed at request time, and ensures that all enforcement gateways use the same data set. At this stage, trust changes can be applied to idle devices, and the agent can deny access to devices with serious problems before they make access requests. Another advantage of precomputation is that it provides an experimental environment: specified tests can be run while policies are iterated and the credibility inference service is updated, and canary (gray-release) testing can be carried out on a small scale without affecting the company as a whole, which is one of the main testing methods in Internet companies. However, precomputation cannot cover every scenario. For example, verifying an access request may trigger an access policy that requires real-time two-factor authentication credentials, which is handled through the single sign-on (SSO) system. In addition, when a policy or device state changes, there is a delay before the decision reaches the gateways, and its impact is hard to evaluate. Google's approach is to keep the update delay to no more than one second and to let only part of the information take part in precomputation; after validation, the SSO system generates a short-lived token that participates in part of the system authorization process.
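The split between precomputed and real-time checks can be illustrated as follows. This is a rough sketch, assuming numeric trust levels, a five-minute token lifetime, and caller-supplied timestamps; every name and value here is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    user: str
    expires_at: float  # seconds since the epoch


def precompute_allowed(trust_levels: dict, minimum: int) -> frozenset:
    """Offline step: reduce the full trust table to the set of devices a gateway may admit."""
    return frozenset(dev for dev, level in trust_levels.items() if level >= minimum)


def issue_token(user: str, now: float, ttl_seconds: float = 300.0) -> Token:
    """Real-time step: stands in for SSO issuing a token after two-factor authentication."""
    return Token(user=user, expires_at=now + ttl_seconds)


def gateway_allows(device_id: str, token: Token, now: float, allowed: frozenset) -> bool:
    """Per-request check: membership in the precomputed set plus a still-valid token."""
    return device_id in allowed and now < token.expires_at


allowed = precompute_allowed({"laptop-1": 3, "laptop-2": 1}, minimum=2)
token = issue_token("alice", now=1000.0)
print(gateway_allows("laptop-1", token, now=1100.0, allowed=allowed))
print(gateway_allows("laptop-1", token, now=1400.0, allowed=allowed))
```

The gateway only needs the small precomputed set, while anything that cannot be known in advance, such as a fresh second factor, is carried by the short-lived token.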
The trust level of a given device is ultimately determined by the trust inference service. The evaluation takes into account exceptions recorded in the device fingerprint service, which may override the general access policy. Exception rules are first and foremost a mechanism for reducing the deployment latency of policy changes or new policy primitives. In such cases, the expedient action may be to act on specific devices vulnerable to a 0-day attack before the scanning tasks have been updated, or to allow untrusted devices to connect to a bootstrap network (minimum privilege, used only to restore trust level authentication). Internet of Things devices such as printers also need to be placed at a suitably restricted trust level through exception rules, because installing and maintaining certificates on these devices costs far more than on office devices.
Through the device fingerprint service, verification queries can be issued against multiple data sources and the trust level assigned to a user or device can be inferred dynamically, which enables fine-grained access control for each individual user or device. The access control engine can use this trust level as part of its decision-making process. For example, a device that has not been updated to the latest operating system patch level may be downgraded to a lower trust level; specific categories of devices (such as particular models of mobile phones or tablets) may be assigned a specific trust level; and a user accessing applications from a new location may be assigned a different trust level. Static rules and heuristics are used to determine these trust levels.
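The three examples in this paragraph map naturally onto static rules that each cap the trust level. The sketch below assumes four levels and a hypothetical category table; the actual levels, categories, and heuristics are not given in the article.

```python
from enum import IntEnum


class Trust(IntEnum):
    UNTRUSTED = 0
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# Hypothetical rule data; real categories and caps would come from policy.
CATEGORY_CAPS = {"tablet": Trust.MEDIUM, "printer": Trust.LOW}


def infer_trust(device: dict) -> Trust:
    """Start from the highest level and let each static rule cap it."""
    level = Trust.HIGH
    if not device.get("os_patched", False):
        level = min(level, Trust.LOW)      # missing patches: downgrade
    cap = CATEGORY_CAPS.get(device.get("category"))
    if cap is not None:
        level = min(level, cap)            # category-specific level
    if device.get("new_location", False):
        level = min(level, Trust.MEDIUM)   # unfamiliar location: be cautious
    return level


print(infer_trust({"category": "laptop", "os_patched": True}).name)
print(infer_trust({"category": "laptop", "os_patched": False}).name)
```

Because every rule can only lower the level, the order of the rules does not change the result.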
The access control engine provides the authorization basis for every request to internal applications and services. The authorization decision correlates attributes of the user with the device certificate and artifacts of the user's device. If necessary, the access control engine can also enforce location-based access control. The inferred trust levels of the user and the device are included in the authorization decision as well. For example, only R&D engineers using managed devices can access the internal code repository, while other engineers cannot, even on managed devices. Associating role attributes with device data in this way affects the outcome of access control decisions. The access control engine is continuously fed data by upstream and downstream components to support its decisions, such as certificate whitelists, the trust level relationships between devices and users, and more detailed information about devices and users.
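The code repository example combines a role attribute with a device attribute. A minimal sketch of such a rule, assuming a hypothetical policy table keyed by resource name and a default-deny stance for unknown resources:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessRequest:
    role: str
    device_managed: bool
    resource: str


# Hypothetical policy table: resource -> roles allowed to reach it.
ALLOWED_ROLES = {"code-repository": {"rd-engineer"}}


def authorize(request: AccessRequest) -> bool:
    """Allow only when the device is managed and the role is listed for the resource."""
    if not request.device_managed:
        return False
    roles = ALLOWED_ROLES.get(request.resource)
    return roles is not None and request.role in roles  # unknown resource: deny


print(authorize(AccessRequest("rd-engineer", True, "code-repository")))
print(authorize(AccessRequest("qa-engineer", True, "code-repository")))
print(authorize(AccessRequest("rd-engineer", False, "code-repository")))
```

A fuller engine would also weigh the inferred trust level and location; they are left out here to keep the role and device association visible.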
Refined, tiered network access
Once devices and users have been identified, the next step is more refined network access. When the policy requirements of the access control engine are met, the agent should be able to automatically open an encrypted tunnel to specific internal company resources, open up internal applications and workflows, enforce access control based on the managed device and the connecting user, and dynamically update device and user information.
This step removes trust from the network and places it in a hierarchy of trust levels, which the trust inference system assigns to devices. Each resource is associated with the minimum trust level required to access it. To access a given resource, a device's assigned trust level must be equal to or higher than the resource's minimum. For example, customer service staff for a food delivery business should be held to a lower access level within the company: they can only access the systems related to order records and do not need access to more sensitive services covering the overall consumption process.
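The comparison rule stated here is simple enough to show directly. In this sketch the resource names and numeric minimum levels are hypothetical, chosen to mirror the customer service example.

```python
# Hypothetical minimum trust level required by each resource.
RESOURCE_MIN_TRUST = {"order-records": 1, "payment-pipeline": 3}


def can_access(device_trust: int, resource: str) -> bool:
    """A device may access a resource only if its level meets the resource's minimum."""
    required = RESOURCE_MIN_TRUST.get(resource)
    return required is not None and device_trust >= required


def accessible_resources(device_trust: int) -> list:
    """All resources reachable at the given trust level, in alphabetical order."""
    return sorted(name for name in RESOURCE_MIN_TRUST if can_access(device_trust, name))


print(accessible_resources(1))  # a customer service workstation
print(accessible_resources(3))  # a highly trusted device
```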
Specifying the lowest access level needed for each request has several advantages: it reduces the maintenance cost of highly secured devices (mainly the cost of support and productivity) and improves device availability. The more sensitive the data a device can access, the more frequently we need to verify its online user, so the more we trust a given device, the shorter the validity of its certificate. Limiting a device's trust level to the lowest level required for access therefore keeps users from being interrupted during use. For example, devices used in sensitive positions must install the latest operating system version and update promptly to maintain a higher trust level, while for devices with a lower trust level this timing requirement can be more relaxed. Besides assigning trust levels, the trust inference system can also segment the network by tagging the VLANs a device may access. Network segmentation lets us restrict access to particular networks based on device status. When a device becomes untrusted and is assigned to a remediation environment, it can only access the limited resources needed to restore its authentication until it has recovered.
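Both ideas in this paragraph, shorter certificate validity at higher trust and VLAN tagging by trust level, can be expressed as lookup tables. The lifetimes and VLAN names below are purely hypothetical; the article gives no concrete values.

```python
from datetime import timedelta

# Hypothetical values: the higher the trust level, the shorter the certificate lifetime.
CERT_LIFETIME = {
    1: timedelta(days=90),
    2: timedelta(days=30),
    3: timedelta(days=7),
}

# Hypothetical VLAN tags; level 0 (untrusted) lands in a remediation network.
VLAN_BY_TRUST = {0: "remediation", 1: "restricted", 2: "corp", 3: "corp-sensitive"}


def certificate_lifetime(trust_level: int):
    """Lifetime for a certificate at this level, or None if the table has no entry."""
    return CERT_LIFETIME.get(trust_level)


def vlan_for(trust_level: int) -> str:
    """VLAN tag for a device; unknown levels fall back to the remediation network."""
    return VLAN_BY_TRUST.get(trust_level, "remediation")


for level in (1, 2, 3):
    print(level, certificate_lifetime(level).days, vlan_for(level))
```

Falling back to the remediation VLAN for any unrecognized level keeps the default restrictive, which matches the zero trust stance of the rest of the design.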
Closing remarks
First of all, this sharing on BeyondCorp is not finished. Even at Google, it is a large project that has spanned more than four years. Many points that are not detailed here can only be understood through further practice and exploration. Other practitioners are welcome to get in touch and exchange ideas with me.
To be continued