
TiDB at Fengchao: trying out a distributed database

Posted by barello at 2020-03-19

Background

With the rapid growth of Fengchao's business systems, the data volume of its core system has already passed the one-billion level, and it keeps growing quickly every year. As the data volume grows, both the system architecture and the data architecture become sharply more complex. The traditional single-node database has gradually stopped meeting Fengchao's needs: once a single table holds more than 100 million rows, Oracle can barely cope, while MySQL already struggles at the 10-million-row level and has to be split into multiple tables and databases. For this reason, a high-performance distributed database has become an urgent need.

Reflection

Once an Internet company's business volume grows, scaling out horizontally is the most common, simplest, and quickest remedy. For example, load balancers spread massive traffic across machines so that each one receives only as much as it can bear, and clustering and similar techniques support the business as a whole. So when the database can no longer cope, it gets split as well.

But stateful data is different from stateless traffic. Once data is split, partitions appear, and because the system as a whole must stay highly available, data consistency becomes the victim. A large number of reconciliation tools then run between systems to guarantee eventual consistency. On the business side, colleagues often hear from the engineers who split the database that this requirement cannot be done and that one cannot be done either. Colleagues with SQL experience may ask: isn't that just a single SQL statement? In fact, these are the after-effects of splitting databases and tables.
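Why a "single SQL statement" stops being simple after splitting can be seen in a minimal sketch. Everything here is hypothetical (the shard count, the `user_id` shard key, the sample rows); it is not Fengchao's schema, only an illustration of the routing problem. In-memory lists stand in for separate database instances.

```python
from collections import defaultdict

NUM_SHARDS = 4  # hypothetical shard count


def shard_for(user_id: int) -> int:
    """Route a row to a shard by its shard key (here: user_id modulo shard count)."""
    return user_id % NUM_SHARDS


# Each list stands in for a separate database instance.
shards = defaultdict(list)

rows = [
    {"user_id": 1, "status": "sent"},
    {"user_id": 2, "status": "failed"},
    {"user_id": 5, "status": "failed"},
    {"user_id": 7, "status": "sent"},
]
for row in rows:
    shards[shard_for(row["user_id"])].append(row)


def find_by_user(user_id: int) -> list:
    """Query on the shard key: exactly one shard is touched."""
    return [r for r in shards[shard_for(user_id)] if r["user_id"] == user_id]


def find_by_status(status: str) -> list:
    """Query on any other column: every shard is scanned and the results are merged."""
    merged = []
    for shard_id in range(NUM_SHARDS):
        merged.extend(r for r in shards[shard_id] if r["status"] == status)
    return sorted(merged, key=lambda r: r["user_id"])
```

A lookup by shard key stays cheap, but any filter, sort, or join on another column has to fan out to every shard and merge in application code, which is the kind of requirement that tends to get turned down.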

Therefore, we need a database that solves these problems for us. It should offer:

Strong data consistency: full ACID support.

Horizontal scalability: no matter how much data we insert, we should not have to worry about when to expand capacity or whether a bottleneck will appear.

High availability of data: when a few machines, disks, or other components of the database fail, the business should not notice, and even if a disaster hits the data center in one city, we can keep serving requests without losing data.

Rich SQL support: SQL written for a single database should basically run on this database with little or no modification.

High performance: it can sustain high QPS while keeping latency relatively low.

Product selection

Based on these expectations, we surveyed the NewSQL distributed databases currently on the market. The list is as follows:

After weighing the open source license, maturity, controllability, performance, service support, and other factors, we chose TiDB. Its main advantages are as follows:

Highly compatible with MySQL

In most cases, you can migrate from MySQL to TiDB without modifying application code. MySQL clusters that have already been split into multiple databases and tables can also be migrated in real time with TiDB's tools.

Horizontal elastic expansion

Simply adding nodes scales TiDB horizontally, so throughput or storage can be expanded on demand to cope with high concurrency and massive data volumes.

Distributed transactions

TiDB fully supports standard ACID transactions.

Financial-grade high availability

Compared with the traditional master-slave (M-S) replication scheme, the Raft-based majority election protocol provides financial-grade strong data consistency, and as long as a majority of replicas survive, failover is automatic and needs no manual intervention.
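The "majority of replicas" condition is simple arithmetic. The sketch below only illustrates how a majority quorum is counted; the function names are made up for this example and are not part of TiDB.

```python
def quorum(replicas: int) -> int:
    """Smallest number of replicas that forms a majority."""
    return replicas // 2 + 1


def tolerated_failures(replicas: int) -> int:
    """How many replicas can be lost while a majority is still reachable."""
    return replicas - quorum(replicas)


for n in (3, 5):
    print(f"{n} replicas: quorum {quorum(n)}, tolerates {tolerated_failures(n)} failure(s)")
```

With three replicas one can fail, and with five replicas two can fail, before the group loses its majority and stops accepting writes.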

For these reasons, we chose TiDB as the distributed database for Fengchao's core system, replacing Oracle and MySQL.

Assessment

1. Performance test

The TiDB benchmark was run with sysbench against eight tables with a base data set of 10 million rows. Insert, select, OLTP, and delete scripts were each tested, giving the data below. Query throughput reached an impressive 140,000 QPS, and inserts were stable at 140,000 per second.

Core server configuration

Test result

Passed.

2. Function test

Passed.

Integration

Because this is a core system, we adopted several safeguards to verify the reliability of the integration and to make sure the business would not be affected.

1. Project selection

When looking for the first project to integrate, we selected one with the following four characteristics:

In the end we chose the push service. It is the core service Fengchao uses to send pick-up notifications: its volume is very large, but its logic is simple, and alternative external push channels exist, so even if something goes wrong, users are not affected.

2. Code modification

Because TiDB is highly compatible with MySQL syntax, the code changes for this project were very small. The SQL was basically unchanged; the changes were mainly in the surrounding code, including:

Asynchronous interfaces were changed to write to the database asynchronously.

Synchronous interfaces were changed to trip a circuit breaker on errors.

The embedded data migration code was stopped.

These three changes ensure that the system as a whole does not depend strongly on the database. Under high concurrency, asynchronous writes keep the database from being overwhelmed, and when the database has problems, the core business carries on normally.
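The article does not show the code behind these changes, so the following is only a rough sketch of the two patterns it names: a bounded buffer that writes to the database asynchronously, and a circuit breaker around synchronous calls. All class names, parameters, and thresholds are assumptions made for illustration, not Fengchao's implementation.

```python
import queue
import threading
import time


class AsyncWriter:
    """Buffers writes in a bounded queue; one worker thread drains it into the database."""

    def __init__(self, write_fn, max_pending: int = 1000):
        self._queue = queue.Queue(maxsize=max_pending)
        self._write_fn = write_fn  # hypothetical function that performs the real insert
        threading.Thread(target=self._drain, daemon=True).start()

    def submit(self, record) -> bool:
        """Return False instead of blocking the caller when the buffer is full."""
        try:
            self._queue.put_nowait(record)
            return True
        except queue.Full:
            return False

    def _drain(self):
        while True:
            record = self._queue.get()
            try:
                self._write_fn(record)
            except Exception:
                pass  # a real system would retry or log; omitted in this sketch
            finally:
                self._queue.task_done()

    def flush(self):
        """Block until every buffered record has been handed to write_fn."""
        self._queue.join()


class CircuitBreaker:
    """Opens after max_failures consecutive errors; rejects calls until reset_after seconds pass."""

    def __init__(self, max_failures: int = 3, reset_after: float = 30.0):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.failures = 0
        self.opened_at = None

    def call(self, func, fallback):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_after:
                return fallback()  # open: skip the database entirely
            self.opened_at = None  # cool-down over: let one call probe the database
        try:
            result = func()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()
            return fallback()
        self.failures = 0
        return result
```

In this sketch the bounded queue caps how much write pressure reaches the database at once, and the breaker lets the synchronous path return a fallback immediately while the database is unhealthy, so callers are not tied to its availability.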

Results

1. Query capability

After the move to TiDB, the dozen or so sub-tables that had been split by time were merged into one large table. The most obvious change is that query capability over large data volumes improved significantly.

2. Monitoring capability

TiDB comes with a very complete monitoring platform that shows capacity and node status at a glance.

It also shows the load on each node and the latency of SQL execution.

Of course, it also shows where each machine is located and its CPU, memory, and other load metrics.

Network status is clearly monitored as well.

All of this lets the team analyze both problematic SQL and the database itself.

Summary

Overall, the move to TiDB went very smoothly. Thanks to the extensive safeguards put in place beforehand, switching traffic to TiDB took only 10 minutes on the day. I also want to thank TiDB for its MySQL syntax compatibility and PingCAP for the many useful tools it provides. So far, the system has been running stably for more than a month and meets Fengchao's business requirements well.

Since the TiDB migration was completed, the Fengchao push service has persisted most of its messages and made them queryable. To date, the peak daily volume persisted by the push service has reached 50 million. Had the push service stayed on MySQL, it would have needed various database and table splitting schemes, and many fine-grained business requirements would have been difficult or impossible to implement.

The TiDB migration is only a small step in Fengchao's exploration of distributed data technology. In the future, Fengchao will bring more distributed technology into more business systems and build ever better products and services.