Docker network status
At present, network isolation is the weakest part of the Docker containerization stack. Neither the Docker engine nor the major management systems such as Mesos and k8s provide a solution. Without isolation there is a real risk: if one container generates too much traffic, it can saturate the host's network card and leave the other containers unable to use the network.
Is there any way to achieve network isolation in the current situation? In fact, there are workarounds.
POC thinking and practice process
- First, find a way to do it manually, then consider how to integrate it with the container management system
- The Docker engine does not provide the relevant functions, so we have to start from the host Linux system
- The network model is bridge mode, so a container's traffic can be limited by limiting its virtual network card on the host
Linux has many tools for limiting the traffic of an application or a network card. The following were compared in practice:
- tc: the traffic control tool on Linux, but its configuration is too complex. In an age that values simplicity, using it directly is self-torture
- trickle: this command mainly limits the network traffic of a single application, so it is not suited to limiting a network card
- wondershaper: this command is built on tc but is much simpler, and it can limit a network card directly. This is the one to use. However, the tool is a little old and has some pitfalls, which are described later.
- Installing wondershaper requires the Fedora repo (this was done on CentOS 7.x)
- Then install it with yum:
yum install wondershaper
- Command: wondershaper <interface> <download-rate> <upload-rate> limits the traffic of the interface; both rates are in kbps
wondershaper <interface> <download-rate> <upload-rate>
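Because the rates are given in kbps, it is easy to be off by a factor of 1000 when thinking in Mbit/s. The following is only a small sketch of a helper that builds the command line; the function name wondershaper_cmd and the interface name in the usage note are hypothetical, and it assumes 1 Mbit/s = 1000 kbps.

```shell
# Hypothetical helper: print the wondershaper command for rates given in Mbit/s.
# Assumes the positional syntax shown above and 1 Mbit/s = 1000 kbps.
wondershaper_cmd() {
    iface=$1
    down_mbit=$2
    up_mbit=$3
    # Reject anything that is not a non-negative whole number.
    for rate in "$down_mbit" "$up_mbit"; do
        case "$rate" in
            ''|*[!0-9]*)
                echo "rates must be whole numbers of Mbit/s" >&2
                return 1
                ;;
        esac
    done
    # Print the command instead of running it, so it can be reviewed first.
    echo "wondershaper $iface $((down_mbit * 1000)) $((up_mbit * 1000))"
}
```

For example, wondershaper_cmd veth1a2b3c 10 5 prints a command that limits the interface to 10000 kbps down and 5000 kbps up.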
With the tool in hand, limiting a container's network starts with finding the host-side virtual network card vethxxx that corresponds to a given container ID. The problem is that neither the Docker API nor the docker command shows which host virtual network card a container maps to. Later, after talking with an acquaintance, Yun Laoxiao, I found a way:
- Step 1: obtain the SandboxKey of the container from the Docker engine (docker inspect <containerid>), for example /var/run/docker/netns/5b0e87d40fad
docker inspect <containerid>
/var/run/docker/netns/5b0e87d40fad
- Step 2: get the container's peer_ifindex, that is, the interface index of the host-side veth device paired with the container's eth0: nsenter --net=/var/run/docker/netns/5b0e87d40fad ethtool -S eth0 | grep peer_ifindex. The preconditions are that nsenter is installed on the host and that ethtool is available (this command only switches the network namespace, so the ethtool binary that runs is the host's). There is another pitfall here, which is described later
nsenter --net=/var/run/docker/netns/5b0e87d40fad ethtool -S eth0 |grep peer_ifindex
- Step 3: find the corresponding virtual network card on the host: ip a | grep <veth_id>
ip a|grep <veth_id>
- Step 4: rate-limit the container with wondershaper <interface> <download-rate> <upload-rate>
wondershaper <interface> <download-rate> <upload-rate>
- Step 5: test the effect. For simplicity, generate a large file of about 1 GB on the host (dd if=/dev/zero of=hello.txt bs=1G count=1), serve it with a simple Python HTTP server (python -m SimpleHTTPServer 8090), then download it from inside the container and watch the transfer rate (wget http://xxxx:8090/hello.txt)
dd if=/dev/zero of=hello.txt bs=1G count=1
wget http://xxxx:8090/hello.txt
python -m SimpleHTTPServer 8090
- Steps 1 and 2 above are actually the old way. You can simply run docker exec <container_id> ethtool -S eth0 | grep peer_ifindex instead, which requires ethtool to be present inside the container :-)
docker exec <container_id> ethtool -S eth0 |grep peer_ifindex
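The lookup in the steps above can be scripted. What follows is a minimal sketch, not a finished tool: the function names are illustrative, it assumes the container's interface is eth0, that ethtool exists inside the container (the docker exec form), and that ip -o link prints one interface per line starting with its index.

```shell
#!/bin/sh
# Sketch: map a container to its host-side veth device and rate-limit it.
# All function names here are illustrative, not part of any existing tool.

# Read `ethtool -S eth0` output on stdin and print the peer_ifindex number.
parse_peer_ifindex() {
    awk '$1 == "peer_ifindex:" { print $2; exit }'
}

# Read `ip -o link` output on stdin and print the name of the interface whose
# index is $1. Lines look like "7: veth1a2b3c@if6: <...> mtu 1500 ...".
ifname_by_index() {
    awk -v idx="$1" -F': ' '$1 == idx { sub(/@.*/, "", $2); print $2; exit }'
}

# Print the host-side veth name for container $1.
container_veth() {
    idx=$(docker exec "$1" ethtool -S eth0 | parse_peer_ifindex)
    [ -n "$idx" ] || return 1
    ip -o link | ifname_by_index "$idx"
}

# Limit container $1 to $2 kbps download and $3 kbps upload.
limit_container() {
    veth=$(container_veth "$1") || return 1
    [ -n "$veth" ] || return 1
    wondershaper "$veth" "$2" "$3"
}
```

With these functions loaded, limit_container <container_id> 10000 5000 would apply the limits in one call. Matching on the exact interface index avoids the false positives that a plain grep for the number can produce.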
Pitfalls encountered
- wondershaper has no effect when the rate is set above 10000. It turned out that the tool is simply too old. Fortunately it is a shell script: edit it, find the 10mbit setting, and change it to 1000mbit or 10000mbit.
- Running the nsenter command above always reported an 'invalid parameter' error, even though the command was correct. After half a day of searching, the cause turned out to be the systemd configuration shipped with older versions of Docker. You need to edit the /usr/lib/systemd/system/docker.service file, remove MountFlags=slave, and then restart the Docker engine.
/usr/lib/systemd/system/docker.service
MountFlags=slave
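Both fixes are one-line file edits, so they can be expressed with sed. This is a sketch under assumptions: the function names are made up for illustration, each function takes the file to edit as its argument so it can be tried on a copy first, and each leaves a .bak backup next to the edited file.

```shell
# Sketch of both fixes. The function names are hypothetical.

# Raise the hard-coded 10mbit setting in the wondershaper script ($1) to
# 1000mbit. Only a "10mbit" that is not preceded by another digit is changed.
raise_wondershaper_ceiling() {
    sed -i.bak 's/\([^0-9]\)10mbit/\11000mbit/g' "$1"
}

# Delete the MountFlags=slave line from a docker.service unit file ($1).
drop_mountflags_slave() {
    sed -i.bak '/^[[:space:]]*MountFlags=slave[[:space:]]*$/d' "$1"
}
```

On the host, the second function would be pointed at /usr/lib/systemd/system/docker.service. After editing a unit file, systemd needs systemctl daemon-reload before the restart (systemctl restart docker) picks up the change.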
Feasible overall plan
At this point, we can set the network bandwidth of a container by hand. In an integrated system (Mesos / k8s), however, the manual method is not realistic, so a scheme is needed to simplify and automate it.
Design ideas on Mesos + Marathon:
- Deploy an agent on each host to control network rates
- When Marathon deploys a container, set the environment variables CONTAINER_UPLOAD_RATE and CONTAINER_DOWNLOAD_RATE to specify the network rate limits of the container
- The agent watches the local Docker engine. When a container starts, it reads the container's ID and the rate environment variables above, then applies the same manual method as before, using wondershaper to limit the rate
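A rough sketch of such an agent in shell is shown below. It is an illustration under several assumptions rather than the implementation used here: the variable names CONTAINER_DOWNLOAD_RATE and CONTAINER_UPLOAD_RATE (in kbps), the eth0 interface name, ethtool being present inside each container, a Docker CLI recent enough to support docker events with --filter and --format, and all function names.

```shell
#!/bin/sh
# Sketch of a per-host rate-limiting agent. Names and variables are assumed.

# Read NAME=value lines on stdin and print the value of variable $1.
env_value() {
    awk -v name="$1" 'index($0, name "=") == 1 {
        print substr($0, length(name) + 2); exit
    }'
}

# Apply the limits requested by container $1, if it declares both of them.
apply_limits() {
    cid=$1
    envs=$(docker inspect --format '{{range .Config.Env}}{{println .}}{{end}}' "$cid") || return 1
    down=$(printf '%s\n' "$envs" | env_value CONTAINER_DOWNLOAD_RATE)
    up=$(printf '%s\n' "$envs" | env_value CONTAINER_UPLOAD_RATE)
    # Containers that do not ask for a limit are left alone.
    [ -n "$down" ] && [ -n "$up" ] || return 0

    # Same lookup as the manual method: peer_ifindex, then the veth name.
    idx=$(docker exec "$cid" ethtool -S eth0 | awk '$1 == "peer_ifindex:" { print $2; exit }')
    [ -n "$idx" ] || return 1
    veth=$(ip -o link | awk -v idx="$idx" -F': ' '$1 == idx { sub(/@.*/, "", $2); print $2; exit }')
    [ -n "$veth" ] || return 1
    wondershaper "$veth" "$down" "$up"
}

# Watch for container start events and limit each new container.
run_agent() {
    docker events --filter type=container --filter event=start --format '{{.ID}}' |
    while read -r cid; do
        apply_limits "$cid" || echo "could not limit $cid" >&2
    done
}
```

Calling run_agent as root on each host starts the loop. Containers that were already running before the agent started are not covered by this sketch and would need a one-time pass over docker ps -q.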
Advantages and disadvantages of the scheme:
- Disadvantage: bandwidth cannot be reported to Mesos as a resource for allocation. In other words, even if the network load of a host is already very high, Mesos will still deploy containers to it
- Advantage: the scheme at least achieves network traffic isolation while containers are running
- Advantage: it is simple
- Advantage: the agent can also expose a port to the management platform so that container rates can be controlled from there
Ultimate solution
In the end, we still need Mesos + Marathon to provide proper support. Just like CPU and memory, network bandwidth should also be a manageable resource.