Article directory
- Problem background
- Migration
  - Docker
    - Stop the Docker service
    - Modify configuration
    - Move files
    - Restart the Docker service
  - containerd
    - Stop service
    - Modify configuration
    - Move files
    - Restart the service
  - kubelet (encountered problems to be solved)
    - Stop service
    - Modify configuration
    - Move files (encounter problems to be solved)
    - Restart the service
- Version used
Problem background
By default, the working directories of kubelet, Docker, and containerd all live under /var/lib.
However, the cloud machine rented by our school laboratory has only a small disk mounted at /, with a large data disk mounted at /mnt/data_mnt/.
Because the working directories sit on the small disk, / fills up quickly. Once usage of / exceeds 80%, kubelet decides that disk space is insufficient, reports DiskPressure, and the node enters the NotReady state.
(The following output was captured after the migration.)
root@iZhp3hqett0mw795req5b2Z:~# df -h | head
Filesystem      Size  Used Avail Use% Mounted on
udev             16G     0   16G   0% /dev
tmpfs            16G   19M   16G   1% /run
/dev/vda1        99G   48G   46G  51% /
tmpfs            16G     0   16G   0% /dev/shm
tmpfs           5.0M     0  5.0M   0% /run/lock
tmpfs            16G     0   16G   0% /sys/fs/cgroup
/dev/vdb1       493G  120G  348G  26% /mnt/data_mnt
overlay          99G   48G   46G  51% /var/lib/containers/storage/overlay/54a47bbff1442f521326770cab94eb3221d82b0ff9e997c1b2efe6cad811b21b/merged
overlay          99G   48G   46G  51% /var/lib/containers/storage/overlay/a74d553e701c85c5ad25fd14a8fd30383e0dc21f4b567bc81e6b7ac74bc73524/merged
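The disk check described above can be approximated from the shell. The following is only a sketch: the function names usage_pct and over_threshold are made up for this article, and the 80 limit is the figure quoted in the text, not a value read from kubelet (the real eviction thresholds depend on the kubelet configuration).

```shell
# Print the used percentage (without the % sign) of the filesystem
# that holds the given path.
usage_pct() {
    # -P forces the portable format: one line per filesystem,
    # with the capacity in column 5.
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Succeed if usage of the given path is above the given limit.
# Hypothetical helper; the limit is whatever the caller passes in.
over_threshold() {
    mount_point=$1
    limit=$2
    pct=$(usage_pct "$mount_point") || return 2
    [ "$pct" -gt "$limit" ]
}
```

For example, `over_threshold / 80 && echo "root filesystem is above 80%"` prints a warning on a machine in the state described above.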
Migration
Docker
Stop the Docker service
Delete all containers first, then stop the service.
systemctl stop docker
Modify configuration
The Docker configuration file is /etc/docker/daemon.json. Add a data-root field to set the data directory.
See the official documentation: https://docs.docker.com/config/daemon/#daemon-data-directory
Modified example:
{
  "registry-mirrors": [
    "https://dockerhub.azk8s.cn",
    "https://hub-mirror.c.163.com",
    "https://reg-mirror.qiniu.com"
  ],
  "builder": {
    "gc": {
      "defaultKeepStorage": "20GB",
      "enabled": true
    }
  },
  "experimental": true,
  "features": {
    "buildkit": false
  },
  "dns": ["8.8.8.8", "8.8.4.4"],
  "data-root": "/mnt/data_mnt/var/lib/docker"
}
Move files
Copy /var/lib/docker to /mnt/data_mnt/var/lib/docker, preserving ownership and permissions.
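The copy step can be done in several ways; the article does not say which command was used. As a minimal sketch, assuming plain cp is acceptable, the hypothetical helper copy_tree below copies a directory's contents with `cp -a`, which keeps ownership, permissions, timestamps, and symbolic links.

```shell
# Copy the contents of one directory into another, preserving metadata.
# copy_tree is a name invented for this sketch.
copy_tree() {
    src=$1
    dst=$2
    [ -d "$src" ] || { echo "source $src is not a directory" >&2; return 1; }
    mkdir -p "$dst" || return 1
    # The trailing "/." copies what is inside src (dotfiles included)
    # rather than creating dst/<basename of src>.
    cp -a "$src/." "$dst/"
}
```

With the service stopped, it would be run as root, for example `copy_tree /var/lib/docker /mnt/data_mnt/var/lib/docker`. Keeping the original directory until the restarted service has been checked leaves a way back.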
Restart the Docker service
systemctl start docker
# Run nginx and see
docker run -p 80:80 nginx
# Check service status
systemctl status docker
● docker.service - Docker Application Container Engine
     Loaded: loaded (/lib/systemd/system/docker.service; enabled; vendor preset: enabled)
     Active: active (running) since Tue 2023-10-17 22:28:47 CST; 12h ago
       Docs: https://docs.docker.com
   Main PID: 3917580 (dockerd)
      Tasks: 25
     Memory: 1.0G
        CPU: 1min 16.247s
     CGroup: /system.slice/docker.service
             ├─ 370428 /usr/bin/docker-proxy -proto tcp -host-ip 0.0.0.0 -host-port 5050 -container-ip 172.17.0.2 -container-port 5000
             └─3917580 /usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock
Oct 18 10:11:41 iZhp3hqett0mw795req5b2Z dockerd[3917580]: time="2023-10-18T10:11:41.715286425+08:00" level=error msg="Handler for POST /v1.41/containers/f66c7e907176ccd2abe010253448ab6dcab286c60f893b4cde72184215747d90/start returned error: driver
Oct 18 10:17:18 iZhp3hqett0mw795req5b2Z dockerd[3917580]: time="2023-10-18T10:17:18.451142888+08:00" level=info msg="Attempting next endpoint for push after error: Get \"https://localhost:5000/v2/": http: server gave HTTP response to HTTPS client
Oct 18 10:17:18 iZhp3hqett0mw795req5b2Z dockerd[3917580]: time="2023-10-18T10:17:18.455921606+08:00" level=error msg="Upload failed: no basic auth credentials"
Oct 18 10:17:18 iZhp3hqett0mw795req5b2Z dockerd[3917580]: time="2023-10-18T10:17:18.455953643+08:00" level=error msg="Upload failed: no basic auth credentials"
Oct 18 10:17:18 iZhp3hqett0mw795req5b2Z dockerd[3917580]: time="2023-10-18T10:17:18.455930600+08:00" level=error msg="Upload failed: no basic auth credentials"
Oct 18 10:17:18 iZhp3hqett0mw795req5b2Z dockerd[3917580]: time="2023-10-18T10:17:18.456010183+08:00" level=error msg="Upload failed: no basic auth credentials"
Oct 18 10:17:18 iZhp3hqett0mw795req5b2Z dockerd[3917580]: time="2023-10-18T10:17:18.456058582+08:00" level=info msg="Attempting next endpoint for push after error: no basic auth credentials"
Oct 18 10:18:56 iZhp3hqett0mw795req5b2Z dockerd[3917580]: time="2023-10-18T10:18:56.354196507+08:00" level=info msg="Attempting next endpoint for push after error: Get \"https://localhost:5050/v2/": http: server gave HTTP response to HTTPS client
Oct 18 10:19:02 iZhp3hqett0mw795req5b2Z dockerd[3917580]: time="2023-10-18T10:19:02.439702060+08:00" level=info msg="Attempting next endpoint for push after error: Get \"https://localhost:5050/v2/": http: server gave HTTP response to HTTPS client
Oct 18 10:19:07 iZhp3hqett0mw795req5b2Z dockerd[3917580]: time="2023-10-18T10:19:07.267669420+08:00" level=info msg="Attempting next endpoint for push after error: Get \"https://localhost:5050/v2/": http: server gave HTTP response to HTTPS client
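The status output shows that the daemon is running, but not which data directory it uses. One way to confirm that, sketched here under the assumption that the docker CLI can reach the daemon, is to read the DockerRootDir field from `docker info`. The function name check_docker_root is invented for this example.

```shell
# Compare the data directory reported by Docker with the expected path.
check_docker_root() {
    expected=$1
    # DockerRootDir is the field behind the "Docker Root Dir" line of `docker info`.
    actual=$(docker info --format '{{.DockerRootDir}}') || return 1
    if [ "$actual" = "$expected" ]; then
        echo "ok: Docker root is $actual"
    else
        echo "mismatch: Docker root is $actual, expected $expected" >&2
        return 1
    fi
}
```

After the restart, `check_docker_root /mnt/data_mnt/var/lib/docker` should report ok if the data-root setting took effect.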
containerd
Stop service
systemctl stop containerd
Modify configuration
The configuration file is /etc/containerd/config.toml.
It contains a root key, which sets the working directory and defaults to /var/lib/containerd. Change it to root = "/mnt/data_mnt/var/lib/containerd".
If you change something by accident, you can regenerate the default configuration:
containerd config default > /etc/containerd/config.toml
Example after modification:
version = 2
root = "/mnt/data_mnt/var/lib/containerd"
state = "/run/containerd"
oom_score = 0

[grpc]
  address = "/run/containerd/containerd.sock"
  uid = 0
  gid = 0
  max_recv_message_size = 16777216
  max_send_message_size = 16777216

[debug]
  address = "/run/containerd/containerd-debug.sock"
  uid = 0
  gid = 0
  level = "warn"

[timeouts]
  "io.containerd.timeout.shim.cleanup" = "5s"
  "io.containerd.timeout.shim.load" = "5s"
  "io.containerd.timeout.shim.shutdown" = "3s"
  "io.containerd.timeout.task.state" = "2s"

[plugins]
  [plugins."io.containerd.grpc.v1.cri"]
    sandbox_image = "sealos.hub:5000/pause:3.9"
    max_container_log_line_size = -1
    max_concurrent_downloads = 20
    disable_apparmor = false
    [plugins."io.containerd.grpc.v1.cri".containerd]
      snapshotter = "overlayfs"
      default_runtime_name = "runc"
      [plugins."io.containerd.grpc.v1.cri".containerd.runtimes]
        [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
          runtime_type = "io.containerd.runc.v2"
          runtime_engine = ""
          runtime_root = ""
          [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
            SystemdCgroup = true
    [plugins."io.containerd.grpc.v1.cri".registry]
      config_path = "/etc/containerd/certs.d"
      [plugins."io.containerd.grpc.v1.cri".registry.configs]
        [plugins."io.containerd.grpc.v1.cri".registry.configs."sealos.hub:5000".auth]
          username = "admin"
          password = "passw0rd"
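Before restarting, it can help to confirm which root the file actually sets. The sketch below is not a TOML parser: it assumes the top-level root key sits on its own unindented line with a double-quoted value, as in the example above, and the function name containerd_root is invented here.

```shell
# Print the value of the top-level `root` key from a containerd config file.
# Returns non-zero if no such line exists (containerd then uses its default).
containerd_root() {
    config=${1:-/etc/containerd/config.toml}
    # Only an unindented line starting with "root" matches, so nested keys
    # such as runtime_root are ignored. Field 2 is the text between the quotes.
    awk -F'"' '/^root[ \t]*=/ { print $2; found = 1; exit }
               END { if (!found) exit 1 }' "$config"
}
```

Running `containerd_root` with no argument reads /etc/containerd/config.toml and should print /mnt/data_mnt/var/lib/containerd after the change.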
Move files
Copy /var/lib/containerd to /mnt/data_mnt/var/lib/containerd, preserving ownership and permissions.
Restart the service
systemctl start containerd
systemctl status containerd
kubelet (encountered problems to be solved)
Stop service
systemctl stop kubelet
Modify configuration
On my machine, the kubelet service configuration is in /etc/systemd/system/kubelet.service.d/10-kubeadm.conf.
Note that the same directory may also contain /etc/systemd/system/kubelet.service.d/override.conf. systemd reads drop-in files in lexicographic order of their names, so settings in override.conf take precedence over those in 10-kubeadm.conf.
Example of modified content:
# Note: This dropin only works with kubeadm and kubelet v1.11+
[Service]
Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf"
Environment="KUBELET_CONFIG_ARGS=--config=/mnt/data_mnt/var/lib/kubelet/config.yaml"
# This is a file that "kubeadm init" and "kubeadm join" generates at runtime, populating the KUBELET_KUBEADM_ARGS variable dynamically
EnvironmentFile=-/mnt/data_mnt/var/lib/kubelet/kubeadm-flags.env
# This is a file that the user can use for overrides of the kubelet args as a last resort. Preferably, the user should use
# the .NodeRegistration.KubeletExtraArgs object in the configuration files instead. KUBELET_EXTRA_ARGS should be sourced from this file.
Environment="KUBELET_EXTRA_ARGS=--runtime-request-timeout=15m --container-runtime-endpoint=unix:///var/run/containerd/containerd.sock --image-service-endpoint=unix:///var/run/image-cri-shim.sock"
ExecStart=
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS
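Since the order of the drop-in files decides which setting wins, it is worth listing them before editing. The sketch below prints the .conf files of a drop-in directory in byte-wise sorted order, which is a reasonable stand-in for the order systemd applies them in; the function name list_dropins is invented for this example.

```shell
# List the *.conf drop-in files of a directory in sorted order.
# Files later in the list override settings from earlier ones.
list_dropins() {
    dir=$1
    for f in "$dir"/*.conf; do
        # Skip the unexpanded pattern when the directory has no .conf files.
        [ -e "$f" ] || continue
        echo "${f##*/}"
    done | LC_ALL=C sort
}
```

For example, `list_dropins /etc/systemd/system/kubelet.service.d` prints 10-kubeadm.conf before override.conf when both exist. `systemctl cat kubelet` shows the unit file together with every drop-in that systemd actually loaded.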
In addition, update the certificate and key paths configured in /etc/kubernetes/kubelet.conf. Modified example (excerpt):
# The above is omitted
users:
- name: system:node:izhp3hqett0mw795req5b2z
  user:
    client-certificate: /mnt/data_mnt/var/lib/kubelet/pki/kubelet-client-current.pem
    client-key: /mnt/data_mnt/var/lib/kubelet/pki/kubelet-client-current.pem
The symbolic link must also be recreated. The key is read through a symbolic link named "current", which points to the actual dated key file, and that link is broken after the move. Recreate it from inside the pki directory:
ln -s kubelet-client-2023-10-07-11-14-02.pem kubelet-client-current.pem
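The same step can be scripted so that the dated file name does not have to be typed by hand. This is a sketch under two assumptions: the dated files follow the kubelet-client-YYYY-MM-DD-hh-mm-ss.pem pattern seen above, so that sorting by name is sorting by time, and the newest one is the one to link. The function name relink_current_cert is invented here.

```shell
# Point kubelet-client-current.pem at the newest dated certificate file
# in the given pki directory.
relink_current_cert() {
    pki_dir=$1
    newest=
    # The glob expands in sorted order, so the last match is the newest.
    # "2*" matches the year and skips kubelet-client-current.pem itself.
    for f in "$pki_dir"/kubelet-client-2*.pem; do
        [ -e "$f" ] || continue
        newest=${f##*/}
    done
    [ -n "$newest" ] || { echo "no dated certificate in $pki_dir" >&2; return 1; }
    # A relative target keeps the link valid if the directory moves again.
    (cd "$pki_dir" && ln -sfn "$newest" kubelet-client-current.pem)
}
```

It would be called as `relink_current_cert /mnt/data_mnt/var/lib/kubelet/pki`; `ls -l` on the directory then shows where the link points.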
Move files (encounter problems to be solved)
Some files cannot be deleted…
root@iZhp3hqett0mw795req5b2Z:~# rm -rf /var/lib/kubelet
rm: cannot remove '/var/lib/kubelet/pods/30c0099f-dfcc-4e6f-893e-eacc6ed44021/volumes/kubernetes.io~projected/kube-api-access-6jt8n': Device or resource busy
rm: cannot remove '/var/lib/kubelet/pods/30c0099f-dfcc-4e6f-893e-eacc6ed44021/volumes/kubernetes.io~empty-dir/tmp-volume': Device or resource busy
rm: cannot remove '/var/lib/kubelet/pods/54e7cb22-fdab-4e33-afb3-c8ba88d153a2/volumes/kubernetes.io~projected/kube-api-access-j84xs': Device or resource busy
rm: cannot remove '/var/lib/kubelet/pods/d1a3fba3-3ab8-4ef9-b61c-6479b26c79f7/volumes/kubernetes.io~projected/kube-api-access-lf5tx': Device or resource busy
rm: cannot remove '/var/lib/kubelet/pods/5e38f3a0-7f59-4d2e-98f4-1ec915e6ba89/volumes/kubernetes.io~projected/kube-api-access-prz4v': Device or resource busy
rm: cannot remove '/var/lib/kubelet/pods/0f02517c-01c3-4b58-9f85-be169a92a31d/volumes/kubernetes.io~projected/kube-api-access-r4kxp': Device or resource busy
rm: cannot remove '/var/lib/kubelet/pods/7098d438-0a9d-40df-aee1-ec4884ba262f/volumes/kubernetes.io~projected/kube-api-access-rqtwq': Device or resource busy
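A likely cause, though not confirmed in this article, is that these paths are still mount points: "Device or resource busy" is the usual error when removing a directory that has a filesystem mounted on it, and projected and memory-backed volumes are mounted under the pod directories. The sketch below lists the mount points under a directory by reading the mount table; the function name mounts_under is invented here, and /proc/mounts as the default table assumes Linux.

```shell
# Print every mount point at or below a directory, children before parents,
# so the list can be unmounted in the order printed.
mounts_under() {
    prefix=$1
    table=${2:-/proc/mounts}
    # Column 2 of the mount table is the mount point. Matching on "prefix/"
    # avoids picking up sibling directories that merely share the prefix.
    awk -v p="$prefix" '$2 == p || index($2, p "/") == 1 { print $2 }' "$table" |
        LC_ALL=C sort -r
}
```

If `mounts_under /var/lib/kubelet` prints the paths from the error output, they can be unmounted with `mounts_under /var/lib/kubelet | while read -r m; do umount "$m"; done` once the pods that use them have been stopped, after which the directory can be removed.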
Restart the service
# Reload unit files so that the edited drop-in is picked up
systemctl daemon-reload
systemctl start kubelet
systemctl status kubelet
Version used
Date: October 18, 2023
Version
root@iZhp3hqett0mw795req5b2Z:~# kubectl version
WARNING: This version information is deprecated and will be replaced with the output from kubectl version --short. Use --output=yaml|json to get the full version.
Client Version: version.Info{Major:"1", Minor:"27", GitVersion:"v1.27.3", GitCommit:"25b4e43193bcda6c7328a6d147b1fb73a33f1598", GitTreeState:"clean", BuildDate:"2023-06-14T09:53:42Z", GoVersion:"go1.20.5", Compiler:"gc", Platform:"linux/amd64"}
Kustomize Version: v5.0.1
Server Version: version.Info{Major:"1", Minor:"27", GitVersion:"v1.27.3", GitCommit:"25b4e43193bcda6c7328a6d147b1fb73a33f1598", GitTreeState:"clean", BuildDate:"2023-06-14T09:47:40Z", GoVersion:"go1.20.5", Compiler:"gc", Platform:"linux/amd64"}
root@iZhp3hqett0mw795req5b2Z:~# docker version
Client:
 Version:           20.10.21
 API version:       1.41
 Go version:        go1.18.1
 Git commit:        20.10.21-0ubuntu1~18.04.3
 Built:             Thu Apr 27 05:50:21 2023
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true

Server:
 Engine:
  Version:          20.10.21
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.18.1
  Git commit:       20.10.21-0ubuntu1~18.04.3
  Built:            Thu Apr 27 05:36:22 2023
  OS/Arch:          linux/amd64
  Experimental:     true
 containerd:
  Version:          1.6.12-0ubuntu1~18.04.1
  GitCommit:
 runc:
  Version:          1.1.4-0ubuntu1~18.04.2
  GitCommit:
 docker-init:
  Version:          0.19.0
  GitCommit:
root@iZhp3hqett0mw795req5b2Z:~# containerd --version
containerd github.com/containerd/containerd 1.6.12-0ubuntu1~18.04.1