Migrate kubelet, docker and containerd working directories

Article directory

  • Problem background
  • Migration
    • Docker
      • Stop the Docker service
      • Modify configuration
      • Move files
      • Restart the Docker service
    • containerd
      • Stop service
      • Modify configuration
      • Move files
      • Restart the service
    • kubelet (encountered problems to be solved)
      • Stop service
      • Modify configuration
      • Move files (encounter problems to be solved)
      • Restart the service
  • Version used

Problem background

By default, the working directories of kubelet, Docker and containerd all live under /var/lib.
The cloud machine rented by our school laboratory, however, has only a small disk mounted at / and a much larger data disk mounted at /mnt/data_mnt/.
Because those working directories sit on the small disk, / fills up quickly. Once / was more than about 80% full, kubelet considered disk space insufficient, reported DiskPressure, and the node went NotReady.

(The output below was captured after the migration.)

root@iZhp3hqett0mw795req5b2Z:~# df -h | head
Filesystem Size Used Avail Use% Mounted on
udev 16G 0 16G 0% /dev
tmpfs 16G 19M 16G 1% /run
/dev/vda1 99G 48G 46G 51% /
tmpfs 16G 0 16G 0% /dev/shm
tmpfs 5.0M 0 5.0M 0% /run/lock
tmpfs 16G 0 16G 0% /sys/fs/cgroup
/dev/vdb1 493G 120G 348G 26% /mnt/data_mnt
overlay 99G 48G 46G 51% /var/lib/containers/storage/overlay/54a47bbff1442f521326770cab94eb3221d82b0ff9e997c1b2efe6cad811b21b/merged
overlay 99G 48G 46G 51% /var/lib/containers/storage/overlay/a74d553e701c85c5ad25fd14a8fd30383e0dc21f4b567bc81e6b7ac74bc73524/merged
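
To watch how close / is to that mark before and after migrating, something like the following may help. It is a minimal sketch: usage_percent is a hypothetical helper name, and the 80% figure is the one observed above, not a value read from the kubelet configuration.

```shell
# Print the used-space percentage (without the % sign) of the filesystem
# that contains the given path. usage_percent is a hypothetical helper name.
usage_percent() {
    # -P forces the single-line POSIX output format, so column 5 is the usage.
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Warn when / crosses the 80% mark mentioned above (an observed figure,
# not one read from the kubelet configuration).
root_usage=$(usage_percent /)
if [ "$root_usage" -ge 80 ]; then
    echo "/ is ${root_usage}% full; kubelet may report DiskPressure"
else
    echo "/ is ${root_usage}% full"
fi
```

Running du -sh /var/lib/docker /var/lib/containerd /var/lib/kubelet as root shows which of the three directories takes the most space.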

Migration

Docker

Stop the Docker service

Remove all containers first, then stop the service.

systemctl stop docker

Modify configuration

The Docker configuration file is /etc/docker/daemon.json. Add a data-root field to set the data directory.

See the official documentation: https://docs.docker.com/config/daemon/#daemon-data-directory

Modified example:

{
  "registry-mirrors": [
    "https://dockerhub.azk8s.cn",
    "https://hub-mirror.c.163.com",
    "https://reg-mirror.qiniu.com"
  ],
  "builder": {
    "gc": {
      "defaultKeepStorage": "20GB",
      "enabled": true
    }
  },
  "experimental": true,
  "features": {
    "buildkit": false
  },
  "dns": ["8.8.8.8", "8.8.4.4"],
  "data-root": "/mnt/data_mnt/var/lib/docker"
}

Move files

Copy /var/lib/docker to /mnt/data_mnt/var/lib/docker
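
The copy should preserve ownership, permissions and symlinks, otherwise the daemon may not be able to read its data afterwards. A minimal sketch, assuming a cp that supports -a and that the service is already stopped; copy_tree is a hypothetical helper name, not part of Docker:

```shell
# Copy the contents of a directory to a new location, preserving ownership,
# permissions, timestamps and symlinks (cp -a).
# copy_tree is a hypothetical helper; the function body runs in a subshell.
copy_tree() (
    src=$1
    dst=$2
    [ -d "$src" ] || { echo "source $src not found" >&2; exit 1; }
    mkdir -p "$dst"
    # "$src/." copies the contents, including hidden files, into $dst.
    cp -a "$src/." "$dst/"
)
```

For the layout in this article the call would be copy_tree /var/lib/docker /mnt/data_mnt/var/lib/docker, run as root. Keeping the old directory until the restarted service has been checked leaves a way back.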

Restart the Docker service

systemctl start docker

# Run nginx and see
docker run -p 80:80 nginx

# Check service status
systemctl status docker


● docker.service - Docker Application Container Engine
   Loaded: loaded (/lib/systemd/system/docker.service; enabled; vendor preset: enabled)
   Active: active (running) since Tue 2023-10-17 22:28:47 CST; 12h ago
     Docs: https://docs.docker.com
 Main PID: 3917580 (dockerd)
    Tasks: 25
   Memory: 1.0G
      CPU: 1min 16.247s
   CGroup: /system.slice/docker.service
           ├─ 370428 /usr/bin/docker-proxy -proto tcp -host-ip 0.0.0.0 -host-port 5050 -container-ip 172.17.0.2 -container-port 5000
           └─3917580 /usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock

Oct 18 10:11:41 iZhp3hqett0mw795req5b2Z dockerd[3917580]: time="2023-10-18T10:11:41.715286425+08:00" level=error msg="Handler for POST /v1.41/containers/f66c7e907176ccd2abe010253448ab6dcab286c60f893b4cde72184215747d90/start returned error: driver
Oct 18 10:17:18 iZhp3hqett0mw795req5b2Z dockerd[3917580]: time="2023-10-18T10:17:18.451142888+08:00" level=info msg="Attempting next endpoint for push after error: Get \"https://localhost:5000/v2/": http: server gave HTTP response to HTTPS client
Oct 18 10:17:18 iZhp3hqett0mw795req5b2Z dockerd[3917580]: time="2023-10-18T10:17:18.455921606+08:00" level=error msg="Upload failed: no basic auth credentials"
Oct 18 10:17:18 iZhp3hqett0mw795req5b2Z dockerd[3917580]: time="2023-10-18T10:17:18.455953643+08:00" level=error msg="Upload failed: no basic auth credentials"
Oct 18 10:17:18 iZhp3hqett0mw795req5b2Z dockerd[3917580]: time="2023-10-18T10:17:18.455930600+08:00" level=error msg="Upload failed: no basic auth credentials"
Oct 18 10:17:18 iZhp3hqett0mw795req5b2Z dockerd[3917580]: time="2023-10-18T10:17:18.456010183+08:00" level=error msg="Upload failed: no basic auth credentials"
Oct 18 10:17:18 iZhp3hqett0mw795req5b2Z dockerd[3917580]: time="2023-10-18T10:17:18.456058582+08:00" level=info msg="Attempting next endpoint for push after error: no basic auth credentials"
Oct 18 10:18:56 iZhp3hqett0mw795req5b2Z dockerd[3917580]: time="2023-10-18T10:18:56.354196507+08:00" level=info msg="Attempting next endpoint for push after error: Get \"https://localhost:5050/v2/": http: server gave HTTP response to HTTPS client
Oct 18 10:19:02 iZhp3hqett0mw795req5b2Z dockerd[3917580]: time="2023-10-18T10:19:02.439702060+08:00" level=info msg="Attempting next endpoint for push after error: Get \"https://localhost:5050/v2/": http: server gave HTTP response to HTTPS client
Oct 18 10:19:07 iZhp3hqett0mw795req5b2Z dockerd[3917580]: time="2023-10-18T10:19:07.267669420+08:00" level=info msg="Attempting next endpoint for push after error: Get \"https://localhost:5050/v2/": http: server gave HTTP response to HTTPS client
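
The status output shows that the daemon is running, but not which data directory it uses. One way to confirm that, sketched below, is to ask the daemon for its DockerRootDir field. verify_docker_root is a hypothetical helper name, and the expected path is the one configured in daemon.json above.

```shell
# Compare the data directory reported by the Docker daemon with the expected one.
# verify_docker_root is a hypothetical helper, not a Docker command.
verify_docker_root() (
    expected=$1
    # DockerRootDir is the field that "docker info" prints as "Docker Root Dir".
    actual=$(docker info --format '{{.DockerRootDir}}') || exit 2
    if [ "$actual" = "$expected" ]; then
        echo "OK: Docker data root is $actual"
    else
        echo "MISMATCH: expected $expected, daemon reports $actual" >&2
        exit 1
    fi
)
```

A call such as verify_docker_root /mnt/data_mnt/var/lib/docker returns a non-zero status when the daemon still points at the old location or cannot be reached.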

containerd

Stop service

systemctl stop containerd

Modify configuration

The configuration file is in /etc/containerd/config.toml.

The root field sets the working directory, which defaults to /var/lib/containerd. Change it to /mnt/data_mnt/var/lib/containerd.

If the file gets broken by accident, you can regenerate the default configuration (this overwrites the existing file):

containerd config default > /etc/containerd/config.toml

Example after modification:

version = 2
root = "/mnt/data_mnt/var/lib/containerd"
state = "/run/containerd"
oom_score = 0

[grpc]
  address = "/run/containerd/containerd.sock"
  uid = 0
  gid = 0
  max_recv_message_size = 16777216
  max_send_message_size = 16777216

[debug]
  address = "/run/containerd/containerd-debug.sock"
  uid = 0
  gid = 0
  level = "warn"

[timeouts]
  "io.containerd.timeout.shim.cleanup" = "5s"
  "io.containerd.timeout.shim.load" = "5s"
  "io.containerd.timeout.shim.shutdown" = "3s"
  "io.containerd.timeout.task.state" = "2s"

[plugins]
  [plugins."io.containerd.grpc.v1.cri"]
    sandbox_image = "sealos.hub:5000/pause:3.9"
    max_container_log_line_size = -1
    max_concurrent_downloads = 20
    disable_apparmor = false
    [plugins."io.containerd.grpc.v1.cri".containerd]
      snapshotter = "overlayfs"
      default_runtime_name = "runc"
      [plugins."io.containerd.grpc.v1.cri".containerd.runtimes]
        [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
          runtime_type = "io.containerd.runc.v2"
          runtime_engine = ""
          runtime_root = ""
          [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
            SystemdCgroup = true
    [plugins."io.containerd.grpc.v1.cri".registry]
      config_path = "/etc/containerd/certs.d"
      [plugins."io.containerd.grpc.v1.cri".registry.configs]
          [plugins."io.containerd.grpc.v1.cri".registry.configs."sealos.hub:5000".auth]
            username = "admin"
            password = "passw0rd"
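
Before restarting, it can be worth confirming which root the file actually sets. The sketch below reads the top-level root key with sed. containerd_root is a hypothetical helper, and it assumes the key is written on a single unindented line as in the example above.

```shell
# Print the value of the top-level root key from a containerd config file.
# containerd_root is a hypothetical helper; it assumes root = "..." sits on one
# unindented line, so indented keys such as runtime_root are not matched.
containerd_root() {
    sed -n 's/^root[[:space:]]*=[[:space:]]*"\(.*\)"[[:space:]]*$/\1/p' \
        "${1:-/etc/containerd/config.toml}" | head -n 1
}
```

With no argument the helper reads /etc/containerd/config.toml. The containerd config dump command, which prints the configuration the daemon would load, is another way to check.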

Move files

Copy /var/lib/containerd to /mnt/data_mnt/var/lib/containerd

Restart the service

systemctl start containerd

systemctl status containerd

kubelet (encountered problems to be solved)

Stop service

systemctl stop kubelet

Modify configuration

On my machine the kubelet service configuration is in /etc/systemd/system/kubelet.service.d/10-kubeadm.conf.

If a file /etc/systemd/system/kubelet.service.d/override.conf exists in the same directory, its settings override those in 10-kubeadm.conf at run time, so check that file as well.
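
The reason is ordering: systemd reads the drop-in files of a unit in lexicographic order of file name, and settings in later files take precedence. The sketch below only lists the files of one directory in that order. dropin_order is a hypothetical helper; it does not look at drop-in directories elsewhere on the system.

```shell
# List the drop-in files of one unit directory in lexicographic order,
# which is the order systemd applies them; the last file listed wins.
# dropin_order is a hypothetical helper, not a systemd command.
dropin_order() (
    dir=${1:-/etc/systemd/system/kubelet.service.d}
    cd "$dir" || exit 1
    # LC_ALL=C gives plain byte-order sorting, so "10-..." sorts before "override".
    for f in *.conf; do
        [ -e "$f" ] && printf '%s\n' "$f"
    done | LC_ALL=C sort
)
```

On a real host, systemctl cat kubelet prints the unit file together with every drop-in that systemd actually loaded.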

Example of modified content:

# Note: This dropin only works with kubeadm and kubelet v1.11+
[Service]
Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf"
Environment="KUBELET_CONFIG_ARGS=--config=/mnt/data_mnt/var/lib/kubelet/config.yaml"
# This is a file that "kubeadm init" and "kubeadm join" generates at runtime, populating the KUBELET_KUBEADM_ARGS variable dynamically
EnvironmentFile=-/mnt/data_mnt/var/lib/kubelet/kubeadm-flags.env
# This is a file that the user can use for overrides of the kubelet args as a last resort. Preferably, the user should use
# the .NodeRegistration.KubeletExtraArgs object in the configuration files instead. KUBELET_EXTRA_ARGS should be sourced from this file.
Environment="KUBELET_EXTRA_ARGS= \
               \
               \
              --runtime-request-timeout=15m --container-runtime-endpoint=unix:///var/run/containerd/containerd.sock --image-service-endpoint=unix:///var/run/image-cri-shim.sock"
ExecStart=
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS

You also need to update the certificate and key paths configured in /etc/kubernetes/kubelet.conf. Modified example (excerpt):

# The above is omitted
users:
- name: system:node:izhp3hqett0mw795req5b2z
  user:
    client-certificate: /mnt/data_mnt/var/lib/kubelet/pki/kubelet-client-current.pem
    client-key: /mnt/data_mnt/var/lib/kubelet/pki/kubelet-client-current.pem

A symbolic link also has to be recreated. kubelet locates the actual dated certificate file through the symlink named kubelet-client-current.pem, and that link no longer resolves correctly after the move.

# Run inside /mnt/data_mnt/var/lib/kubelet/pki; -f replaces a stale link
ln -sf kubelet-client-2023-10-07-11-14-02.pem kubelet-client-current.pem

Move files (encounter problems to be solved)

Some directories under /var/lib/kubelet cannot be deleted (still unresolved):

root@iZhp3hqett0mw795req5b2Z:~# rm -rf /var/lib/kubelet
rm: cannot remove '/var/lib/kubelet/pods/30c0099f-dfcc-4e6f-893e-eacc6ed44021/volumes/kubernetes.io~projected/kube-api-access-6jt8n': Device or resource busy
rm: cannot remove '/var/lib/kubelet/pods/30c0099f-dfcc-4e6f-893e-eacc6ed44021/volumes/kubernetes.io~empty-dir/tmp-volume': Device or resource busy
rm: cannot remove '/var/lib/kubelet/pods/54e7cb22-fdab-4e33-afb3-c8ba88d153a2/volumes/kubernetes.io~projected/kube-api-access-j84xs': Device or resource busy
rm: cannot remove '/var/lib/kubelet/pods/d1a3fba3-3ab8-4ef9-b61c-6479b26c79f7/volumes/kubernetes.io~projected/kube-api-access-lf5tx': Device or resource busy
rm: cannot remove '/var/lib/kubelet/pods/5e38f3a0-7f59-4d2e-98f4-1ec915e6ba89/volumes/kubernetes.io~projected/kube-api-access-prz4v': Device or resource busy
rm: cannot remove '/var/lib/kubelet/pods/0f02517c-01c3-4b58-9f85-be169a92a31d/volumes/kubernetes.io~projected/kube-api-access-r4kxp': Device or resource busy
rm: cannot remove '/var/lib/kubelet/pods/7098d438-0a9d-40df-aee1-ec4884ba262f/volumes/kubernetes.io~projected/kube-api-access-rqtwq': Device or resource busy
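
"Device or resource busy" from rm usually means the path is still a mount point, and the kube-api-access and empty-dir volumes listed above are typically mounted by kubelet for running pods. This is a likely explanation rather than a confirmed one. The sketch below lists mount points under a directory by reading /proc/mounts; mounts_under is a hypothetical helper.

```shell
# List mount points located under a directory, children before parents, by
# reading a mounts table in /proc/mounts format.
# mounts_under is a hypothetical helper; paths containing spaces are not handled.
mounts_under() {
    # Field 2 of each line is the mount point; keep those below "$1/".
    awk -v prefix="$1/" 'index($2, prefix) == 1 { print $2 }' \
        "${2:-/proc/mounts}" | LC_ALL=C sort -r
}
```

If mounts_under /var/lib/kubelet prints the same paths that rm complained about, unmounting each of them with umount as root, while kubelet and the pods are stopped, should let the removal go through.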

Restart the service

systemctl start kubelet

systemctl status kubelet

Version used

Date: October 18, 2023

Version

root@iZhp3hqett0mw795req5b2Z:~# kubectl version
WARNING: This version information is deprecated and will be replaced with the output from kubectl version --short. Use --output=yaml|json to get the full version.
Client Version: version.Info{Major:"1", Minor:"27", GitVersion:"v1.27.3", GitCommit:"25b4e43193bcda6c7328a6d147b1fb73a33f1598", GitTreeState:"clean", BuildDate:"2023-06-14T09:53:42Z", GoVersion:"go1.20.5", Compiler:"gc", Platform:"linux/amd64"}
Kustomize Version: v5.0.1
Server Version: version.Info{Major:"1", Minor:"27", GitVersion:"v1.27.3", GitCommit:"25b4e43193bcda6c7328a6d147b1fb73a33f1598", GitTreeState:"clean", BuildDate:"2023-06-14T09:47:40Z", GoVersion:"go1.20.5", Compiler:"gc", Platform:"linux/amd64"}

root@iZhp3hqett0mw795req5b2Z:~# docker version
Client:
 Version: 20.10.21
 API version: 1.41
 Go version: go1.18.1
 Git commit: 20.10.21-0ubuntu1~18.04.3
 Built: Thu Apr 27 05:50:21 2023
 OS/Arch: linux/amd64
 Context: default
 Experimental: true

Server:
 Engine:
  Version: 20.10.21
  API version: 1.41 (minimum version 1.12)
  Go version: go1.18.1
  Git commit: 20.10.21-0ubuntu1~18.04.3
  Built: Thu Apr 27 05:36:22 2023
  OS/Arch: linux/amd64
  Experimental: true
 containerd:
  Version: 1.6.12-0ubuntu1~18.04.1
  GitCommit:
 runc:
  Version: 1.1.4-0ubuntu1~18.04.2
  GitCommit:
 docker-init:
  Version: 0.19.0
  GitCommit:

root@iZhp3hqett0mw795req5b2Z:~# containerd --version
containerd github.com/containerd/containerd 1.6.12-0ubuntu1~18.04.1