iT邦幫忙


5. k8s: Problems Encountered and Troubleshooting

DK, 2018-08-22 15:40:58, 7928 views

Problems and Solutions

Problem 1

# tail -f /var/log/messages   
Failed to get system container stats for "/system.slice/kubelet.service": failed to get cgroup stats for "/system.slice/kubelet.service": failed to get container info for "/system.slice/kubelet.service": unknown container "/system.slice/kubelet.service"
Failed to get system container stats for "/system.slice/docker.service": failed to get cgroup stats for "/system.slice/docker.service": failed to get container info for "/system.slice/docker.service": unknown container "/system.slice/docker.service"

If the errors above appear, use the fix below that matches your kubelet version.

kubelet 1.10
vim /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
Find KUBELET_CGROUP_ARGS=--cgroup-driver=systemd and append --runtime-cgroups=/systemd/system.slice --kubelet-cgroups=/systemd/system.slice, so that the line reads:
Environment="KUBELET_CGROUP_ARGS=--cgroup-driver=systemd --runtime-cgroups=/systemd/system.slice --kubelet-cgroups=/systemd/system.slice"
Save and exit, then reload systemd and restart the kubelet:
systemctl daemon-reload
systemctl restart kubelet
This problem is caused by an incompatibility between the Kubernetes and Docker versions that makes the cgroup functionality fail. Source: https://itw01.com/2ZZ5ESH.html
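
The kubelet 1.10 edit can also be scripted instead of done by hand in vim. The following is only a sketch: the function name add_cgroup_flags is made up for illustration, and it assumes the drop-in file already contains KUBELET_CGROUP_ARGS=--cgroup-driver=systemd exactly as shown above. It does nothing if the flags are already present, so it is safe to run more than once.

```shell
#!/bin/sh
# Hypothetical helper: append the two cgroup flags to the KUBELET_CGROUP_ARGS
# line of a kubelet drop-in file. Assumes the line already contains
# "KUBELET_CGROUP_ARGS=--cgroup-driver=systemd".
add_cgroup_flags() {
    _conf="$1"
    _extra='--runtime-cgroups=/systemd/system.slice --kubelet-cgroups=/systemd/system.slice'

    # Already patched: leave the file alone.
    if grep -q -- '--runtime-cgroups=' "$_conf"; then
        return 0
    fi

    # Write the edited copy to a temp file, then copy it back so the
    # original file keeps its owner and permissions.
    _tmp=$(mktemp) || return 1
    sed "s|\(KUBELET_CGROUP_ARGS=--cgroup-driver=systemd\)|\1 $_extra|" "$_conf" > "$_tmp" \
        && cat "$_tmp" > "$_conf"
    _status=$?
    rm -f "$_tmp"
    return "$_status"
}
```

To use it, call add_cgroup_flags /etc/systemd/system/kubelet.service.d/10-kubeadm.conf as root, then run systemctl daemon-reload and systemctl restart kubelet as before.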

kubelet 1.11 and later
# vim /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
Add this line:
Environment="KUBELET_CGROUP_ARGS=--cgroup-driver=systemd --runtime-cgroups=/systemd/system.slice --kubelet-cgroups=/systemd/system.slice"

Find the ExecStart line and append $KUBELET_CGROUP_ARGS to the end of it:
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS $KUBELET_CGROUP_ARGS
Save and exit, then run systemctl daemon-reload and systemctl restart kubelet as in the 1.10 case.
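
After restarting the kubelet it is worth confirming that the flags actually reached the running process, since a missing $KUBELET_CGROUP_ARGS on the ExecStart line silently drops them. A minimal sketch follows; the function name has_cgroup_flags is hypothetical, and it only inspects a command-line string that you pass in.

```shell
#!/bin/sh
# Hypothetical helper: return 0 if the given kubelet command line contains
# both --runtime-cgroups= and --kubelet-cgroups=, non-zero otherwise.
has_cgroup_flags() {
    case "$1" in
        *--runtime-cgroups=*) ;;      # first flag found, keep checking
        *) return 1 ;;
    esac
    case "$1" in
        *--kubelet-cgroups=*) return 0 ;;
        *) return 1 ;;
    esac
}
```

One way to feed it the live command line, assuming the procps ps found on most Linux hosts, is has_cgroup_flags "$(ps -o args= -C kubelet)" && echo "flags present".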

Problem 2

If the following errors appear, apply the fix on every host.

# tail -f /var/log/messages
CPUAccounting not enabled for pid: 23283
MemoryAccounting not enabled for pid: 23283
Solution
# systemctl show docker | grep Accounting  # CPUAccounting and MemoryAccounting both show "no"
# systemctl set-property docker.service MemoryAccounting=yes   # enable memory accounting
# systemctl set-property docker.service CPUAccounting=yes      # enable CPU accounting
# systemctl show docker | grep Accounting  # both should now show "yes"
# grep -Ri accounting /etc/systemd/        # confirm the settings were written to disk
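
Because this has to be repeated on every host, it can help to detect which accounting properties are still off before changing anything. The sketch below is illustrative only: accounting_disabled is a made-up name, and it simply filters the KEY=VALUE text that systemctl show prints.

```shell
#!/bin/sh
# Hypothetical helper: read "systemctl show" output on stdin and print the
# names of CPUAccounting / MemoryAccounting if their value is "no".
accounting_disabled() {
    awk -F= '$1 ~ /^(CPU|Memory)Accounting$/ && $2 == "no" { print $1 }'
}
```

For example, systemctl show docker | accounting_disabled prints nothing once both properties are enabled; each name it does print can be passed to systemctl set-property docker.service NAME=yes.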

Problem 3

kubeadm init fails on the master (kubeadm init --config /etc/kubernetes/config.yaml). Reset the node and clear the etcd data directory:

# kubeadm reset
# rm -rf /var/lib/etcd/*

Then run init again:

# kubeadm init --config /etc/kubernetes/config.yaml
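
The reset, clean, and re-init steps can be wrapped in one function. This is a sketch under a few assumptions: the names run and reinit_master and the DRY_RUN variable are invented here, and find ... -mindepth 1 -delete stands in for rm -rf /var/lib/etcd/* so that nothing depends on shell globbing. DRY_RUN defaults to 1, so by default the commands are only printed, not executed.

```shell
#!/bin/sh
# DRY_RUN=1 (the default in this sketch) prints commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: echo the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical helper: reset kubeadm state, empty the etcd data directory,
# and run kubeadm init again with the given config file.
reinit_master() {
    _config="$1"
    run kubeadm reset &&
    run find /var/lib/etcd -mindepth 1 -delete &&
    run kubeadm init --config "$_config"
}
```

Calling reinit_master /etc/kubernetes/config.yaml prints the three commands for review; exporting DRY_RUN=0 first makes it actually run them, which wipes the local etcd data.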

Problem 4

# kubectl get pods --all-namespaces

coredns is stuck in ContainerCreating, kube-dns is missing, and kube-flannel is in CrashLoopBackOff. Enable IP forwarding:

# echo "net.ipv4.ip_forward = 1" >> /etc/sysctl.conf ; sysctl -p

The host also needs a default gateway configured.
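
Note that echo ... >> /etc/sysctl.conf adds another copy of the line every time it is run. A small idempotent variant is sketched below; ensure_line is a hypothetical helper name, not an existing tool.

```shell
#!/bin/sh
# Hypothetical helper: append LINE to FILE only if an identical line is not
# already there (grep -x matches whole lines, -F treats LINE as a fixed string).
ensure_line() {
    _file="$1"
    _line="$2"
    grep -qxF -- "$_line" "$_file" 2>/dev/null || printf '%s\n' "$_line" >> "$_file"
}
```

Usage would be ensure_line /etc/sysctl.conf "net.ipv4.ip_forward = 1" followed by sysctl -p. To check the gateway requirement, ip route show default should print a default route on each host.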

Problem 5

Installing metrics-server so that the HPA can monitor CPU usage
https://github.com/kubernetes-incubator/metrics-server

# git clone https://github.com/kubernetes-incubator/metrics-server

After cloning, the manifests are in the metrics-server directory:

# kubectl create -f metrics-server/deploy/1.8+/

The HPA can then see the CPU usage of the deployment.

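If the CPU usage percentage still shows as unknown after a few minutes, continue with the steps below.
The cause is in the metrics-server YAML.
metrics-server fails to come up because it cannot reach port 10255. This problem only occurs on k8s 1.11.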

# kubectl -n kube-system edit deploy metrics-server
Find this section:
.
.
.
containers:
- name: metrics-server
  image: gcr.io/google_containers/metrics-server-amd64:v0.2.1
  imagePullPolicy: Always
  volumeMounts:
  - name: tmp-dir
    mountPath: /tmp
  command:
  - /metrics-server
  - --source=kubernetes.summary_api:''
.
.
.

Find the line - --source=kubernetes. If it is missing, add the command:, - /metrics-server, and - --source=kubernetes.summary_api: lines.
After --source=kubernetes.summary_api: append https://kubernetes.default?kubeletHttps=true&kubeletPort=10250&insecure=true so that the section reads:

.
.
.
containers:
- name: metrics-server
  image: gcr.io/google_containers/metrics-server-amd64:v0.2.1
  imagePullPolicy: Always
  volumeMounts:
  - name: tmp-dir
    mountPath: /tmp
  command:
  - /metrics-server
  - --source=kubernetes.summary_api:https://kubernetes.default?kubeletHttps=true&kubeletPort=10250&insecure=true
.
.
.

Save and exit.
After about 3 minutes, kubectl get hpa shows the CPU usage.
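
Rather than waiting a fixed three minutes, the check can be polled. The sketch below assumes that kubectl get hpa shows the literal text <unknown> in its TARGETS column while metrics are unavailable; the function names hpa_metrics_ready and wait_for_hpa, the 10-second interval, and the default of 18 attempts are all illustrative choices.

```shell
#!/bin/sh
# Hypothetical helper: read "kubectl get hpa" output on stdin and succeed
# only if no line contains "<unknown>".
hpa_metrics_ready() {
    ! grep -q '<unknown>'
}

# Hypothetical helper: poll every 10 seconds, up to $1 attempts (default 18,
# roughly 3 minutes), until the HPA reports real CPU figures.
wait_for_hpa() {
    _tries="${1:-18}"
    while [ "$_tries" -gt 0 ]; do
        # Require non-empty output so a failed kubectl call is not mistaken for "ready".
        if _out=$(kubectl get hpa 2>/dev/null) && [ -n "$_out" ] \
            && printf '%s\n' "$_out" | hpa_metrics_ready; then
            return 0
        fi
        _tries=$((_tries - 1))
        sleep 10
    done
    return 1
}
```

Running wait_for_hpa && kubectl get hpa then prints the table as soon as the CPU percentage is available, or gives up after the attempts run out.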

These are the only problems I have run into so far.

END

