While upgrading K8s from 1.23 to 1.24, one of our deploys failed with the following logs:
module.eks.aws_eks_node_group.node[0]: Still modifying... [id=xxx-prod:xxx-prod-node-1, 36m50s elapsed]
Error: error waiting for EKS Node Group (xxx-prod:xxx-prod-node-1) version update: unexpected state 'Failed', wanted target 'Successful'. last error: 1 error occurred:
* ip-xxxx.xxxx.compute.internal: PodEvictionFailure: Reached max retries while trying to evict pods from nodes in node group xxx-prod-node-1
From the logs we can see that the failing step happened on one node while it was evicting Pods. Digging into the root cause: upgrading EKS rolls the node group out gradually, and one pod kept being asked to shut down but never terminated, so the old node could not be rotated out.
$ kubectl get node -A
NAME   STATUS                        ROLES    AGE    VERSION
ip-1   Ready                         <none>   111m   v1.23.17-eks-a59e1f0
ip-2   Ready                         <none>   118m   v1.23.17-eks-a59e1f0
ip-3   Ready                         <none>   114m   v1.23.17-eks-a59e1f0
ip-4   Ready                         <none>   32m    v1.24.11-eks-a59e1f0
ip-5   Ready                         <none>   121m   v1.23.17-eks-a59e1f0
ip-6   Ready                         <none>   121m   v1.23.17-eks-a59e1f0
ip-7   NotReady,SchedulingDisabled   <none>   32m    v1.24.11-eks-a59e1f0
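To see which pods were still stuck on the node being replaced, you can list the pods scheduled onto it. A minimal sketch, reusing the shortened node name from the listing above (in practice, use the node's full internal DNS name):
$ kubectl get pods -A -o wide --field-selector spec.nodeName=ip-7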
After the operation eventually timed out, the newly launched 1.24 nodes were terminated again, because the stuck pod kept the old nodes from shutting down:
$ kubectl get nodes -A
NAME   STATUS   ROLES    AGE    VERSION
ip-1   Ready    <none>   122m   v1.23.17-eks-a59e1f0
ip-2   Ready    <none>   130m   v1.23.17-eks-a59e1f0
ip-3   Ready    <none>   126m   v1.23.17-eks-a59e1f0
ip-5   Ready    <none>   132m   v1.23.17-eks-a59e1f0
ip-6   Ready    <none>   132m   v1.23.17-eks-a59e1f0
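On the AWS side, the failed rollout also shows up in the node group's update history. A quick way to confirm the failure and its error details (the cluster and node group names below follow the redacted names in the logs; substitute your own):
$ aws eks list-updates --name xxx-prod --nodegroup-name xxx-prod-node-1
$ aws eks describe-update --name xxx-prod --nodegroup-name xxx-prod-node-1 --update-id <update-id>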
We then tried draining the node ourselves. kubectl drain is the command Kubernetes provides for node maintenance and upgrades: when a node needs to be upgraded or serviced, kubectl drain "safely" evicts all the workloads on that node, making sure the running Pods are rescheduled onto other available nodes and keep serving traffic. The general workflow is sketched below, followed by the drain we actually ran.
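A generic sketch of the workflow (not our exact commands): drain cordons the node first and then evicts its pods; after maintenance, uncordon makes the node schedulable again.
$ kubectl cordon <node-name>      # mark the node unschedulable (drain also does this for you)
$ kubectl drain <node-name> --ignore-daemonsets
$ kubectl uncordon <node-name>    # once maintenance is done, allow scheduling again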
$ kubectl drain ip-10-13-4-151.ap-northeast-1.compute.internal --ignore-daemonsets --delete-local-data
node/ip-10-13-4-151.ap-northeast-1.compute.internal already cordoned
WARNING: ignoring DaemonSet-managed Pods: kube-system/aws-cloudwatch-logs-aws-for-fluent-bit-88ccw, kube-system/aws-cloudwatch-metrics-qw5fx, kube-system/aws-node-57vpv, kube-system/ebs-csi-node-chvvs, kube-system/kube-proxy-zfrrh
evicting pod kube-system/coredns-xxx-5l6lf
.
.
.
evicting pod xxx-prod/xxx-xxx-api-service-deployment
error when evicting pod "cluster-autoscaler-aws-cluster-autoscaler-7c86765cf4-bmbnq" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
pod/coredns-59847d77c8-5l6lf evicted
.
.
pod/aws-load-balancer-controller-c9cd98dd6-sbzng evicted
evicting pod kube-system/cluster-autoscaler-aws-cluster-autoscaler-7c86765cf4-bmbnq
error when evicting pod "cluster-autoscaler-aws-cluster-autoscaler-7c86765cf4-bmbnq" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
evicting pod kube-system/cluster-autoscaler-aws-cluster-autoscaler-7c86765cf4-bmbnq
error when evicting pod "cluster-autoscaler-aws-cluster-autoscaler-7c86765cf4-bmbnq" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
evicting pod kube-system/cluster-autoscaler-aws-cluster-autoscaler-7c86765cf4-bmbnq
error when evicting pod "cluster-autoscaler-aws-cluster-autoscaler-7c86765cf4-bmbnq" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
Running drain made the culprit obvious: the stuck pod was the cluster autoscaler. We'll leave the fix to the next section.
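The error message points at a PodDisruptionBudget: the eviction is refused because removing that pod would drop below the budget's minimum availability. To confirm which PDB is blocking the eviction, you can inspect the budgets in the namespace (a diagnostic sketch; the exact PDB name depends on how the cluster autoscaler was installed):
$ kubectl get pdb -n kube-system
$ kubectl describe pdb <cluster-autoscaler-pdb-name> -n kube-system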