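12th iThome Ironman Contest

DAY 3

DevOps

Learning k8s Through Practice Questions series, part 3

[Learning k8s Through Practice Questions] - [Day 3] Building a K8s Cluster Environment with kubeadm

12th Ironman Contest devops cicd kubernetes k8s

店小二

Team: You Reverse 你逆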

2020-09-09 00:43:14



tags: `DevOps` `CICD` `K8s` `Docker`

Introduction

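Yesterday we mentioned that K8s uses a cluster architecture, and that every cluster consists of a Control Plane and Nodes. The two can be installed on separate physical or virtual machines, so today we will walk through building a K8s cluster. The demonstration uses Ubuntu 18.04 virtual machines running on VMware. Note that the VMs should meet the following minimum specs:

Control Plane
- CPU: 2 cores
- RAM: 4 GB
Node
- CPU: 1 core
- RAM: 4 GB

If the Control Plane VM has only 1 CPU core, the Control Plane will fail to initialize during installation later on.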
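Before going further, it can help to confirm that a VM actually meets these numbers. The following is a minimal sketch, assuming a Linux guest where `nproc` and `/proc/meminfo` are available. `check_resources` is a hypothetical helper name and is not part of kubeadm.

```shell
# Hypothetical helper: compare this machine against minimum CPU and RAM values.
# Usage: check_resources <min_cpus> <min_mem_mb>
check_resources() {
  local min_cpus="$1" min_mem_mb="$2"
  local cpus mem_mb
  cpus="$(nproc)"
  # MemTotal in /proc/meminfo is reported in kB; convert to MB
  mem_mb="$(awk '/^MemTotal:/ {print int($2 / 1024)}' /proc/meminfo)"
  echo "CPUs: ${cpus} (need >= ${min_cpus}), RAM: ${mem_mb} MB (need >= ${min_mem_mb} MB)"
  # The function succeeds only when both minimums are met
  [ "$cpus" -ge "$min_cpus" ] && [ "$mem_mb" -ge "$min_mem_mb" ]
}
```

On the Control Plane VM this might be called as `check_resources 2 3800`. The memory threshold is set a little under 4096 MB because a VM configured with 4 GB usually reports slightly less than that as MemTotal.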

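Next comes the installation itself. This guide uses kubeadm, which is the simplest and most intuitive approach and works well for typical users, whether for practice or for real deployments. Some packages must be installed on both the Control Plane and the Nodes, while others are only needed on one or the other. Let's get started.

Most K8s tutorials recommend that beginners practice with minikube, because it runs a single-node cluster and keeps the environment relatively simple. In my view, though, minikube still differs from a regular K8s cluster in many ways, and relying on it alone means missing a lot of essential K8s knowledge. The exam environment is usually multi-node as well, so getting used to it early can only help.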

Kubernetes Installation Guide - kubeadm

Run on both the Control Plane and the Nodes

Step 1: Update the repositories

$ sudo su
$ apt-get update

Step 2: Disable swap space

$ swapoff -a
$ vim /etc/fstab

## Comment out the /swapfile line (prefix it with '#')
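The manual fstab edit above can also be scripted. This is a minimal sketch, assuming GNU sed (for `-i`) and that swap entries carry `swap` as a whitespace-delimited field. `comment_out_swap` is a hypothetical helper name.

```shell
# Hypothetical helper: comment out every active swap entry in an fstab-style file.
# Usage: comment_out_swap <path-to-fstab>
comment_out_swap() {
  local fstab="$1"
  # Keep a backup next to the original before editing in place
  cp "$fstab" "${fstab}.bak"
  # Prefix '#' to lines that are not already comments and contain a "swap" field
  sed -i -E '/^[^#].*[[:space:]]swap[[:space:]]/ s/^/#/' "$fstab"
}
```

As root, this would be run as `comment_out_swap /etc/fstab` after `swapoff -a`, so that swap stays off after a reboot.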

Step 3 (optional): Change the hostname

$ vim /etc/hostname

g8master

## Delete the existing name and enter any new name
## Here the Control Plane is renamed to g8master and the Node to g8node

Then reboot the VM for the new hostname to take effect.

Step 4: Configure DNS - edit the hosts file

$ vim /etc/hosts

## Add one entry per node in this format

<IP-Address-of-node><tab><hostname-of-node>

For example:

192.168.132.241	g8master
192.168.132.173	g8node1
192.168.132.178	g8node2
...

Every node in the same cluster needs the same changes to its hosts file.
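Because the same entries have to be added on every node, a small idempotent helper can save some typing. This is a sketch only; `add_host_entry` is a hypothetical name, and the addresses in the usage note are the example values from this step.

```shell
# Hypothetical helper: append "<ip><tab><hostname>" to a hosts-style file,
# unless an uncommented line already maps some address to that hostname.
# Usage: add_host_entry <file> <ip> <hostname>
add_host_entry() {
  local file="$1" ip="$2" name="$3"
  if grep -qE "^[^#]*[[:space:]]${name}([[:space:]]|\$)" "$file"; then
    echo "${name} already present in ${file}"
  else
    printf '%s\t%s\n' "$ip" "$name" >> "$file"
  fi
}
```

As root on each node, this could be called as `add_host_entry /etc/hosts 192.168.132.241 g8master`, then repeated for g8node1 and g8node2. Running it a second time leaves the file unchanged.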

Step 5: Install ssh

$ apt-get install openssh-server

Step 6: Install Docker

$ apt-get update
$ apt-get install -y docker.io

Step 7: Set up the Kubernetes environment

$ apt-get update && apt-get install -y apt-transport-https curl
$ curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
$ cat <<EOF >/etc/apt/sources.list.d/kubernetes.list
deb http://apt.kubernetes.io/ kubernetes-xenial main
EOF
$ apt-get update

Step 8: Install the Kubernetes tools

$ apt-get install -y kubelet kubeadm kubectl

Step 9: Update the Kubernetes configuration

$ vim /etc/systemd/system/kubelet.service.d/10-kubeadm.conf

...
## Add the following line
Environment="cgroup-driver=systemd/cgroup-driver=cgroupfs"

Run on the Control Plane only

Initialize the Control Plane

$ sudo kubeadm init --pod-network-cidr=<ip-of-container-network-interface> --apiserver-advertise-address=<ip-address-of-master>
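The two placeholders in this command can be filled in by a small helper. The sketch below assumes the CNI-to-CIDR mapping given in this section (Calico uses 192.168.0.0/16, Flannel uses 10.244.0.0/16). `pod_cidr_for` and `build_init_cmd` are hypothetical names, and the helper only prints the command so it can be reviewed before it is run.

```shell
# Hypothetical helper: map a CNI name to the pod network CIDR used in this guide.
pod_cidr_for() {
  case "$1" in
    calico)  echo "192.168.0.0/16" ;;
    flannel) echo "10.244.0.0/16" ;;
    *) echo "unknown CNI: $1" >&2; return 1 ;;
  esac
}

# Hypothetical helper: print (not run) the kubeadm init command.
# Usage: build_init_cmd <calico|flannel> <control-plane-ip>
build_init_cmd() {
  local cni="$1" advertise_ip="$2" cidr
  cidr="$(pod_cidr_for "$cni")" || return 1
  echo "kubeadm init --pod-network-cidr=${cidr} --apiserver-advertise-address=${advertise_ip}"
}
```

For example, `build_init_cmd flannel 192.168.132.241` prints the full command for a Flannel-based cluster whose Control Plane is at that address.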

To use the Calico CNI (Container Network Interface), set ip-of-container-network-interface to 192.168.0.0/16; to use the Flannel CNI, set it to 10.244.0.0/16. After initialization you should see output like the following:

W0225 08:10:50.654131   11265 validation.go:28] Cannot validate kube-proxy config - no validator is available
W0225 08:10:50.654560   11265 validation.go:28] Cannot validate kubelet config - no validator is available
[init] Using Kubernetes version: v1.17.3
[preflight] Running pre-flight checks
	[WARNING Service-Docker]: docker service is not enabled, please run 'systemctl enable docker.service'
	[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [kmaster kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.67.128]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [kmaster localhost] and IPs [192.168.67.128 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [kmaster localhost] and IPs [192.168.67.128 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
W0225 08:11:46.664565   11265 manifests.go:214] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
[control-plane] Creating static Pod manifest for "kube-scheduler"
W0225 08:11:46.665274   11265 manifests.go:214] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 16.506641 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.17" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node kmaster as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node kmaster as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: 9gg19y.0i05a6asikxkdv41
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.67.128:6443 --token 9gg19y.0i05a6asikxkdv41 \
    --discovery-token-ca-cert-hash sha256:df730b9dc365367eca126dba28d64268e200fd0d81ebd0563668ede2cdd4eb7d

Next, run the three commands from the output in order:

$ mkdir -p $HOME/.kube
$ sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
$ sudo chown $(id -u):$(id -g) $HOME/.kube/config

Also save the last two lines of the output. When g8node later needs to join the cluster, it only has to run this command:

kubeadm join 192.168.67.128:6443 --token 9gg19y.0i05a6asikxkdv41 \
    --discovery-token-ca-cert-hash sha256:df730b9dc365367eca126dba28d64268e200fd0d81ebd0563668ede2cdd4eb7d

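Run on the Nodes only

Normally, running the generated kubeadm join… --discovery-token-ca-cert-hash command on a Node is all it takes to join the K8s cluster.

However, if a new Node tries to join after some time has passed, the join will fail even with the correct command and hang at the following output: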

$ sudo kubeadm join 192.168.67.128:6443 --token 9gg19y.0i05a6asikxkdv41 \
     --discovery-token-ca-cert-hash sha256:df730b9dc365367eca126dba28d64268e200fd0d81ebd0563668ede2cdd4eb7d 
[sudo] password for george: 
W0227 08:13:19.054898   14242 join.go:346] [preflight] WARNING: JoinControlPane.controlPlane settings will be ignored when control-plane flag is not set.
[preflight] Running pre-flight checks
	[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/

This happens because the token has expired (tokens are valid for 24 hours by default). In that case, generate a new token on the Control Plane:

$ kubeadm token create

Then list the tokens to see the new one:

$ kubeadm token list
TOKEN                     TTL         EXPIRES                     USAGES                   DESCRIPTION                                                EXTRA GROUPS
chhuh4.hcrk93gse24eggkz   23h         2020-02-28T08:16:49-08:00   authentication,signing   <none>                                                     system:bootstrappers:kubeadm:default-node-token

Compute the SHA-256 hash of the CA certificate's public key:

$ openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'

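Swap the old token for the new one, use the hash just computed (it matches the original value as long as the CA certificate has not changed), and run join again. This time the Node joins successfully: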

$ sudo kubeadm join 192.168.67.128:6443 --token chhuh4.hcrk93gse24eggkz     --discovery-token-ca-cert-hash sha256:df730b9dc365367eca126dba28d64268e200fd0d81ebd0563668ede2cdd4eb7d 
W0227 08:17:39.125117   15247 join.go:346] [preflight] WARNING: JoinControlPane.controlPlane settings will be ignored when control-plane flag is not set.
[preflight] Running pre-flight checks
	[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[kubelet-start] Downloading configuration for the kubelet from the "kubelet-config-1.17" ConfigMap in the kube-system namespace
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.
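The token and hash steps above can be folded into a pair of helpers. This is a sketch under stated assumptions: `ca_cert_hash` wraps the same openssl pipeline shown earlier and defaults to the CA path used in this guide, while `build_join_cmd` is a hypothetical helper that only assembles and prints the command.

```shell
# Hash of the CA public key, using the same openssl pipeline as above.
# Usage: ca_cert_hash [path-to-ca.crt]
ca_cert_hash() {
  local ca_crt="${1:-/etc/kubernetes/pki/ca.crt}"
  openssl x509 -pubkey -in "$ca_crt" \
    | openssl rsa -pubin -outform der 2>/dev/null \
    | openssl dgst -sha256 -hex \
    | sed 's/^.* //'
}

# Hypothetical helper: print (not run) a join command from its three parts.
# Usage: build_join_cmd <endpoint> <token> <hash>
build_join_cmd() {
  local endpoint="$1" token="$2" hash="$3"
  echo "kubeadm join ${endpoint} --token ${token} --discovery-token-ca-cert-hash sha256:${hash}"
}
```

On the Control Plane, `build_join_cmd 192.168.67.128:6443 "$(kubeadm token create)" "$(ca_cert_hash)"` would print a fresh join command. kubeadm can also do this in a single step with `kubeadm token create --print-join-command`.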

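Run kubectl get node to verify. You should see two Nodes and one Control Plane, which means the cluster is up. Congratulations!

NAME       STATUS     ROLES    AGE   VERSION
g8master   NotReady   master   2d    v1.18.8
g8node1    NotReady   <none>   2d    v1.18.5
g8node2    NotReady   <none>   2d    v1.18.5

At this point the K8s cluster has been created. But notice that every node's STATUS is NotReady, which suggests there is still some work left to do.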
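On a larger cluster it is easier to filter the node list than to read it by eye. A minimal sketch, assuming the default column layout of kubectl get nodes shown above; `not_ready_nodes` is a hypothetical helper name.

```shell
# Hypothetical helper: read `kubectl get nodes` output on stdin and print the
# names of nodes whose STATUS column is not exactly "Ready".
# Note: a combined status such as "Ready,SchedulingDisabled" is also listed.
not_ready_nodes() {
  # NR > 1 skips the header row; $2 is the STATUS column, $1 the node name
  awk 'NR > 1 && $2 != "Ready" { print $1 }'
}
```

It would be used as `kubectl get nodes | not_ready_nodes`. An empty result means every node reports Ready.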

Network Setup

We do not yet know why the nodes are NotReady. Node problems usually trace back to the kubelet, so start by checking the kubelet log:

$ journalctl -f -u kubelet

The following messages repeat over and over:

$ journalctl -f -u kubelet
-- Logs begin at Tue 2020-02-25 07:24:12 PST. --
Feb 27 08:33:27 kmaster kubelet[999]: W0227 08:33:27.039411     999 cni.go:237] Unable to update cni config: no networks found in /etc/cni/net.d
Feb 27 08:33:27 kmaster kubelet[999]: E0227 08:33:27.423510     999 kubelet.go:2183] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
Feb 27 08:33:28 kmaster kubelet[999]: E0227 08:33:28.644022     999 summary_sys_containers.go:47] Failed to get system container stats for "/system.slice/docker.service": failed to get cgroup stats for "/system.slice/docker.service": failed to get container info for "/system.slice/docker.service": unknown container "/system.slice/docker.service"
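The first warning in this log says that no networks were found in /etc/cni/net.d. That condition can be checked directly with a small helper. This is a sketch only; `cni_config_present` is a hypothetical name, and the default path is the one taken from the log line.

```shell
# Hypothetical helper: succeed only if the CNI config directory exists
# and contains at least one entry.
# Usage: cni_config_present [dir]
cni_config_present() {
  local dir="${1:-/etc/cni/net.d}"
  [ -d "$dir" ] && [ -n "$(ls -A "$dir" 2>/dev/null)" ]
}
```

For example, `cni_config_present || echo "no CNI config found"` gives a quick yes or no on a node without scrolling through the kubelet log.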

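This happens because the kubelet is started with the network-plugin=cni flag, but no CNI has been installed. There are two ways to fix it:

1. Remove the network-plugin=cni flag from the config file (not recommended)

The config file is /var/lib/kubelet/kubeadm-flags.env (applies to k8s v1.11 and later)

$ vim /var/lib/kubelet/kubeadm-flags.env

Delete the --network-plugin=cni flag. After the edit the file should look like this: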

KUBELET_KUBEADM_ARGS="--cgroup-driver=cgroupfs --pod-infra-container-image=k8s.gcr.io/pause:3.1 --resolv-conf=/run/systemd/resolve/resolv.conf"
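The same edit can be made without opening an editor. A minimal sketch, assuming GNU sed and the file layout shown above; `strip_network_plugin_flag` is a hypothetical helper name.

```shell
# Hypothetical helper: remove the --network-plugin=cni flag from a kubelet
# flags file, keeping a backup of the original next to it.
# Usage: strip_network_plugin_flag <path-to-kubeadm-flags.env>
strip_network_plugin_flag() {
  local env_file="$1"
  cp "$env_file" "${env_file}.bak"
  # Also drop the single space before the flag so no double space is left behind
  sed -i -E 's/ ?--network-plugin=cni//' "$env_file"
}
```

As root, this would be run as `strip_network_plugin_flag /var/lib/kubelet/kubeadm-flags.env`. The kubelet still has to be restarted afterwards.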

Then reload the configuration and restart the kubelet:

$ systemctl daemon-reload
$ systemctl restart kubelet

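Check again with kubectl get no:

$ kubectl get no
NAME       STATUS     ROLES    AGE   VERSION
g8master   Ready      master   2d    v1.18.8
g8node     NotReady   <none>   20m   v1.18.5

g8master is now Ready. g8node needs the same change before it becomes Ready.

This approach is not recommended: without a CNI in the cluster, communication within the cluster and network configuration will run into problems. The second approach is the better choice.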

2. Install a CNI (Network Plugin) (recommended)

This is the recommended approach: install a network plugin. Weave Net is used here as the example.

$ kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')"
serviceaccount/weave-net created
clusterrole.rbac.authorization.k8s.io/weave-net created
clusterrolebinding.rbac.authorization.k8s.io/weave-net created
role.rbac.authorization.k8s.io/weave-net created
rolebinding.rbac.authorization.k8s.io/weave-net created
daemonset.apps/weave-net created

Once all of these objects have been created, check the Node status:

$ kubectl get no
NAME       STATUS   ROLES    AGE   VERSION
g8master   Ready    master   2d    v1.18.8
g8node1    Ready    <none>   27m   v1.18.5

Every node should now be Ready.

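Additional Settings

Notice that the ROLES column shows <none> for g8node but master for g8master. That is because only the master node is labeled by default, and the other nodes are left without a role label. A role can be set manually on any node: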

$ kubectl label node g8node1 node-role.kubernetes.io/worker=worker
node/g8node1 labeled
$ kubectl get no
NAME       STATUS   ROLES    AGE   VERSION
g8master   Ready    master   2d    v1.18.8
g8node1    Ready    worker   27m   v1.18.5

That's it. All installation steps are complete, so go ahead and start practicing with Kubernetes.

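Conclusion

Today covered building a K8s cluster with the kubeadm tool, along with the settings specific to the Control Plane and the Nodes. The advantage of kubeadm is that there is no need to install a pile of components by hand, since kubeadm packages them all up, which is very convenient. Now that we have a K8s cluster, tomorrow we will start looking at how K8s actually manages containers. That's all for today. Thanks, everyone!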