iT邦幫忙

2022 iThome 鐵人賽

DAY 13
DevOps

那些文件沒告訴你的 AWS EKS series, article 13

[13] Why self-managed nodes launched with a CloudFormation template or `eksctl` can automatically join an EKS cluster

Back on Day 3 and Day 4, we discussed how the Amazon EKS AMI Build Specification scripts automate the generation of the kubelet configuration so that a node can join the cluster smoothly.

This article therefore examines how the official CloudFormation template and the eksctl command use this bootstrap.sh to bring up worker nodes.

CloudFormation template

If we create self-managed nodes with CloudFormation, we can download the official CloudFormation template with the following curl command:

$ curl -o amazon-eks-nodegroup.yaml https://s3.us-west-2.amazonaws.com/amazon-eks/cloudformation/2020-10-29/amazon-eks-nodegroup.yaml

Here is an excerpt of the template:

  NodeLaunchTemplate:
    Type: "AWS::EC2::LaunchTemplate"
    Properties:
      LaunchTemplateData:
        BlockDeviceMappings:
          - DeviceName: /dev/xvda
            Ebs:
              DeleteOnTermination: true
              VolumeSize: !Ref NodeVolumeSize
              VolumeType: gp2
        IamInstanceProfile:
          Arn: !GetAtt NodeInstanceProfile.Arn
        ImageId: !If
          - HasNodeImageId
          - !Ref NodeImageId
          - !Ref NodeImageIdSSMParam
        InstanceType: !Ref NodeInstanceType
        KeyName: !Ref KeyName
        SecurityGroupIds:
        - !Ref NodeSecurityGroup
        UserData: !Base64
          "Fn::Sub": |
            #!/bin/bash
            set -o xtrace
            /etc/eks/bootstrap.sh ${ClusterName} ${BootstrapArguments}
            /opt/aws/bin/cfn-signal --exit-code $? \
                     --stack  ${AWS::StackName} \
                     --resource NodeGroup  \
                     --region ${AWS::Region}
        MetadataOptions:
          HttpPutResponseHopLimit : 2
          HttpEndpoint: enabled
          HttpTokens: !If
            - IMDSv1Disabled
            - required
            - optional

  NodeGroup:
    Type: "AWS::AutoScaling::AutoScalingGroup"
    Properties:
      DesiredCapacity: !Ref NodeAutoScalingGroupDesiredCapacity
      LaunchTemplate:
        LaunchTemplateId: !Ref NodeLaunchTemplate
        Version: !GetAtt NodeLaunchTemplate.LatestVersionNumber
      MaxSize: !Ref NodeAutoScalingGroupMaxSize
      MinSize: !Ref NodeAutoScalingGroupMinSize
      Tags:
        - Key: Name
          PropagateAtLaunch: true
          Value: !Sub ${ClusterName}-${NodeGroupName}-Node
        - Key: !Sub kubernetes.io/cluster/${ClusterName}
          PropagateAtLaunch: true
          Value: owned
      VPCZoneIdentifier: !Ref Subnets
    UpdatePolicy:
      AutoScalingRollingUpdate:
        MaxBatchSize: 1
        MinInstancesInService: !Ref NodeAutoScalingGroupDesiredCapacity
        PauseTime: PT5M

Notice that the resource with CloudFormation Logical ID [5] NodeGroup is of type AWS::AutoScaling::AutoScalingGroup [6], not AWS::EKS::Nodegroup [7]; the latter is what creates a managed node group.

Next, NodeGroup references NodeLaunchTemplate, whose EC2 user data [8] runs bootstrap.sh and then calls cfn-signal [9] with --exit-code to report whether bootstrap.sh succeeded.
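The exit-code propagation in that user data can be exercised locally. Below is a minimal sketch in which both bootstrap.sh and cfn-signal are replaced by stub functions, since the real binaries only exist on instances launched from the EKS AMI:

```shell
#!/bin/bash
# Sketch of the UserData pattern: run a bootstrap step, capture its exit
# code with $?, then forward that code to a signaling command. On a real
# node, cfn-signal reports the code back to the CloudFormation stack so
# the rolling update knows whether the instance came up healthy.
set -o xtrace

cfn_signal_stub() {
  # Stand-in for /opt/aws/bin/cfn-signal --exit-code "$1" ...
  echo "signal exit-code=$1"
}

fake_bootstrap() {
  # Stand-in for /etc/eks/bootstrap.sh; succeeds in this sketch.
  return 0
}

fake_bootstrap
cfn_signal_stub $?
```

A non-zero exit code from bootstrap.sh would be forwarded the same way, causing CloudFormation to mark the resource as failed.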

eksctl

After eksctl creates a self-managed node group, we can see that a corresponding CloudFormation stack has been created. Inspecting it with the AWS CLI command aws cloudformation describe-stack-resources shows that, much like the official template, eksctl also provisions an Auto Scaling group and a launch template:

$ aws cloudformation describe-stack-resources \
    --stack-name eksctl-ironman-2022-nodegroup-demo-self-ng
...
...
...
        {
            "StackName": "eksctl-ironman-2022-nodegroup-demo-self-ng",
            "StackId": "arn:aws:cloudformation:eu-west-1:111111111111:stack/eksctl-ironman-2022-nodegroup-demo-self-ng/e2083840-3cda-11ed-882d-026b9e583afb",
            "LogicalResourceId": "NodeGroup",
            "PhysicalResourceId": "eksctl-ironman-2022-nodegroup-demo-self-ng-NodeGroup-1Q19K6MP0HTFJ",
            "ResourceType": "AWS::AutoScaling::AutoScalingGroup",
            "Timestamp": "2022-09-25T14:07:16.122000+00:00",
            "ResourceStatus": "CREATE_COMPLETE",
            "DriftInformation": {
                "StackResourceDriftStatus": "NOT_CHECKED"
            }
        },
        {
            "StackName": "eksctl-ironman-2022-nodegroup-demo-self-ng",
            "StackId": "arn:aws:cloudformation:eu-west-1:111111111111:stack/eksctl-ironman-2022-nodegroup-demo-self-ng/e2083840-3cda-11ed-882d-026b9e583afb",
            "LogicalResourceId": "NodeGroupLaunchTemplate",
            "PhysicalResourceId": "lt-011b90bb7c3e105d3",
            "ResourceType": "AWS::EC2::LaunchTemplate",
            "Timestamp": "2022-09-25T14:06:37.330000+00:00",
            "ResourceStatus": "CREATE_COMPLETE",
            "DriftInformation": {
                "StackResourceDriftStatus": "NOT_CHECKED"
            }
        },
...
...

Next, we can use aws cloudformation get-template to inspect the original userdata definition; the output looks base64-encoded:

$ aws cloudformation get-template --stack-name eksctl-ironman-2022-nodegroup-demo-self-ng --query TemplateBody.Resources.NodeGroupLaunchTemplate.Properties.LaunchTemplateData.UserData
"H4sIAAAAAAAA/7xYbXPiOrL+nl+hy0ydnLl3DLaBnCFV3FpebMKLTSxbMnh2KiUsBb/KHlsEwmz++5YhZMJMztSera39QmLpedqtVquflt/5Sbahkp/x+3B9kRM/JmtWXgO+SZKLYsP9lF5fSEACjQdSNJJw1TgQGqVfhLkoGywufZE0VlkmSlGQvE4StV4Gf4kSsCRnRcXaFqFgd/dh...
... SKIP CONTENT ...
...
pIBWeClafi1HhDti7fcOjMzJ9+cDgyz4Sme9kQaX72mlfQn78lvCXF19LzF4anGngHKLsnm+S4HyRRQbQpBQg58EnJPgKeiUqcaTWyWW242PwndOV7e/W2tPwzAAD//+PPRyJOEgAA"

We can therefore base64-decode this output, and file reveals that the decoded data is gzip-compressed:

$ aws cloudformation get-template --stack-name eksctl-ironman-2022-nodegroup-demo-self-ng --query TemplateBody.Resources.NodeGroupLaunchTemplate.Properties.LaunchTemplateData.UserData --output text | base64 -d > ec2-userdata

$ file ec2-userdata
ec2-userdata: gzip compressed data
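This packing can be reproduced locally to see how the userdata round-trips: gzip first, then base64, and the reverse on the way back. A minimal sketch, assuming GNU `base64`/`gzip` and using a stand-in cloud-config snippet:

```shell
#!/bin/bash
# Round-trip a sample cloud-config the way the stack stores UserData:
# gzip-compress then base64-encode, and reverse both steps to recover it.
set -o errexit

sample='#cloud-config
runcmd:
- - /var/lib/cloud/scripts/eksctl/bootstrap.al2.sh'

# Encode: gzip then base64 (the form get-template returns).
encoded="$(printf '%s' "$sample" | gzip -c | base64)"

# Decode: base64 -d then gunzip (what we did above with file/gunzip).
decoded="$(printf '%s' "$encoded" | base64 -d | gunzip -c)"

printf '%s\n' "$decoded"
```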

After decompressing the file with gunzip, we can confirm it is in cloud-init format and that it likewise invokes bootstrap.sh:

$ mv ec2-userdata ec2-userdata.gz
$ gunzip ./ec2-userdata.gz

$ cat ./ec2-userdata
#cloud-config
packages: null
runcmd:
- - /var/lib/cloud/scripts/eksctl/bootstrap.al2.sh
- - /var/lib/cloud/scripts/eksctl/bootstrap.helper.sh
write_files:
- content: '{}'
  owner: root:root
  path: /etc/eksctl/kubelet-extra.json
  permissions: "0644"
- content: |-
    NODE_TAINTS=
    CLUSTER_DNS=10.100.0.10
    CONTAINER_RUNTIME=dockerd
    CLUSTER_NAME=ironman-2022
    API_SERVER_URL=https://XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX.gr7.eu-west-1.eks.amazonaws.com
    B64_CLUSTER_CA=LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUMvakNDQWVhZ0F3SUJBZ0lCQURBTkJna3Foa2lHOXcwQkFR...
... ...
V1lZRExMUDdLCm0xVUJQUWdzTzRQQlREUjlaLzhpbnZDV0FiT0szM2Z6OVZqU3dBbjlhQ0lXbU5FY2dVMkFUWm1FN0N4WEUrbFkKOFhjPQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCg==
    NODE_LABELS=alpha.eksctl.io/cluster-name=ironman-2022,alpha.eksctl.io/nodegroup-name=demo-self-ng
  owner: root:root
  path: /etc/eksctl/kubelet.env
  permissions: "0644"
- content: |
    #!/bin/bash

    set -o errexit
    set -o pipefail
    set -o nounset

    source /var/lib/cloud/scripts/eksctl/bootstrap.helper.sh

    echo "eksctl: running /etc/eks/bootstrap"
    /etc/eks/bootstrap.sh "${CLUSTER_NAME}" \
      --apiserver-endpoint "${API_SERVER_URL}" \
      --b64-cluster-ca "${B64_CLUSTER_CA}" \
      --dns-cluster-ip "${CLUSTER_DNS}" \
      --kubelet-extra-args "${KUBELET_EXTRA_ARGS}" \
      --container-runtime "${CONTAINER_RUNTIME}"

    echo "eksctl: merging user options into kubelet-config.json"
    trap 'rm -f ${TMP_KUBE_CONF}' EXIT
    jq -s '.[0] * .[1]' "${KUBELET_CONFIG}" "${KUBELET_EXTRA_CONFIG}" > "${TMP_KUBE_CONF}"
    mv "${TMP_KUBE_CONF}" "${KUBELET_CONFIG}"

    systemctl daemon-reload
    echo "eksctl: restarting kubelet-eks"
    systemctl restart kubelet
    echo "eksctl: done"
  owner: root:root
  path: /var/lib/cloud/scripts/eksctl/bootstrap.al2.sh
  permissions: "0755"
- content: |
    #!/bin/bash

    set -o errexit
    set -o pipefail
    set -o nounset

    source /etc/eksctl/kubelet.env # file written by bootstrapper

    # Use IMDSv2 to get metadata
    TOKEN="$(curl --silent -X PUT -H "X-aws-ec2-metadata-token-ttl-seconds: 600" http://169.254.169.254/latest/api/token)"
    function get_metadata() {
      curl --silent -H "X-aws-ec2-metadata-token: $TOKEN" "http://169.254.169.254/latest/meta-data/$1"
    }

    API_SERVER_URL="${API_SERVER_URL}"
    B64_CLUSTER_CA="${B64_CLUSTER_CA}"
    INSTANCE_ID="$(get_metadata instance-id)"
    INSTANCE_LIFECYCLE="$(get_metadata instance-life-cycle)"
    CLUSTER_DNS="${CLUSTER_DNS:-}"
    NODE_TAINTS="${NODE_TAINTS:-}"
    MAX_PODS="${MAX_PODS:-}"
    NODE_LABELS="${NODE_LABELS},node-lifecycle=${INSTANCE_LIFECYCLE},alpha.eksctl.io/instance-id=${INSTANCE_ID}"

    KUBELET_ARGS=("--node-labels=${NODE_LABELS}")
    [[ -n "${NODE_TAINTS}" ]] && KUBELET_ARGS+=("--register-with-taints=${NODE_TAINTS}")
    # --max-pods as a CLI argument is deprecated, this is a workaround until we deprecate support for maxPodsPerNode
    [[ -n "${MAX_PODS}" ]] && KUBELET_ARGS+=("--max-pods=${MAX_PODS}")
    KUBELET_EXTRA_ARGS="${KUBELET_ARGS[@]}"

    CLUSTER_NAME="${CLUSTER_NAME}"
    KUBELET_CONFIG='/etc/kubernetes/kubelet/kubelet-config.json'
    KUBELET_EXTRA_CONFIG='/etc/eksctl/kubelet-extra.json'
    TMP_KUBE_CONF='/tmp/kubelet-conf.json'
    CONTAINER_RUNTIME="${CONTAINER_RUNTIME:-dockerd}" # default for al2 just in case, not used in ubuntu
  owner: root:root
  path: /var/lib/cloud/scripts/eksctl/bootstrap.helper.sh
  permissions: "0755"

According to the eksctl nodebootstrap documentation [11], the bootstrapping/userdata above is generated by the nodebootstrap package, with per-OS implementations such as ubuntu.go and al2.go for Ubuntu and Amazon Linux. The userdata therefore dynamically writes the following files:

  • kubelet-extra.json - extra user-supplied kubelet configuration
  • docker-extra.json - extra Docker daemon configuration
  • kubelet.env - environment variables for kubelet
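The merge step in bootstrap.al2.sh relies on jq's recursive object merge operator (`*`), so keys in kubelet-extra.json override the AMI's kubelet-config.json. A minimal sketch with toy JSON standing in for the real files, assuming jq is installed:

```shell
#!/bin/bash
# Demonstrate the same `jq -s '.[0] * .[1]'` invocation bootstrap.al2.sh
# uses: -s slurps both objects into an array, and * merges them with the
# right-hand object winning on conflicting keys.
set -o errexit

merged="$(printf '%s\n%s\n' \
  '{"maxPods": 110, "clusterDNS": ["10.100.0.10"]}' \
  '{"maxPods": 58}' \
  | jq -cs '.[0] * .[1]')"

# maxPods comes from the user override; clusterDNS survives untouched.
echo "${merged}"
```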

Summary

Whether the stack comes from the official CloudFormation template or is generated by the eksctl command, both approaches use EC2 userdata to invoke the EKS worker AMI's bootstrap.sh script with the required arguments. eksctl goes a step further: at provisioning time, its nodebootstrap package dynamically assembles the cloud-init configuration.


The information above was verified against the logs EKS provides to confirm how upstream Kubernetes behaves; if anything here is incorrect, feel free to leave a comment or message me.

References

  1. [03] Why EKS worker nodes can automatically join an EKS cluster (1) - https://ithelp.ithome.com.tw/articles/10293460
  2. [04] Why EKS worker nodes can automatically join an EKS cluster (2) - https://ithelp.ithome.com.tw/articles/10294101
  3. Amazon EKS AMI Build Specification - https://github.com/awslabs/amazon-eks-ami
  4. Launching self-managed Amazon Linux nodes - https://docs.aws.amazon.com/eks/latest/userguide/launch-workers.html
  5. Resources - Resource fields - https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/resources-section-structure.html#resources-section-structure-resource-fields
  6. AWS::AutoScaling::AutoScalingGroup - https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-as-group.html
  7. AWS::EKS::Nodegroup - https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-eks-nodegroup.html
  8. Run commands on your Linux instance at launch - https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/user-data.html
  9. cfn-signal - https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/cfn-signal.html
  10. Userdata - https://cloudbase-init.readthedocs.io/en/latest/userdata.html
  11. eksctl nodebootstrap package - https://github.com/weaveworks/eksctl/tree/main/pkg/nodebootstrap
