EKS 使用 VPC 作為網路環境,在預設情況之下,Pod 僅能繼承 EKS worker node 上的 security group 限縮 inbound/outbound 流量。security group[1] 類似於 AWS VPC 環境中的防火牆概念,可以針對 IP CIDR 及 port 設置規則。而在 EKS VPC CNI Plugin 延伸 VPC security group 功能使 Pods 可以關聯指定 security group - Security groups for pods [2]。
v1.10.1
。$ kubectl describe daemonset aws-node --namespace kube-system | grep Image | cut -d "/" -f 2
amazon-k8s-cni-init:v1.10.1-eksbuild.1
amazon-k8s-cni:v1.10.1-eksbuild.1
$ aws eks describe-cluster --name ironman-2022 --query cluster.roleArn --output text
arn:aws:iam::111111111111:role/eksctl-ironman-2022-cluster-ServiceRole-33QTRI4AJ85B
AmazonEKSVPCResourceController
關聯至 EKS cluster IAM role。$ aws iam attach-role-policy --policy-arn arn:aws:iam::aws:policy/AmazonEKSVPCResourceController --role-name eksctl-ironman-2022-cluster-ServiceRole-33QTRI4AJ85B
aws-node
環境變數 ENABLE_POD_ENI
為 true
。$ kubectl set env daemonset aws-node -n kube-system ENABLE_POD_ENI=true
daemonset.apps/aws-node env updated
aws-node
Pod 更新。更新後,可以觀察到 node 更新 label vpc.amazonaws.com/has-trunk-attached
。$ kubectl -n kube-system get po
NAME READY STATUS RESTARTS AGE
aws-node-5mhsb 1/1 Running 0 2m
aws-node-fdmrf 0/1 Init:0/1 0 13s
aws-node-st44l 0/1 PodInitializing 0 13s
aws-node-vsqr4 0/1 Init:0/1 0 12s
...
$ kubectl get nodes -l vpc.amazonaws.com/has-trunk-attached=true
NAME STATUS ROLES AGE VERSION
ip-192-168-3-47.eu-west-1.compute.internal Ready <none> 3d15h v1.22.12-eks-ba74326
ip-192-168-31-166.eu-west-1.compute.internal Ready <none> 111m v1.22.12-eks-ba74326
ip-192-168-42-19.eu-west-1.compute.internal Ready <none> 111m v1.22.12-eks-ba74326
ip-192-168-71-180.eu-west-1.compute.internal Ready <none> 111m v1.22.12-eks-ba74326
SecurityGroupPolicy
(sg-policy.yaml
)及測試 aws-cli
Pod (aws-cli.yaml
)。$ cat ./sg-policy.yaml
apiVersion: vpcresources.k8s.aws/v1beta1
kind: SecurityGroupPolicy
metadata:
name: my-security-group-policy-demo
namespace: default
spec:
podSelector:
matchLabels:
role: demo-sg
securityGroups:
groupIds:
- sg-079157a483bcb371f
$ cat ./aws-cli.yaml
apiVersion: v1
kind: Pod
metadata:
name: aws-cli
namespace: default
labels:
role: demo-sg
spec:
containers:
- name: aws-cli
image: amazon/aws-cli:latest
command:
- sleep
- infinity
imagePullPolicy: IfNotPresent
restartPolicy: Always
SecurityGroupPolicy
及 Pod
。$ kubectl apply -f ./sg-policy.yaml
securitygrouppolicy.vpcresources.k8s.aws/my-security-group-policy-demo created
$ kubectl apply -f ./aws-cli.yaml
pod/aws-cli created
透過 kubectl describe
命令查看 Pod,以下列出部分資訊:
$ kubectl describe po aws-cli
Name: aws-cli
Namespace: default
Priority: 0
Node: ip-192-168-31-166.eu-west-1.compute.internal/192.168.31.166
Start Time: Thu, 29 Sep 2022 12:25:43 +0000
Labels: role=demo-sg
Annotations: kubernetes.io/psp: eks.privileged
vpc.amazonaws.com/pod-eni:
[{"eniId":"eni-09a7a909e06fc4be6","ifAddress":"02:b0:01:6c:b5:79","privateIp":"192.168.26.162","vlanId":1,"subnetCidr":"192.168.0.0/19"}]
Status: Running
IP: 192.168.26.162
IPs:
IP: 192.168.26.162
Containers:
aws-cli:
Container ID:
...
...
...
Restart Count: 0
Limits:
vpc.amazonaws.com/pod-eni: 1
Requests:
vpc.amazonaws.com/pod-eni: 1
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-ldh4s (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
kube-api-access-ldh4s:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
vpc.amazonaws.com/pod-eni:NoSchedule op=Exists
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 74s default-scheduler Successfully assigned default/aws-cli to ip-192-168-31-166.eu-west-1.compute.internal
Normal SecurityGroupRequested 74s vpc-resource-controller Pod will get the following Security Groups [sg-079157a483bcb371f]
Normal ResourceAllocated 73s vpc-resource-controller Allocated [{"eniId":"eni-09a7a909e06fc4be6","ifAddress":"02:b0:01:6c:b5:79","privateIp":"192.168.26.162","vlanId":1,"subnetCidr":"192.168.0.0/19"}] to the pod
Warning FailedCreatePodSandBox 73s kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "e752a83b0e67ab7c60e6f3ec294c4cf6b7451802682d76a8addafa40f341fd7a" network for pod "aws-cli": networkPlugin cni failed to set up pod "aws-cli_default" network: add cmd: failed to assign an IP address to container
Normal SandboxChanged 72s kubelet Pod sandbox changed, it will be killed and re-created.
Normal Pulling 72s kubelet Pulling image "amazon/aws-cli:latest"
Normal Pulled 62s kubelet Successfully pulled image "amazon/aws-cli:latest" in 9.897131693s
Normal Created 60s kubelet Created container aws-cli
Normal Started 60s kubelet Started container aws-cli
可以觀察到多了 vpc.amazonaws.com/pod-eni
annotation 資訊: [{"eniId":"eni-09a7a909e06fc4be6","ifAddress":"02:b0:01:6c:b5:79","privateIp":"192.168.26.162","vlanId":1,"subnetCidr":"192.168.0.0/19"}]
使用 CloudWatch Logs insights 以下語法查看 audit logs:
filter @logStream like /^kube-apiserver-audit/
| fields @timestamp, @message
| sort @timestamp asc
| filter objectRef.name == 'aws-cli' AND verb == 'create'
| limit 10000
我們可以觀察到由 mutating wenhook vpc-resource-mutating-webhook
更新此 annotation。
...
...
"annotations": {
"authorization.k8s.io/decision": "allow",
"authorization.k8s.io/reason": "",
"mutation.webhook.admission.k8s.io/round_0_index_0": "{\"configuration\":\"0500-amazon-eks-fargate-mutation.amazonaws.com\",\"webhook\":\"0500-amazon-eks-fargate-mutation.amazonaws.com\",\"mutated\":false}",
"mutation.webhook.admission.k8s.io/round_0_index_2": "{\"configuration\":\"pod-identity-webhook\",\"webhook\":\"iam-for-pods.amazonaws.com\",\"mutated\":false}",
"mutation.webhook.admission.k8s.io/round_0_index_3": "{\"configuration\":\"vpc-resource-mutating-webhook\",\"webhook\":\"mpod.vpc.k8s.aws\",\"mutated\":true}",
"mutation.webhook.admission.k8s.io/round_1_index_2": "{\"configuration\":\"pod-identity-webhook\",\"webhook\":\"iam-for-pods.amazonaws.com\",\"mutated\":false}",
"mutation.webhook.admission.k8s.io/round_1_index_3": "{\"configuration\":\"vpc-resource-mutating-webhook\",\"webhook\":\"mpod.vpc.k8s.aws\",\"mutated\":false}",
"patch.webhook.admission.k8s.io/round_0_index_3": "{\"configuration\":\"vpc-resource-mutating-webhook\",\"webhook\":\"mpod.vpc.k8s.aws\",\"patch\":[{\"op\":\"add\",\"path\":\"/spec/containers/0/resources/limits\",\"value\":{\"vpc.amazonaws.com/pod-eni\":\"1\"}},{\"op\":\"add\",\"path\":\"/spec/containers/0/resources/requests\",\"value\":{\"vpc.amazonaws.com/pod-eni\":\"1\"}}],\"patchType\":\"JSONPatch\"}",
"podsecuritypolicy.policy.k8s.io/admit-policy": "eks.privileged",
"podsecuritypolicy.policy.k8s.io/validate-policy": "eks.privileged"
}
...
根據文件步驟,我們可以透過 Node 檢視 vpc.amazonaws.com/has-trunk-attached
label 來確認環境變數 ENABLE_POD_ENI
是否設定成功。因此使用 kubectl describe
查看對應的更新:
$ kubectl describe no ip-192-168-31-166.eu-west-1.compute.internal
Name: ip-192-168-31-166.eu-west-1.compute.internal
Roles: <none>
Labels: alpha.eksctl.io/cluster-name=ironman-2022
alpha.eksctl.io/nodegroup-name=ng1-public-ssh
...
...
...
vpc.amazonaws.com/has-trunk-attached=true
Annotations: node.alpha.kubernetes.io/ttl: 0
volumes.kubernetes.io/controller-managed-attach-detach: true
...
Capacity:
...
...
vpc.amazonaws.com/pod-eni: 9
Allocatable:
...
...
vpc.amazonaws.com/pod-eni: 9
...
...
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
Resource Requests Limits
-------- -------- ------
cpu 125m (6%) 0 (0%)
memory 0 (0%) 0 (0%)
ephemeral-storage 0 (0%) 0 (0%)
hugepages-1Gi 0 (0%) 0 (0%)
hugepages-2Mi 0 (0%) 0 (0%)
attachable-volumes-aws-ebs 0 0
vpc.amazonaws.com/pod-eni 1 1
如同 Pod 一樣,可以觀察到多了 vpc.amazonaws.com/pod-eni
annotation 資訊:
vpc.amazonaws.com/has-trunk-attached=true
接續,透過 CloudWatch Logs insights 以下語法查看 audit logs:
filter @logStream like /^kube-apiserver-audit/
| fields objectRef.name, objectRef.resource, @message
| sort @timestamp asc
| filter @message like 'vpc-resource-controller' AND verb == 'patch' AND objectRef.name == 'ip-192-168-31-166.eu-west-1.compute.internal'
| limit 10000
...
"verb": "patch",
"user": {
"username": "system:serviceaccount:kube-system:eks-vpc-resource-controller",
"uid": "b75dc772-f4fe-44ef-b928-5a1ba8a18eb4",
"groups": [
"system:serviceaccounts",
"system:serviceaccounts:kube-system",
"system:authenticated"
]
},
"sourceIPs": [
"10.0.51.57"
],
"userAgent": "controller/v0.0.0 (linux/amd64) kubernetes/$Format",
"objectRef": {
"resource": "nodes",
"name": "ip-192-168-31-166.eu-west-1.compute.internal",
"apiVersion": "v1",
"subresource": "status"
},
"responseStatus": {
"metadata": {},
"code": 200
},
"requestObject": {
"status": {
"capacity": {
"vpc.amazonaws.com/pod-eni": "9"
}
}
},
...
可以確認此 capacity 是由 username eks-vpc-resource-controller
更新此 node capacity 資訊 vpc.amazonaws.com/pod-eni
。
由上述初步查看 Node、Pod 資訊及 audit log 可以了解:
ENABLE_POD_ENI
環境變數後,eks-vpc-resource-controller
會更新 Node label 及 vpc.amazonaws.com/pod-eni
資訊。vpc-resource-mutating-webhook
於 Pod mutating 階段更新 vpc.amazonaws.com/pod-eni
資訊於 annotation。下一篇我們將會探討如何實現 Pod 獨立於 Node 上的 security group。
以上資訊透過 EKS 所提供 Logs 來驗證上游 Kubernetes 運作原理,倘若上述內文有所錯誤,隨時可以留言或是私訊我。