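Audit provides the records needed for an audit: when an event occurs, we need to know (1) who, (2) at what time, (3) on what object, (4) did what.
In more detail, when a problem occurs the audit records should be able to answer the following questions:
What happened?
When did it happen?
Who initiated it?
On what did it happen?
Where was it observed?
From where was it initiated?
To where was it going?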
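As a rough illustration, several of these questions map directly onto fields of a single audit event. The sketch below assumes an event in the audit.k8s.io/v1 JSON format; the sample values (user name, pod name, IP address) are invented, and the mapping from questions to fields is one reasonable reading rather than an official one.

```python
import json

# Hypothetical audit event: field names follow audit.k8s.io/v1, values are invented.
SAMPLE_EVENT = """
{
  "kind": "Event",
  "apiVersion": "audit.k8s.io/v1",
  "level": "Metadata",
  "stage": "ResponseComplete",
  "requestURI": "/api/v1/namespaces/default/pods/demo",
  "verb": "delete",
  "user": {"username": "alice", "groups": ["system:authenticated"]},
  "sourceIPs": ["10.0.0.5"],
  "objectRef": {"resource": "pods", "namespace": "default", "name": "demo", "apiVersion": "v1"},
  "responseStatus": {"code": 200},
  "requestReceivedTimestamp": "2024-01-01T08:00:00.000000Z"
}
"""


def summarize(event: dict) -> dict:
    """Pick out the fields that answer who / when / on what / did what."""
    ref = event.get("objectRef", {})
    target = (ref.get("namespace"), ref.get("resource"), ref.get("name"))
    return {
        "who": event.get("user", {}).get("username"),
        "when": event.get("requestReceivedTimestamp"),
        "on_what": "/".join(part for part in target if part),
        "did_what": event.get("verb"),
        "from_where": event.get("sourceIPs", []),
        "result": event.get("responseStatus", {}).get("code"),
    }


summary = summarize(json.loads(SAMPLE_EVENT))
print(summary)
```

Which fields are actually present depends on the audit level and stage of the event, so the lookups use `.get()` with defaults instead of assuming every key exists.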
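Audit logs record the activity of kube-apiserver. Each request passes through several stages of execution, and each stage can generate an audit event. The events are then filtered according to the configured policy and written to a backend, which can be log files or a webhook.
A request can be recorded at four stages:
RequestReceived: the audit handler has received the request, before it is passed down the handler chain.
ResponseStarted: the response headers have been sent but the response body has not. This stage is only generated for long-running requests such as watch.
ResponseComplete: the response body has been completed and no more bytes will be sent.
Panic: a panic occurred while handling the request.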
Note
Enabling audit logging increases the memory consumption of the API server, because context needed for auditing is stored for each request.
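The audit policy defines which events are recorded. Rules are evaluated in order, and the first rule that matches an event sets its audit level; the rules after it are not checked.
The audit level can be one of:
None: do not log events that match the rule.
Metadata: log request metadata (requesting user, timestamp, resource, verb, and so on) but not the request or response body.
Request: log event metadata and the request body, but not the response body.
RequestResponse: log event metadata, the request body, and the response body.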
The location of the policy file is passed to kube-apiserver with the --audit-policy-file flag.
The basic structure is as follows:
apiVersion: audit.k8s.io/v1
kind: Policy
omitStages:
  - "RequestReceived" # global setting: omit the RequestReceived stage for all rules
rules:
  - level: RequestResponse # one of None, Metadata, Request, RequestResponse
    users: ["system:kube-proxy"] # match specific users
    userGroups: ["system:authenticated"] # match specific user groups
    verbs: ["watch"] # match specific verbs
    namespaces: ["kube-system"] # match specific namespaces
    resources:
      - group: "" # API group; "" is the core API group
        resources: ["pods"] # can also be subresources such as "pods/log", "pods/status"
Policy example
apiVersion: audit.k8s.io/v1 # This is required.
kind: Policy
# Don't generate audit events for all requests in RequestReceived stage.
omitStages:
  - "RequestReceived"
rules:
  # Log pod changes at RequestResponse level
  - level: RequestResponse
    resources:
    - group: ""
      # Resource "pods" doesn't match requests to any subresource of pods,
      # which is consistent with the RBAC policy.
      resources: ["pods"]
  # Log "pods/log", "pods/status" at Metadata level
  - level: Metadata
    resources:
    - group: ""
      resources: ["pods/log", "pods/status"]
  # Don't log requests to a configmap called "controller-leader"
  - level: None
    resources:
    - group: ""
      resources: ["configmaps"]
      resourceNames: ["controller-leader"]
  # Don't log watch requests by the "system:kube-proxy" on endpoints or services
  - level: None
    users: ["system:kube-proxy"]
    verbs: ["watch"]
    resources:
    - group: "" # core API group
      resources: ["endpoints", "services"]
  # Don't log authenticated requests to certain non-resource URL paths.
  - level: None
    userGroups: ["system:authenticated"]
    nonResourceURLs:
    - "/api*" # Wildcard matching.
    - "/version"
  # Log the request body of configmap changes in kube-system.
  - level: Request
    resources:
    - group: "" # core API group
      resources: ["configmaps"]
    # This rule only applies to resources in the "kube-system" namespace.
    # The empty string "" can be used to select non-namespaced resources.
    namespaces: ["kube-system"]
  # Log configmap and secret changes in all other namespaces at the Metadata level.
  - level: Metadata
    resources:
    - group: "" # core API group
      resources: ["secrets", "configmaps"]
  # Log all other resources in core and extensions at the Request level.
  - level: Request
    resources:
    - group: "" # core API group
    - group: "extensions" # Version of group should NOT be included.
  # A catch-all rule to log all other requests at the Metadata level.
  - level: Metadata
    # Long-running requests like watches that fall under this rule will not
    # generate an audit event in RequestReceived.
    omitStages:
      - "RequestReceived"
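Because the first matching rule wins, the order of the rules in a policy matters. The following is a minimal sketch of that evaluation order, not the kube-apiserver implementation: the `Rule` class, the `matches` helper, and the request dictionaries are hypothetical, and the matching is deliberately simplified (core API group only, no subresources, wildcards, user groups, or non-resource URLs). The rules are a subset of the example policy.

```python
from dataclasses import dataclass, field


@dataclass
class Rule:
    """Simplified policy rule; an empty list means "match anything" for that field."""
    level: str
    users: list = field(default_factory=list)
    verbs: list = field(default_factory=list)
    resources: list = field(default_factory=list)
    namespaces: list = field(default_factory=list)


def matches(rule: Rule, request: dict) -> bool:
    checks = (
        (rule.users, request["user"]),
        (rule.verbs, request["verb"]),
        (rule.resources, request["resource"]),
        (rule.namespaces, request["namespace"]),
    )
    return all(not allowed or value in allowed for allowed, value in checks)


def audit_level(rules: list, request: dict) -> str:
    """Return the level of the first matching rule; later rules are never consulted."""
    for rule in rules:
        if matches(rule, request):
            return rule.level
    return "None"  # nothing matched: the event is not logged


# Subset of the example policy, in the same order.
RULES = [
    Rule("RequestResponse", resources=["pods"]),
    Rule("None", users=["system:kube-proxy"], verbs=["watch"],
         resources=["endpoints", "services"]),
    Rule("Request", resources=["configmaps"], namespaces=["kube-system"]),
    Rule("Metadata", resources=["secrets", "configmaps"]),
    Rule("Metadata"),  # catch-all
]

in_kube_system = {"user": "alice", "verb": "update",
                  "resource": "configmaps", "namespace": "kube-system"}
in_default = dict(in_kube_system, namespace="default")
proxy_watch = {"user": "system:kube-proxy", "verb": "watch",
               "resource": "services", "namespace": "default"}

print(audit_level(RULES, in_kube_system))  # Request
print(audit_level(RULES, in_default))      # Metadata
print(audit_level(RULES, proxy_watch))     # None
```

A configmap update in kube-system matches both the Request rule and the later Metadata rule, but only the first one takes effect. Swapping those two rules would silently downgrade kube-system configmaps to Metadata.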
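Once the rules are written, the next step is deciding where the logs are stored. There are two options:
Log backend: writes the events to files on the filesystem.
Webhook backend: sends the events to an external HTTP API.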
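A note on the output format: the request body recorded for patch operations differs from that of other API events. Commands such as kubectl scale and kubectl edit issue patches, and in that case the recorded body is a JSON array of patch operations instead of a JSON object describing the Kubernetes API object: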
[
  {
    "op": "replace",
    "path": "/spec/parallelism",
    "value": 0
  },
  {
    "op": "remove",
    "path": "/spec/template/spec/containers/0/terminationMessagePolicy"
  }
]
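To read such an entry, it helps to see what the operations do to an object. The sketch below applies the two operations above to a made-up Job-like dictionary. It is a minimal illustration that handles only the "replace" and "remove" operations of JSON Patch; the `apply_patch` function and the sample `job` are hypothetical and not part of any Kubernetes client.

```python
import copy


def apply_patch(doc: dict, operations: list) -> dict:
    """Apply "replace" and "remove" JSON Patch operations to a copy of doc."""
    result = copy.deepcopy(doc)
    for op in operations:
        # Split the JSON Pointer into keys, undoing the ~1 ("/") and ~0 ("~") escapes.
        keys = [k.replace("~1", "/").replace("~0", "~")
                for k in op["path"].split("/")[1:]]
        # Walk down to the parent of the target; list indices arrive as strings.
        parent = result
        for key in keys[:-1]:
            parent = parent[int(key)] if isinstance(parent, list) else parent[key]
        last = int(keys[-1]) if isinstance(parent, list) else keys[-1]
        if op["op"] == "replace":
            parent[last] = op["value"]
        elif op["op"] == "remove":
            del parent[last]
        else:
            raise ValueError(f"unsupported operation: {op['op']}")
    return result


# Invented object with just enough structure for the two paths in the patch.
job = {
    "spec": {
        "parallelism": 3,
        "template": {"spec": {"containers": [
            {"name": "worker", "terminationMessagePolicy": "File"},
        ]}},
    }
}
patch = [
    {"op": "replace", "path": "/spec/parallelism", "value": 0},
    {"op": "remove",
     "path": "/spec/template/spec/containers/0/terminationMessagePolicy"},
]

patched = apply_patch(job, patch)
print(patched["spec"]["parallelism"])
print(patched["spec"]["template"]["spec"]["containers"][0])
```

The audit entry therefore records only the change that was requested, so reconstructing the resulting object requires knowing the state it was applied to.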
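The log backend can be configured with the following flags:
--audit-log-path: the file that audit events are written to. If this flag is not set, the log backend is disabled; the value - means standard out.
--audit-log-maxage: the maximum number of days to retain old audit log files.
--audit-log-maxbackup: the maximum number of audit log files to retain.
--audit-log-maxsize: the maximum size in megabytes of a single audit log file before it is rotated.
If kube-apiserver runs as a Pod, setting these flags is not enough: the policy file and the log directory also have to be mounted as volumes.
For example: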
- --audit-policy-file=/etc/kubernetes/audit-policy.yaml
- --audit-log-path=/var/log/kubernetes/audit/audit.log
The corresponding volume mounts and volumes also need to be configured:
volumeMounts:
  - mountPath: /etc/kubernetes/audit-policy.yaml
    name: audit
    readOnly: true
  - mountPath: /var/log/kubernetes/audit/
    name: audit-log
    readOnly: false
volumes:
- name: audit
  hostPath:
    path: /etc/kubernetes/audit-policy.yaml
    type: File
- name: audit-log
  hostPath:
    path: /var/log/kubernetes/audit/
    type: DirectoryOrCreate
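The log backend writes one JSON event per line, so the resulting file is easy to process with a short script. The following is a minimal sketch that counts events per user and verb. The `count_by_user_and_verb` function and the sample events are hypothetical; the demo writes to a temporary file so that it runs anywhere, whereas on a real node the path would be the one given to --audit-log-path.

```python
import json
import tempfile
from collections import Counter
from pathlib import Path


def count_by_user_and_verb(log_path) -> Counter:
    """Count audit events per (username, verb) in a JSON-lines audit log."""
    counts = Counter()
    with open(log_path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            event = json.loads(line)
            user = event.get("user", {}).get("username", "<unknown>")
            counts[(user, event.get("verb", "<unknown>"))] += 1
    return counts


# Invented events carrying only the fields this sketch reads.
sample_events = [
    {"verb": "get", "user": {"username": "alice"}},
    {"verb": "get", "user": {"username": "alice"}},
    {"verb": "delete", "user": {"username": "bob"}},
]

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "audit.log"
    path.write_text(
        "\n".join(json.dumps(event) for event in sample_events) + "\n",
        encoding="utf-8",
    )
    counts = count_by_user_and_verb(path)

print(counts.most_common())
```

Reading line by line keeps memory use flat even for large log files, which matters once rotation limits are set generously.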
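The webhook backend has two flags:
--audit-webhook-config-file: the path to the webhook configuration file.
--audit-webhook-initial-backoff: how long to wait before retrying after the first failed request. Subsequent requests are retried with exponential backoff.
Event batching: instead of writing events one at a time, a backend can buffer them and send them in batches, which reduces I/O.
Parameter tuning: adjust the audit parameters to match the load on the API server.
Log entry truncation: a mechanism that limits event size so that oversized audit events do not cause problems.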
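The idea behind event batching can be shown with a small buffer that flushes once it reaches a size limit. This is a conceptual sketch only, not how kube-apiserver implements it: the `EventBatcher` class and its `send` callback are hypothetical, and the real backend additionally flushes after a maximum wait time, which is omitted here.

```python
from typing import Callable


class EventBatcher:
    """Buffer events and hand them to `send` in batches of at most max_batch_size."""

    def __init__(self, max_batch_size: int, send: Callable[[list], None]):
        self.max_batch_size = max_batch_size
        self.send = send
        self._buffer = []

    def add(self, event: dict) -> None:
        self._buffer.append(event)
        if len(self._buffer) >= self.max_batch_size:
            self.flush()

    def flush(self) -> None:
        """Send whatever is buffered, if anything, and start a fresh buffer."""
        if self._buffer:
            self.send(self._buffer)
            self._buffer = []


# Collect the batches in a list instead of sending them over the network.
sent_batches = []
batcher = EventBatcher(max_batch_size=3, send=sent_batches.append)
for i in range(7):
    batcher.add({"auditID": f"event-{i}"})
batcher.flush()  # send the remaining partial batch

print([len(batch) for batch in sent_batches])  # [3, 3, 1]
```

Seven events result in three calls to `send` instead of seven, which is where the I/O saving comes from. The trade-off is that events still in the buffer are delayed, and can be lost if the process stops before a flush.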
References
[1] https://kubernetes.io/docs/tasks/debug/debug-cluster/audit/