[Day27] 6-5. Use Kubernetes audit logs to monitor access

2025 iThome Ironman Contest

DAY 27

Cloud Native

Part 27 of the series "After scoring only 47% on the CKS exam, I learned my lesson and decided to prepare properly"

17th Ironman Contest

hsiehbing

2025-10-03 00:13:29

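Introduction to Auditing in Kubernetes

Auditing provides the records needed for an audit. When an event occurs, you need to know (1) who (2) at what time (3) did what (4) to which object.

More specifically, when something goes wrong, the audit records let you answer the following questions:

What happened?
When did it happen?
Who initiated it?
On what did it happen?
Where was it observed?
From where was it initiated?
To where was it going?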
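
As a rough illustration of how these questions map onto a single audit record, the sketch below pulls the answers out of one event. The field names (user.username, requestReceivedTimestamp, verb, objectRef, sourceIPs, requestURI) are taken from the audit.k8s.io/v1 Event type; the sample event, the user "jane" and the helper name summarize are made up for this example.

```python
import json

# Hypothetical audit event, invented for illustration only.
SAMPLE_EVENT = json.loads("""
{
  "kind": "Event",
  "apiVersion": "audit.k8s.io/v1",
  "level": "Metadata",
  "auditID": "00000000-0000-0000-0000-000000000000",
  "stage": "ResponseComplete",
  "requestURI": "/api/v1/namespaces/default/secrets/db-password",
  "verb": "get",
  "user": {"username": "jane", "groups": ["system:authenticated"]},
  "sourceIPs": ["10.0.0.5"],
  "objectRef": {"resource": "secrets", "namespace": "default",
                "name": "db-password", "apiVersion": "v1"},
  "responseStatus": {"code": 200},
  "requestReceivedTimestamp": "2025-01-01T12:00:00.000000Z",
  "stageTimestamp": "2025-01-01T12:00:00.010000Z"
}
""")


def summarize(event):
    """Map one audit event onto the who / when / what / action questions."""
    ref = event.get("objectRef", {})
    # Build "namespace/resource/name", skipping parts that are absent.
    target = "/".join(
        part for part in (ref.get("namespace"), ref.get("resource"), ref.get("name")) if part
    )
    return {
        "who": event.get("user", {}).get("username"),
        "when": event.get("requestReceivedTimestamp"),
        "what": target or event.get("requestURI"),  # non-resource requests have no objectRef
        "action": event.get("verb"),
        "from_where": event.get("sourceIPs", []),
    }


print(summarize(SAMPLE_EVENT))
```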

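The audit log records the activity of kube-apiserver. Every request passes through several stages of execution, and each stage can generate an audit event. The events are then filtered according to the configured policy and written out through a backend, either to log files or to a webhook.

An event can be generated at one of four stages:

RequestReceived - generated as soon as the request is received, before it is processed any further
ResponseStarted - generated after the response headers are sent but before the response body is sent; this stage only occurs for long-running requests (such as watch)
ResponseComplete - generated when the response body is complete and no more data will be sent
Panic - generated when the program hits an unrecoverable error and panics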
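
To make the stage list more concrete, here is a deliberately simplified model of which stages a single request is recorded at. It assumes that a panic happens before any response is started, which is not guaranteed in a real API server; the function name expected_stages is hypothetical.

```python
def expected_stages(long_running=False, panicked=False, omit_stages=()):
    """Simplified model of the stages at which one request produces audit events."""
    stages = ["RequestReceived"]
    if panicked:
        # Assumption: the panic occurs before any response was started.
        stages.append("Panic")
    else:
        if long_running:
            # Only long-running requests (e.g. watch) get a ResponseStarted event.
            stages.append("ResponseStarted")
        stages.append("ResponseComplete")
    # Stages listed in omitStages produce no event at all.
    return [stage for stage in stages if stage not in omit_stages]


print(expected_stages())
print(expected_stages(long_running=True))
print(expected_stages(long_running=True, omit_stages=("RequestReceived",)))
print(expected_stages(panicked=True))
```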

Note
Enabling audit logging increases the memory consumption of the API server, because context needed for auditing has to be stored for each request.

Audit policy

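The audit policy defines which events are recorded and how much data each record contains. Rules are evaluated in order: the first rule that matches a request sets its audit level, and later rules are not checked.
The level can be one of:

None: do not log anything for requests that match this rule
Metadata: log the event metadata (requesting user, timestamp, resource, verb, ...) but not the request or response body
Request: log the event metadata and the request body, but not the response body; does not apply to non-resource requests
RequestResponse: log the event metadata, the request body, and the response body; likewise does not apply to non-resource requests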
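
The difference between the levels can be sketched as a filter over one event. This is only an illustration of what each level keeps, not how kube-apiserver implements it: the requestObject and responseObject field names come from the audit.k8s.io/v1 Event type, while apply_level and the sample event are hypothetical.

```python
import copy

LEVELS = ("None", "Metadata", "Request", "RequestResponse")


def apply_level(event, level):
    """Return the part of `event` kept at `level`, or None if nothing is logged."""
    if level not in LEVELS:
        raise ValueError(f"unknown audit level: {level}")
    if level == "None":
        return None
    kept = copy.deepcopy(event)
    kept["level"] = level
    if level == "Metadata":
        kept.pop("requestObject", None)   # Metadata drops the request body
    if level in ("Metadata", "Request"):
        kept.pop("responseObject", None)  # only RequestResponse keeps the response body
    return kept


# Hypothetical event carrying both bodies.
full_event = {
    "verb": "create",
    "user": {"username": "jane"},
    "objectRef": {"resource": "configmaps", "namespace": "default", "name": "demo"},
    "requestObject": {"data": {"key": "value"}},
    "responseObject": {"metadata": {"name": "demo"}, "data": {"key": "value"}},
}

for level in LEVELS:
    kept = apply_level(full_event, level)
    print(level, "->", sorted(kept) if kept else kept)
```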

The location of the policy file is passed to kube-apiserver with the --audit-policy-file flag.

The basic structure is as follows:

apiVersion: audit.k8s.io/v1
kind: Policy
omitStages:
  - "RequestReceived"                    # global setting: skip the RequestReceived stage for every rule
rules:
  - level: RequestResponse               # one of None, Metadata, Request, RequestResponse
    users: ["system:kube-proxy"]         # match specific users
    userGroups: ["system:authenticated"] # match specific user groups
    verbs: ["watch"]                     # match specific verbs
    namespaces: ["kube-system"]          # match specific namespaces
    resources:
    - group: ""                          # API group; "" is the core API group
      resources: ["pods"]                # can also be "pods/log", "pods/status"

Policy example

apiVersion: audit.k8s.io/v1 # This is required.
kind: Policy
# Don't generate audit events for all requests in RequestReceived stage.
omitStages:
  - "RequestReceived"
rules:
  # Log pod changes at RequestResponse level
  - level: RequestResponse
    resources:
    - group: ""
      # Resource "pods" doesn't match requests to any subresource of pods,
      # which is consistent with the RBAC policy.
      resources: ["pods"]
  # Log "pods/log", "pods/status" at Metadata level
  - level: Metadata
    resources:
    - group: ""
      resources: ["pods/log", "pods/status"]

  # Don't log requests to a configmap called "controller-leader"
  - level: None
    resources:
    - group: ""
      resources: ["configmaps"]
      resourceNames: ["controller-leader"]

  # Don't log watch requests by the "system:kube-proxy" on endpoints or services
  - level: None
    users: ["system:kube-proxy"]
    verbs: ["watch"]
    resources:
    - group: "" # core API group
      resources: ["endpoints", "services"]

  # Don't log authenticated requests to certain non-resource URL paths.
  - level: None
    userGroups: ["system:authenticated"]
    nonResourceURLs:
    - "/api*" # Wildcard matching.
    - "/version"

  # Log the request body of configmap changes in kube-system.
  - level: Request
    resources:
    - group: "" # core API group
      resources: ["configmaps"]
    # This rule only applies to resources in the "kube-system" namespace.
    # The empty string "" can be used to select non-namespaced resources.
    namespaces: ["kube-system"]

  # Log configmap and secret changes in all other namespaces at the Metadata level.
  - level: Metadata
    resources:
    - group: "" # core API group
      resources: ["secrets", "configmaps"]

  # Log all other resources in core and extensions at the Request level.
  - level: Request
    resources:
    - group: "" # core API group
    - group: "extensions" # Version of group should NOT be included.

  # A catch-all rule to log all other requests at the Metadata level.
  - level: Metadata
    # Long-running requests like watches that fall under this rule will not
    # generate an audit event in RequestReceived.
    omitStages:
      - "RequestReceived"

Audit backend

Once the rules are written, the next question is where the logs are stored. There are two options:

Log backend: writes the events to a file
Webhook backend: sends the events to an external HTTP API
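
For the webhook backend, the receiving side can be sketched with the Python standard library. This is a minimal local experiment, not a production receiver: it assumes the API server posts a JSON EventList (kind "EventList" with an items array) and is pointed at the receiver through the --audit-webhook-config-file flag. The names events_from_body, AuditHandler and serve, as well as the host and port, are hypothetical, and a real deployment would normally use HTTPS.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def events_from_body(body):
    """Extract audit events from a webhook POST body (assumed to be an EventList)."""
    payload = json.loads(body)
    if not isinstance(payload, dict) or payload.get("kind") != "EventList":
        raise ValueError("expected an audit.k8s.io EventList")
    return payload.get("items", [])


class AuditHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            events = events_from_body(self.rfile.read(length))
        except ValueError:  # also covers json.JSONDecodeError
            self.send_response(400)
            self.end_headers()
            return
        for event in events:
            # Print a one-line summary per event: verb, URI, user.
            print(event.get("verb"), event.get("requestURI"),
                  event.get("user", {}).get("username"))
        self.send_response(200)
        self.end_headers()


def serve(host="127.0.0.1", port=8080):
    """Run the receiver until interrupted (call this manually to try it out)."""
    HTTPServer((host, port), AuditHandler).serve_forever()
```

Calling serve() starts the receiver; nothing is started on import, so events_from_body can also be used on its own.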

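Note on the output format: for patch requests, the request body recorded in the audit event differs from that of other API events. It is a JSON array of patch operations rather than a JSON object holding the Kubernetes API object. Commands such as kubectl scale and kubectl edit send patches. For example: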

[
  {
    "op": "replace",
    "path": "/spec/parallelism",
    "value": 0
  },
  {
    "op": "remove",
    "path": "/spec/template/spec/containers/0/terminationMessagePolicy"
  }
]
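
To read such a record, it helps to see what the operations do to an object. The sketch below applies the two operations above to a made-up Job-like dict. It only supports the replace and remove operations and ignores JSON Pointer escaping (~0, ~1), so it is an illustration rather than a full JSON Patch implementation; apply_patch and the sample object are hypothetical.

```python
import copy

PATCH = [
    {"op": "replace", "path": "/spec/parallelism", "value": 0},
    {"op": "remove", "path": "/spec/template/spec/containers/0/terminationMessagePolicy"},
]

# Hypothetical object, invented for illustration only.
JOB = {
    "spec": {
        "parallelism": 3,
        "template": {"spec": {"containers": [
            {"name": "worker", "terminationMessagePolicy": "File"},
        ]}},
    }
}


def _walk(doc, path):
    """Return (parent container, final key) for a path such as /a/0/b."""
    parts = path.split("/")[1:]
    node = doc
    for part in parts[:-1]:
        node = node[int(part)] if isinstance(node, list) else node[part]
    last = parts[-1]
    return node, (int(last) if isinstance(node, list) else last)


def apply_patch(doc, ops):
    """Apply 'replace' and 'remove' operations to a copy of `doc`."""
    result = copy.deepcopy(doc)
    for op in ops:
        parent, key = _walk(result, op["path"])
        if op["op"] == "replace":
            parent[key]  # raises if the target does not exist yet
            parent[key] = op["value"]
        elif op["op"] == "remove":
            del parent[key]
        else:
            raise NotImplementedError(op["op"])
    return result


patched = apply_patch(JOB, PATCH)
print(patched["spec"]["parallelism"])
print(patched["spec"]["template"]["spec"]["containers"][0])
```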

Log backend

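The log backend can be configured with the following flags:

--audit-log-path: the file path the log backend writes audit events to. If this flag is not set, the log backend is disabled; a value of - means standard out
--audit-log-maxage: the maximum number of days to retain old audit log files
--audit-log-maxbackup: the maximum number of audit log files to retain
--audit-log-maxsize: the maximum size in megabytes of a single audit log file before it is rotated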
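
These settings together bound how much disk space the audit log can take. As a back-of-the-envelope sketch, assuming rotation keeps at most maxbackup rotated files next to the file currently being written, the upper bound is roughly maxsize multiplied by maxbackup plus one. The function name and the sample values are hypothetical, and the real usage can be lower when maxage removes old files first.

```python
def max_audit_log_disk_mb(maxsize_mb, maxbackup):
    """Rough upper bound on disk usage in megabytes for the log backend.

    Assumption: at most `maxbackup` rotated files plus one active file exist,
    each no larger than `maxsize_mb`.
    """
    if maxsize_mb <= 0 or maxbackup <= 0:
        raise ValueError("this sketch only covers positive values")
    return maxsize_mb * (maxbackup + 1)


# Example values chosen for illustration: 100 MB files, 10 backups.
print(max_audit_log_disk_mb(100, 10))
```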

If kube-apiserver runs as a Pod, then in addition to these flags you also need to mount volumes for the policy file and the log directory.

For example:

  - --audit-policy-file=/etc/kubernetes/audit-policy.yaml
  - --audit-log-path=/var/log/kubernetes/audit/audit.log

The matching volumeMounts and volumes must also be set:

volumeMounts:
  - mountPath: /etc/kubernetes/audit-policy.yaml
    name: audit
    readOnly: true
  - mountPath: /var/log/kubernetes/audit/
    name: audit-log
    readOnly: false
volumes:
- name: audit
  hostPath:
    path: /etc/kubernetes/audit-policy.yaml
    type: File

- name: audit-log
  hostPath:
    path: /var/log/kubernetes/audit/
    type: DirectoryOrCreate
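
Once the log backend is writing to /var/log/kubernetes/audit/audit.log, the file can be used to monitor access. The sketch below assumes the default JSON log format, with one audit event per line, and lists who touched Secrets. The helper names and the sample lines are hypothetical; in practice the lines would come from opening the audit log file instead of the inline sample.

```python
import json

# Hypothetical log lines, invented for illustration only.
SAMPLE_LINES = [
    '{"stage": "ResponseComplete", "verb": "get", "user": {"username": "jane"}, '
    '"objectRef": {"resource": "secrets", "namespace": "default", "name": "db-password"}}',
    '{"stage": "ResponseComplete", "verb": "list", "user": {"username": "bob"}, '
    '"objectRef": {"resource": "pods", "namespace": "default"}}',
    '',
    '{"stage": "ResponseComplete", "verb": "delete", "user": {"username": "bob"}, '
    '"objectRef": {"resource": "secrets", "namespace": "kube-system", "name": "token"}}',
    '{"stage": "Respon',  # truncated line, e.g. read while the file is being written
]


def read_events(lines):
    """Yield one dict per valid JSON line, skipping blank or partial lines."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            yield json.loads(line)
        except json.JSONDecodeError:
            continue


def secret_accesses(events):
    """Yield (user, verb, namespace, name) for completed requests on Secrets."""
    for event in events:
        ref = event.get("objectRef") or {}
        if ref.get("resource") == "secrets" and event.get("stage") == "ResponseComplete":
            yield (event.get("user", {}).get("username"), event.get("verb"),
                   ref.get("namespace"), ref.get("name"))


for access in secret_accesses(read_events(SAMPLE_LINES)):
    print(access)
```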