By this point we have written the application, packaged it, tested it, and dropped it into a clean container. What comes next is the question you least want to face but cannot dodge: how do you get it into production in a way that is stable, observable, and rollback-able?
Today we do three things: pick a process model (uvicorn vs. gunicorn), ship to Cloud Run, and ship to Kubernetes.
Style and principles carry over from the earlier posts: JSON logs to stdout, read-only root filesystem, locked dependencies, non-root, and rollback-ready.
uvicorn
Startup example:
uvicorn my_project.adapters.web.app:app \
--host 0.0.0.0 --port 8000 \
--proxy-headers --forwarded-allow-ips='*' \
--timeout-keep-alive 5
gunicorn with UvicornWorker
Startup example:
gunicorn my_project.adapters.web.app:app \
-k uvicorn.workers.UvicornWorker \
--bind 0.0.0.0:8000 \
--workers ${WORKERS:-$(python - <<'PY'
import os, multiprocessing as m
print(max(2, m.cpu_count()*2 + 1))
PY
)} \
--threads ${THREADS:-1} \
--timeout 60 --graceful-timeout 30 \
--keep-alive 5 --access-logfile '-' --error-logfile '-'
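If the inline heredoc feels brittle, the same worker math can live in a gunicorn config file instead. A minimal sketch, assuming you ship it as gunicorn.conf.py next to the app (the WEB_WORKERS override name is an assumption, not something from the setup above):

```python
# gunicorn.conf.py: picked up automatically from the working directory,
# or point at it explicitly with `gunicorn -c gunicorn.conf.py ...`
import multiprocessing
import os

bind = "0.0.0.0:8000"
worker_class = "uvicorn.workers.UvicornWorker"

# Same formula as the shell heredoc: at least 2 workers, otherwise 2*CPU + 1,
# overridable via an env var (WEB_WORKERS is an assumed name).
workers = int(os.environ.get("WEB_WORKERS", max(2, multiprocessing.cpu_count() * 2 + 1)))

timeout = 60
graceful_timeout = 30
keepalive = 5

# Log to stdout/stderr so the platform's log collector picks everything up.
accesslog = "-"
errorlog = "-"
```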
How to choose:
- Cloud Run: lean toward plain uvicorn, because "multiple instances" is handled by the platform
- K8s/VM: prefer gunicorn + UvicornWorker for more flexibility with multiple processes
- CPU-bound work: hand it to an external task system or move the hot spot to C/Rust; no amount of worker tuning at the web layer rescues a cut made in the wrong place (see the sketch below)
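A minimal sketch of what "keep CPU work off the event loop" looks like in practice, assuming a hypothetical render_report() hot spot; a real setup would usually hand this to Celery/RQ/Cloud Tasks rather than an in-process pool:

```python
import asyncio
from concurrent.futures import ProcessPoolExecutor

from fastapi import FastAPI

app = FastAPI()

# A small process pool keeps CPU-bound work off the event loop, sized separately
# from web workers so a burst of heavy requests cannot starve request handling.
_cpu_pool = ProcessPoolExecutor(max_workers=2)


def render_report(payload: dict) -> dict:  # hypothetical CPU-heavy function
    # ... expensive pure-CPU computation ...
    return {"size": len(str(payload))}


@app.post("/reports")
async def create_report(payload: dict):
    loop = asyncio.get_running_loop()
    # Offload to the pool; the event loop stays free to serve other requests.
    return await loop.run_in_executor(_cpu_pool, render_report, payload)
```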
We keep the multi-stage Dockerfile: the builder stage resolves dependencies; the runtime stage stays minimal, runs as non-root, and writes JSON logs to stdout.
Pull the start command out into an environment variable so it is easy to override on Cloud Run or K8s:
ENV APP_CMD="uvicorn my_project.adapters.web.app:app --host 0.0.0.0 --port 8000"
CMD ["sh", "-c", "$APP_CMD"]
When you need gunicorn, just change it at the deployment layer:
APP_CMD="gunicorn my_project.adapters.web.app:app -k uvicorn.workers.UvicornWorker --bind 0.0.0.0:8000 --workers 3"
Cloud Run handles autoscaling, TLS, and log aggregation for you. Your job is limited to predictable resources and the right concurrency.
Recommended baseline:
- --cpu=1 --memory=512Mi (raise later based on framework and traffic)
- --concurrency=40 (I/O-bound FastAPI usually holds up; measure before trusting it)
- --min-instances=0~1 (cold-start vs. cost trade-off)
- --timeout=30s (pair it with the upstream retry policy)
- /healthz and /readyz; return 503 while still starting up

Deployment command example:
gcloud run deploy awesome-api \
--source . \
--region asia-east1 \
--platform managed \
--allow-unauthenticated \
--cpu=1 --memory=512Mi \
--concurrency=40 \
--min-instances=1 --max-instances=50 \
--timeout=30s \
--set-env-vars "APP_ENV=prod" \
--set-secrets "DATABASE_URL=projects/xxx/secrets/db-url:latest" \
--set-env-vars "APP_CMD=uvicorn my_project.adapters.web.app:app --host 0.0.0.0 --port 8080"
Note: Cloud Run defaults to port 8080. Change the port in APP_CMD to 8080 instead of insisting on 8000.
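If you would rather not hardcode the port at all, a Python entrypoint can read the PORT variable Cloud Run injects (8080 by default). A minimal sketch, assuming a run.py entrypoint (the file name is an assumption; APP_CMD would then be `python run.py`):

```python
# run.py: optional programmatic entrypoint instead of a shell command in APP_CMD.
import os

import uvicorn

if __name__ == "__main__":
    uvicorn.run(
        "my_project.adapters.web.app:app",
        host="0.0.0.0",
        # Cloud Run injects PORT (8080 by default); fall back to it locally too.
        port=int(os.environ.get("PORT", "8080")),
        proxy_headers=True,
        forwarded_allow_ips="*",
    )
```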
Practical tips
- Start with --cpu=1 --concurrency=80, but judge load tests by the 95th/99th percentiles
- Set --min-instances=1, then load-test and tune until a single instance is just short of being overwhelmed

Run your own cluster and you will, of course, write a bit more YAML. Only three things really matter: Requests/Limits, Probes, and Autoscaling.
Deployment (excerpt):
apiVersion: apps/v1
kind: Deployment
metadata:
  name: awesome-api
spec:
  replicas: 2
  selector:
    matchLabels: { app: awesome-api }
  template:
    metadata:
      labels: { app: awesome-api }
    spec:
      securityContext:
        runAsUser: 10001
        runAsGroup: 10001
        fsGroup: 10001
      containers:
        - name: web
          image: ghcr.io/you/awesome-api:1.0.0
          imagePullPolicy: IfNotPresent
          env:
            - name: APP_ENV
              value: "prod"
            - name: APP_CMD
              value: >-
                gunicorn my_project.adapters.web.app:app
                -k uvicorn.workers.UvicornWorker
                --bind 0.0.0.0:8000
                --workers=3 --timeout=60 --graceful-timeout=30 --keep-alive=5
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: app-secrets
                  key: database_url
          ports:
            - containerPort: 8000
          resources:
            requests:
              cpu: "250m"
              memory: "256Mi"
            limits:
              cpu: "1"
              memory: "512Mi"
          readinessProbe:
            httpGet: { path: /readyz, port: 8000 }
            periodSeconds: 5
            failureThreshold: 6
          livenessProbe:
            httpGet: { path: /healthz, port: 8000 }
            periodSeconds: 10
            failureThreshold: 3
          volumeMounts:
            - name: tmp
              mountPath: /tmp
          securityContext:
            readOnlyRootFilesystem: true
      volumes:
        - name: tmp
          emptyDir: {}
Service:
apiVersion: v1
kind: Service
metadata:
  name: awesome-api
spec:
  selector: { app: awesome-api }
  ports:
    - name: http
      port: 80
      targetPort: 8000
  type: ClusterIP
HPA (CPU-triggered; start simple so it visibly kicks in):
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: awesome-api
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: awesome-api
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
Practical tips
- Put the real dependency checks in /readyz; /healthz only needs to prove the process is alive
- Secrets + KMS; don't bake .env into the image
- Start rolling updates with maxUnavailable=0, maxSurge=1 and tune from there

FastAPI endpoints:
from fastapi import APIRouter, Response, status

router = APIRouter()


@router.get("/healthz")
def healthz():
    return {"ok": True}


@router.get("/readyz")
def readyz():
    # Keep the checks here fast: e.g. a DB connection-pool ping (with a timeout),
    # or a cached flag for critical dependencies
    ok = True
    return Response(status_code=status.HTTP_200_OK if ok else status.HTTP_503_SERVICE_UNAVAILABLE)
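A slightly fuller sketch of the same /readyz idea, with a startup flag so the endpoint answers 503 until initialization finishes, plus a time-boxed dependency ping. ping_database() and the 1-second budget are assumptions for illustration:

```python
import asyncio
from contextlib import asynccontextmanager

from fastapi import FastAPI, Response, status


async def ping_database() -> bool:  # hypothetical fast dependency check
    return True


@asynccontextmanager
async def lifespan(app: FastAPI):
    # Warm caches, open pools, etc.; /readyz answers 503 until this flips.
    app.state.ready = True
    yield
    app.state.ready = False  # stop taking traffic during shutdown


app = FastAPI(lifespan=lifespan)
app.state.ready = False


@app.get("/readyz")
async def readyz() -> Response:
    ok = getattr(app.state, "ready", False)
    if ok:
        try:
            # Time-box the dependency ping so a slow DB cannot hang the probe.
            ok = await asyncio.wait_for(ping_database(), timeout=1.0)
        except Exception:
            ok = False
    code = status.HTTP_200_OK if ok else status.HTTP_503_SERVICE_UNAVAILABLE
    return Response(status_code=code)
```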
Degradation strategy: when an external dependency gets flaky, return the last-known-good cache or a default value rather than a 500; record retry counts, backoff, and jitter in the structured logs.
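A minimal sketch of that degradation pattern, assuming a hypothetical fetch_rates() upstream call and an in-process last-known-good cache; the attempt count and delays are placeholders to tune:

```python
import asyncio
import random
from typing import Any

import structlog

log = structlog.get_logger()

_last_good: dict[str, Any] = {}  # last-known-good cache (name is an assumption)


async def fetch_rates() -> dict[str, Any]:  # hypothetical flaky external call
    raise TimeoutError("upstream timed out")


async def get_rates_with_fallback() -> dict[str, Any]:
    delay = 0.2
    for attempt in range(1, 4):
        try:
            result = await fetch_rates()
            _last_good["rates"] = result
            return result
        except Exception as exc:
            # Put retry count, backoff, and jitter into the structured log.
            jitter = random.uniform(0, delay / 2)
            log.warning("upstream_retry", attempt=attempt,
                        backoff_s=round(delay + jitter, 3), error=str(exc))
            if attempt < 3:
                await asyncio.sleep(delay + jitter)
                delay *= 2
    # Degrade: serve the stale value (or a default) instead of bubbling up a 500.
    return _last_good.get("rates", {"rates": {}, "stale": True})
```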
Logging:
- structlog → JSON → stdout
- Fields to always carry: event, path, status, ms, request_id, version
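A minimal sketch of that setup: structlog configured for JSON on stdout plus a FastAPI middleware that emits exactly those fields. The APP_VERSION variable and the X-Request-ID header name are assumptions:

```python
import os
import time
import uuid

import structlog
from fastapi import FastAPI, Request

# JSON renderer straight to stdout; level and timestamp become key/value pairs.
structlog.configure(
    processors=[
        structlog.processors.add_log_level,
        structlog.processors.TimeStamper(fmt="iso"),
        structlog.processors.JSONRenderer(),
    ]
)
log = structlog.get_logger()

app = FastAPI()


@app.middleware("http")
async def access_log(request: Request, call_next):
    request_id = request.headers.get("x-request-id", str(uuid.uuid4()))
    start = time.perf_counter()
    response = await call_next(request)
    log.info(
        "request",                                            # event
        path=request.url.path,                                # path
        status=response.status_code,                          # status
        ms=round((time.perf_counter() - start) * 1000, 1),    # ms
        request_id=request_id,                                # request_id
        version=os.environ.get("APP_VERSION", "dev"),         # version
    )
    response.headers["x-request-id"] = request_id
    return response
```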
CI/CD pipeline:
- uv sync --locked to keep dependencies consistent
- hatch build, and build the image from the Dockerfile
- pip-audit, Safety, and a CycloneDX SBOM go into the artifacts
- gcloud run deploy (separate projects or service names per environment)
- kubectl apply -k overlays/prod, with differences such as APP_CMD kept in Kustomize overlays

| Scenario | How it starts | Concurrency & scaling | What you worry about |
|---|---|---|---|
| Cloud Run API | uvicorn | Platform autoscaling; set --concurrency | Timeouts, min instances, cost, cold starts |
| K8s, small-to-mid traffic | gunicorn + UvicornWorker | HPA + Requests/Limits | Probes, rolling updates, log aggregation |
| I/O-heavy, CPU-light | Either works | Raise per-instance concurrency | Retries and degradation for external dependencies |
| CPU-heavy | Don't force it through the web layer | Push CPU work to background tasks | Hot-spot profiling, caching |
Common pitfalls:
- Requests queue up because concurrency is set too high; or it is cold starts: turn on min-instances
- Bare print() calls: inject the logger through middleware and have CI lint forbid bare print
- .env baked into the image: remove it, inject Secrets instead, and review .dockerignore
pyproject.toml snippet:
[project.optional-dependencies]
deploy = ["uvicorn>=0.30", "gunicorn>=22.0"]
[tool.hatch.envs.deploy]
features = ["deploy"]
[tool.hatch.envs.deploy.scripts]
serve-uv = "uvicorn my_project.adapters.web.app:app --host 0.0.0.0 --port 8000"
serve-gunicorn = "gunicorn my_project.adapters.web.app:app -k uvicorn.workers.UvicornWorker --bind 0.0.0.0:8000 --workers 3 --timeout 60 --graceful-timeout 30 --keep-alive 5"
health = "curl -sf http://127.0.0.1:8000/healthz && echo ok"
ready = "curl -sf http://127.0.0.1:8000/readyz && echo ready"
Deployment should not be occultism. Pick a path you can measure, roll back, and see into: either the platform manages processes and scaling for you (Cloud Run), or you do the process governance yourself (Gunicorn + K8s). Everything else is detail, and the details are all written down above. Go load-test, tune the thresholds until they are just right, and spend the money on your users rather than on CPUs sitting idle.