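Yesterday we got to know GitHub as a digital warehouse. Today we look at the GitHub Runner, an automated field manager. My father used to walk the fields himself to inspect the crops, water them, and spread fertilizer; with a GitHub Runner, all the work of deploying to production can run automatically.
A GitHub Runner is the compute environment that executes GitHub Actions workflows: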
# .github/workflows/deploy-prod.yml
name: Deploy to Production

on:
  push:
    branches: [ main ]
  workflow_dispatch:  # allow manual runs

env:
  AWS_REGION: us-east-1
  ECR_REPOSITORY: trading-bot
  ECS_SERVICE: trading-bot-service
  ECS_CLUSTER: trading-cluster
  CONTAINER_NAME: trading-bot

jobs:
  # Pre-checks
  pre-checks:
    runs-on: ubuntu-latest
    outputs:
      should-deploy: ${{ steps.check.outputs.should-deploy }}
    steps:
      - name: Checkout code
        uses: actions/checkout@v3
        with:
          fetch-depth: 2  # HEAD and its parent, so HEAD^ resolves below

      - name: Check if deployment needed
        id: check
        run: |
          # Check whether any file that matters for deployment changed
          CHANGED_FILES=$(git diff --name-only HEAD^ HEAD)
          if echo "$CHANGED_FILES" | grep -qE "(src/|Dockerfile|requirements\.txt)"; then
            echo "should-deploy=true" >> "$GITHUB_OUTPUT"
          else
            echo "should-deploy=false" >> "$GITHUB_OUTPUT"
          fi

      - name: Check deployment conditions
        run: |
          # Warn when deploying during trading hours.
          # GitHub-hosted runners use UTC, so 9-16 here means UTC unless TZ is set.
          CURRENT_HOUR=$(date +%H)
          if [ "$CURRENT_HOUR" -ge 9 ] && [ "$CURRENT_HOUR" -le 16 ]; then
            echo "⚠️ Trading hours are in progress; deploy with caution"
          fi

  # Security scan
  security-scan:
    runs-on: ubuntu-latest
    needs: pre-checks
    if: needs.pre-checks.outputs.should-deploy == 'true'
    permissions:
      contents: read
      security-events: write  # required to upload SARIF results
    steps:
      - name: Checkout code
        uses: actions/checkout@v3

      - name: Run Trivy vulnerability scanner
        # Consider pinning a release tag instead of master for reproducible runs
        uses: aquasecurity/trivy-action@master
        with:
          scan-type: 'fs'
          scan-ref: '.'
          format: 'sarif'
          output: 'trivy-results.sarif'

      - name: Upload Trivy scan results
        uses: github/codeql-action/upload-sarif@v2
        with:
          sarif_file: 'trivy-results.sarif'

  # Build and deploy
  deploy:
    runs-on: ubuntu-latest
    needs: [pre-checks, security-scan]
    if: needs.pre-checks.outputs.should-deploy == 'true'
    environment: production
    steps:
      - name: Checkout code
        uses: actions/checkout@v3

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v2
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: ${{ env.AWS_REGION }}
          role-to-assume: ${{ secrets.AWS_DEPLOY_ROLE_ARN }}
          role-session-name: GitHubActions-Deploy

      - name: Login to Amazon ECR
        id: login-ecr
        uses: aws-actions/amazon-ecr-login@v1

      - name: Extract metadata
        id: meta
        uses: docker/metadata-action@v4
        with:
          images: ${{ steps.login-ecr.outputs.registry }}/${{ env.ECR_REPOSITORY }}
          # The last tag is the full commit SHA, which "Update task definition" references
          tags: |
            type=ref,event=branch
            type=sha,prefix={{branch}}-
            type=raw,value=latest,enable={{is_default_branch}}
            type=raw,value=${{ github.sha }}

      # The gha cache backend used below needs a Buildx builder
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v2

      - name: Build and push Docker image
        uses: docker/build-push-action@v4
        with:
          context: .
          push: true
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
          cache-from: type=gha
          cache-to: type=gha,mode=max

      - name: Download task definition
        run: |
          # Assumes the task definition family has the same name as the service
          aws ecs describe-task-definition \
            --task-definition ${{ env.ECS_SERVICE }} \
            --query taskDefinition > task-definition.json

      - name: Update task definition
        id: task-def
        uses: aws-actions/amazon-ecs-render-task-definition@v1
        with:
          task-definition: task-definition.json
          container-name: ${{ env.CONTAINER_NAME }}
          image: ${{ steps.login-ecr.outputs.registry }}/${{ env.ECR_REPOSITORY }}:${{ github.sha }}

      - name: Deploy to ECS
        uses: aws-actions/amazon-ecs-deploy-task-definition@v1
        with:
          task-definition: ${{ steps.task-def.outputs.task-definition }}
          service: ${{ env.ECS_SERVICE }}
          cluster: ${{ env.ECS_CLUSTER }}
          wait-for-service-stability: true
          wait-for-minutes: 10

      - name: Verify deployment
        run: |
          # Give the service time to settle
          sleep 30
          # Look up the ALB DNS name
          ALB_DNS=$(aws elbv2 describe-load-balancers \
            --names trading-bot-alb \
            --query 'LoadBalancers[0].DNSName' \
            --output text)
          # Health check: up to 10 attempts, 30 seconds apart
          for i in {1..10}; do
            if curl -f "http://$ALB_DNS/health"; then
              echo "✅ Deployment verified"
              exit 0
            fi
            echo "Waiting for the service to start... ($i/10)"
            sleep 30
          done
          echo "❌ Deployment verification failed"
          exit 1

  # Post-deployment tests
  post-deployment-tests:
    runs-on: ubuntu-latest
    needs: deploy
    steps:
      - name: Checkout code
        uses: actions/checkout@v3

      - name: Run integration tests
        env:
          # Test environment variables
          API_BASE_URL: https://api.trading.example.com
          TEST_API_KEY: ${{ secrets.TEST_API_KEY }}
        run: |
          # Assumption: requirements.txt covers what the integration tests import
          pip install -r requirements.txt pytest
          # Run the integration tests
          python -m pytest tests/integration/ -v

      - name: Performance test
        run: |
          # A simple response-time check
          echo "Running performance test..."
          curl -w "Response time: %{time_total}s\n" \
            -o /dev/null -s \
            "https://api.trading.example.com/health"

  # Notification
  notify:
    runs-on: ubuntu-latest
    needs: [deploy, post-deployment-tests]
    if: always()
    steps:
      - name: Notify Telegram on success
        if: ${{ needs.deploy.result == 'success' && needs.post-deployment-tests.result == 'success' }}
        run: |
          # $(date) is placed outside the single quotes so the shell expands it
          curl -X POST "https://api.telegram.org/bot${{ secrets.TELEGRAM_BOT_TOKEN }}/sendMessage" \
            -H "Content-Type: application/json" \
            -d '{
              "chat_id": "${{ secrets.TELEGRAM_CHAT_ID }}",
              "parse_mode": "Markdown",
              "text": "🎉 Production deployment succeeded!\n\n📦 Version: ${{ github.sha }}\n🕒 Time: '"$(date)"'\n🔗 [View details](https://github.com/${{ github.repository }}/actions/runs/${{ github.run_id }})"
            }'

      - name: Notify Telegram on failure
        if: ${{ needs.deploy.result == 'failure' || needs.post-deployment-tests.result == 'failure' }}
        run: |
          # $(date) is placed outside the single quotes so the shell expands it
          curl -X POST "https://api.telegram.org/bot${{ secrets.TELEGRAM_BOT_TOKEN }}/sendMessage" \
            -H "Content-Type: application/json" \
            -d '{
              "chat_id": "${{ secrets.TELEGRAM_CHAT_ID }}",
              "parse_mode": "Markdown",
              "text": "❌ Production deployment failed!\n\n📦 Version: ${{ github.sha }}\n🕒 Time: '"$(date)"'\n🔗 [View details](https://github.com/${{ github.repository }}/actions/runs/${{ github.run_id }})"
            }'
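
The change-detection rule in the pre-checks job can also be written in Python, which makes it easy to unit-test before it gates a production deploy. The sketch below is only an illustration and is not part of the workflow: `should_deploy` and `DEPLOY_PATTERN` are hypothetical names, and the pattern is anchored to the repository root, which is stricter than the unanchored `grep -E` used in the workflow.

```python
import re

# Hypothetical helper mirroring the workflow's
#   grep -E "(src/|Dockerfile|requirements.txt)"
# Assumption: only root-level src/, Dockerfile, and requirements.txt should
# trigger a deploy, so the pattern is anchored and the dot is escaped.
DEPLOY_PATTERN = re.compile(r"^(src/|Dockerfile$|requirements\.txt$)")


def should_deploy(changed_files):
    """Return True if any changed path affects the deployed image."""
    return any(DEPLOY_PATTERN.match(path) for path in changed_files)


if __name__ == "__main__":
    print(should_deploy(["src/bot.py", "README.md"]))  # True
    print(should_deploy(["docs/src/notes.md"]))        # False
```

With the unanchored pattern, a path such as `docs/src/notes.md` would also count as a deploy-relevant change; whether that is desirable depends on the repository layout.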
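# Configured in the GitHub repository under Settings > Environments.
# Shown as YAML for illustration only: these rules are set in the UI, not in a workflow file.
environment:
  name: production
  protection_rules:
    required_reviewers:
      - "senior-developer-team"
    wait_timer: 5  # wait 5 minutes before the job starts
  deployment_branch_policy:
    protected_branches: true
    custom_branch_policies: false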
# .github/workflows/blue-green-deploy.yml
name: Blue-Green Deployment

on:
  workflow_dispatch:
    inputs:
      environment:
        description: 'Target environment'
        required: true
        default: 'production'
        type: choice
        options:
          - production
          - staging

jobs:
  deploy:
    runs-on: ubuntu-latest
    # Use the selected input so that environment's protection rules apply
    environment: ${{ inputs.environment }}
    steps:
      - name: Checkout code
        uses: actions/checkout@v3

      # The aws CLI calls below need credentials; same secrets as the other workflows
      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v2
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: us-east-1

      - name: Determine current environment
        id: current-env
        run: |
          # Check whether the blue or the green environment is currently live
          CURRENT_COLOR=$(aws elbv2 describe-target-groups \
            --load-balancer-arn ${{ secrets.ALB_ARN }} \
            --query 'TargetGroups[?contains(TargetGroupName, `blue`)].TargetGroupName' \
            --output text)
          if [ -n "$CURRENT_COLOR" ]; then
            echo "current=blue" >> "$GITHUB_OUTPUT"
            echo "target=green" >> "$GITHUB_OUTPUT"
          else
            echo "current=green" >> "$GITHUB_OUTPUT"
            echo "target=blue" >> "$GITHUB_OUTPUT"
          fi

      - name: Deploy to target environment
        run: |
          TARGET_COLOR=${{ steps.current-env.outputs.target }}
          # Update the ECS service of the target environment.
          # A family name without a revision resolves to the latest ACTIVE revision;
          # the workflow run number is unrelated to task definition revisions.
          aws ecs update-service \
            --cluster trading-cluster \
            --service "trading-bot-$TARGET_COLOR" \
            --task-definition trading-bot-task
          # Wait for the service to become stable
          aws ecs wait services-stable \
            --cluster trading-cluster \
            --services "trading-bot-$TARGET_COLOR"

      - name: Run smoke tests
        run: |
          TARGET_COLOR=${{ steps.current-env.outputs.target }}
          # Run smoke tests against the new environment
          python scripts/smoke_tests.py \
            --target-group "trading-bot-$TARGET_COLOR" \
            --timeout 300

      - name: Switch traffic
        run: |
          TARGET_COLOR=${{ steps.current-env.outputs.target }}
          # Point the ALB listener at the new environment
          aws elbv2 modify-listener \
            --listener-arn ${{ secrets.ALB_LISTENER_ARN }} \
            --default-actions Type=forward,TargetGroupArn=${{ secrets.TARGET_GROUP_ARN_PREFIX }}$TARGET_COLOR
          echo "✅ Traffic switched to the $TARGET_COLOR environment"
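
The color selection in the "Determine current environment" step is a small piece of logic worth checking on its own. The following is a minimal sketch of the same rule in Python, assuming the input is the list of target group names attached to the load balancer; `plan_switch` is a hypothetical name, not something the workflow calls.

```python
def plan_switch(target_group_names):
    """Return (current, target) colors for a blue-green switch.

    Mirrors the workflow step: if any attached target group name contains
    "blue", blue is treated as live and green becomes the deploy target;
    otherwise green is treated as live.
    """
    blue_is_live = any("blue" in name for name in target_group_names)
    current = "blue" if blue_is_live else "green"
    target = "green" if blue_is_live else "blue"
    return current, target


if __name__ == "__main__":
    print(plan_switch(["trading-bot-blue"]))   # ('blue', 'green')
    print(plan_switch(["trading-bot-green"]))  # ('green', 'blue')
```

One limitation this makes visible: if both target groups are attached to the load balancer at the same time, the rule always reports blue as live. In that situation it may be safer to read the listener's current default action instead.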
# .github/workflows/auto-rollback.yml
name: Auto Rollback

on:
  workflow_run:
    workflows: ["Deploy to Production"]
    types:
      - completed

jobs:
  rollback:
    runs-on: ubuntu-latest
    if: ${{ github.event.workflow_run.conclusion == 'failure' }}
    steps:
      - name: Checkout code
        uses: actions/checkout@v3

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v2
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: us-east-1

      - name: Get previous stable version
        id: previous-version
        run: |
          # Read the task definition the service is running now
          CURRENT_TASK_DEF=$(aws ecs describe-services \
            --cluster trading-cluster \
            --services trading-bot-service \
            --query 'services[0].taskDefinition' \
            --output text)
          # Assumes the failed deploy already switched the service to a new revision
          # and that the revision before it was the last stable one
          CURRENT_REVISION=$(echo "$CURRENT_TASK_DEF" | grep -o '[0-9]*$')
          PREVIOUS_REVISION=$((CURRENT_REVISION - 1))
          if [ "$PREVIOUS_REVISION" -lt 1 ]; then
            echo "No earlier revision to roll back to"
            exit 1
          fi
          echo "previous-revision=$PREVIOUS_REVISION" >> "$GITHUB_OUTPUT"

      - name: Rollback to previous version
        run: |
          echo "🔄 Rolling back automatically to revision ${{ steps.previous-version.outputs.previous-revision }}"
          aws ecs update-service \
            --cluster trading-cluster \
            --service trading-bot-service \
            --task-definition trading-bot-task:${{ steps.previous-version.outputs.previous-revision }}
          # Wait for the rollback to finish
          aws ecs wait services-stable \
            --cluster trading-cluster \
            --services trading-bot-service

      - name: Verify rollback
        run: |
          # Verify that the rollback succeeded
          sleep 60
          ALB_DNS=$(aws elbv2 describe-load-balancers \
            --names trading-bot-alb \
            --query 'LoadBalancers[0].DNSName' \
            --output text)
          if curl -f "http://$ALB_DNS/health"; then
            echo "✅ Rollback verified"
          else
            echo "❌ Rollback verification failed"
            exit 1
          fi

      - name: Notify rollback
        run: |
          # $(date) is placed outside the single quotes so the shell expands it
          curl -X POST "https://api.telegram.org/bot${{ secrets.TELEGRAM_BOT_TOKEN }}/sendMessage" \
            -H "Content-Type: application/json" \
            -d '{
              "chat_id": "${{ secrets.TELEGRAM_CHAT_ID }}",
              "text": "🔄 Automatic rollback executed\n\n📦 Rolled back to revision: ${{ steps.previous-version.outputs.previous-revision }}\n🕒 Time: '"$(date)"'\n❗ Reason: deployment failed"
            }'
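
The rollback workflow derives the previous revision by taking the number at the end of the task definition ARN and subtracting one. A minimal Python sketch of that parsing step follows, with the edge cases made explicit. `previous_revision` is a hypothetical helper, and the account ID in the example ARN is a placeholder.

```python
def previous_revision(task_definition_arn):
    """Return (family, revision - 1) for an ECS task definition ARN or name.

    Accepts either a full ARN ending in "task-definition/<family>:<revision>"
    or the short "<family>:<revision>" form.
    """
    name = task_definition_arn.rsplit("/", 1)[-1]
    family, _, revision = name.rpartition(":")
    if not family or not revision.isdigit():
        raise ValueError(f"no revision found in {task_definition_arn!r}")
    previous = int(revision) - 1
    if previous < 1:
        raise ValueError("no earlier revision to roll back to")
    return family, previous


if __name__ == "__main__":
    # Placeholder account ID, for illustration only
    arn = "arn:aws:ecs:us-east-1:123456789012:task-definition/trading-bot-task:42"
    print(previous_revision(arn))  # ('trading-bot-task', 41)
```

Subtracting one only finds the last good version if the previous revision is still registered and was actually the one running before the failed deploy, so this remains a simplification.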
# scripts/deployment_monitor.py
import sys
import time
from datetime import datetime, timedelta

import boto3
import requests


class DeploymentMonitor:
    def __init__(self, cluster_name, service_name, alb_dns):
        self.ecs = boto3.client('ecs')
        self.cloudwatch = boto3.client('cloudwatch')
        self.cluster_name = cluster_name
        self.service_name = service_name
        self.alb_dns = alb_dns

    def check_service_health(self):
        """Check that the ECS service runs as many tasks as it wants."""
        try:
            response = self.ecs.describe_services(
                cluster=self.cluster_name,
                services=[self.service_name]
            )
            service = response['services'][0]
            running_count = service['runningCount']
            desired_count = service['desiredCount']
            return running_count == desired_count
        except Exception as e:
            print(f"Service health check failed: {e}")
            return False

    def check_endpoint_health(self):
        """Check that the application's /health endpoint returns HTTP 200."""
        try:
            response = requests.get(f"http://{self.alb_dns}/health", timeout=10)
            return response.status_code == 200
        except Exception as e:
            print(f"Endpoint health check failed: {e}")
            return False

    def monitor_deployment(self, timeout=600):
        """Poll both checks every 30 seconds until they pass or the timeout expires."""
        start_time = datetime.now()
        while datetime.now() - start_time < timedelta(seconds=timeout):
            service_healthy = self.check_service_health()
            endpoint_healthy = self.check_endpoint_health()
            if service_healthy and endpoint_healthy:
                print("✅ Deployment monitoring passed")
                return True
            print(f"⏳ Waiting for deployment... service healthy: {service_healthy}, endpoint healthy: {endpoint_healthy}")
            time.sleep(30)
        print("❌ Deployment monitoring timed out")
        return False


if __name__ == "__main__":
    # Usage: python scripts/deployment_monitor.py <cluster> <service> <alb-dns>
    monitor = DeploymentMonitor(
        cluster_name=sys.argv[1],
        service_name=sys.argv[2],
        alb_dns=sys.argv[3]
    )
    success = monitor.monitor_deployment()
    sys.exit(0 if success else 1)
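
The polling loop in `monitor_deployment` can be factored into a small reusable helper. The sketch below is one possible shape, not the script's actual code: `wait_until` is a hypothetical name, and it takes the clock and sleep functions as parameters so the loop can be tested without waiting in real time. It uses `time.monotonic`, which is not affected by wall-clock adjustments.

```python
import time


def wait_until(check, timeout=600, interval=30,
               clock=time.monotonic, sleep=time.sleep):
    """Call check() repeatedly until it returns True or the timeout expires.

    Returns True as soon as check() succeeds, and False once `timeout`
    seconds have passed without a success.
    """
    deadline = clock() + timeout
    while clock() < deadline:
        if check():
            return True
        sleep(interval)
    return False


if __name__ == "__main__":
    attempts = []

    def ready_on_third_try():
        attempts.append(1)
        return len(attempts) >= 3

    # Short interval so the demo finishes quickly
    print(wait_until(ready_on_third_try, timeout=5, interval=0.01))  # True
```

With a monitor instance, the same helper could be called as `wait_until(lambda: monitor.check_service_health() and monitor.check_endpoint_health())`.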
# Use caching to speed up builds
- name: Cache Docker layers
  uses: actions/cache@v3
  with:
    path: /tmp/.buildx-cache
    key: ${{ runner.os }}-buildx-${{ github.sha }}
    restore-keys: |
      ${{ runner.os }}-buildx-
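# Run several jobs in parallel
jobs:
  test:
    strategy:
      matrix:
        # Quote the versions: YAML reads an unquoted 3.10 as the number 3.1
        python-version: ['3.8', '3.9', '3.10']
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v4
        with:
          python-version: ${{ matrix.python-version }}
      - name: Test with Python ${{ matrix.python-version }}
        run: |
          python -m pip install pytest
          pytest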
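
A common pitfall with version matrices is that a YAML parser reads an unquoted `3.10` as a floating-point number, much as Python reads the literal below, so the trailing zero is lost and the job asks for Python 3.1. The short sketch illustrates the effect; `as_version_labels` is a hypothetical helper used only for this demonstration.

```python
def as_version_labels(values):
    """Render matrix values the way they end up as version strings."""
    return [str(value) for value in values]


if __name__ == "__main__":
    # Numbers: 3.10 and 3.1 are the same float, so the zero disappears
    print(as_version_labels([3.8, 3.9, 3.10]))        # ['3.8', '3.9', '3.1']
    # Strings: quoting keeps each version exactly as written
    print(as_version_labels(["3.8", "3.9", "3.10"]))  # ['3.8', '3.9', '3.10']
```

This is why version lists in a matrix are usually written as quoted strings.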
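Today we learned how to set up a GitHub Runner to deploy to production automatically, which is like equipping the farm with a fully automated management system. The key concepts were the staged deployment workflow (pre-checks, security scan, build and deploy, post-deployment tests, notification), environment protection rules, blue-green deployment, automatic rollback, deployment monitoring, and faster runs through caching and parallel matrix jobs.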
Tomorrow we will look at GitHub Actions CI/CD configuration in more detail, including more complex workflow designs.
Next: Day 13 - GitHub Actions CI/CD