在過去的 Day 24-28 中,我們已經在 AWS 上為 Kyo System 建構了完整的基礎設施(VPC、ECS Fargate、RDS、ElastiCache Redis 等)。
今天是倒數第二天,我們將完成最後的拼圖:完整的 CI/CD Pipeline,讓 Kyo System 達到自動化部署、零停機更新的生產就緒狀態。整體部署架構如下:
┌─────────────────────────────────────────────────────────────────┐
│ GitHub Actions │
│ CI/CD Pipeline (Build, Test, Deploy) │
└────────┬─────────────────────────────────────────────────┬──────┘
│ │
│ Push Images │ Deploy
▼ ▼
┌─────────────────┐ ┌─────────────────┐
│ ECR │ │ AWS CDK │
│ - Docker │ │ - Synth │
│ Images │ │ - Deploy │
└────────┬────────┘ └────────┬────────┘
│ │
│ Pull Images │ Create/Update
▼ ▼
┌─────────────────────────────────────────────────────────────────┐
│ AWS Services │
├─────────────────┬──────────────────┬──────────────────┬─────────┤
│ ECS Fargate │ Application │ RDS │ Redis │
│ - Services │ Load Balancer │ - PostgreSQL │ Cache │
│ - Tasks │ - Target Grps │ - Multi-AZ │ │
└─────────────────┴──────────────────┴──────────────────┴─────────┘
│ │ │ │
└────────────────────┴──────────────────┴───────────┘
│
▼
┌───────────────────────┐
│ CloudWatch │
│ - Logs │
│ - Metrics │
│ - Alarms │
└───────────────────────┘
建立多環境部署的 GitHub Actions workflow:
# .github/workflows/deploy.yml
name: CI/CD Pipeline
on:
push:
branches:
- main # Production
- develop # Staging
pull_request:
branches:
- main
- develop
env:
AWS_REGION: ap-northeast-1
ECR_REPOSITORY: kyo-otp-service
NODE_VERSION: '20.x'
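# 補充(假設性設定):同一個 ref 同時只允許一條 pipeline 進行部署,避免部署互相覆蓋
concurrency:
  group: deploy-${{ github.ref }}
  cancel-in-progress: false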
jobs:
# Job 1: Code Quality & Testing
test:
name: Test & Lint
runs-on: ubuntu-latest
timeout-minutes: 15
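    # 補充:整合測試需要資料庫與快取,這裡以 service containers 提供,
    # 連線資訊對應下方 Integration tests 步驟的 DATABASE_URL / REDIS_URL
    services:
      postgres:
        image: postgres:16-alpine
        env:
          POSTGRES_USER: test
          POSTGRES_PASSWORD: test
          POSTGRES_DB: test
        ports:
          - 5432:5432
        options: >-
          --health-cmd pg_isready
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
      redis:
        image: redis:7-alpine
        ports:
          - 6379:6379
        options: >-
          --health-cmd "redis-cli ping"
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5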
steps:
- name: Checkout code
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Setup Node.js
uses: actions/setup-node@v4
with:
node-version: ${{ env.NODE_VERSION }}
- name: Setup pnpm
uses: pnpm/action-setup@v2
with:
version: 8
- name: Get pnpm store directory
id: pnpm-cache
shell: bash
run: |
echo "STORE_PATH=$(pnpm store path)" >> $GITHUB_OUTPUT
- name: Setup pnpm cache
uses: actions/cache@v3
with:
path: ${{ steps.pnpm-cache.outputs.STORE_PATH }}
key: ${{ runner.os }}-pnpm-store-${{ hashFiles('**/pnpm-lock.yaml') }}
restore-keys: |
${{ runner.os }}-pnpm-store-
- name: Install dependencies
run: pnpm install --frozen-lockfile
- name: Lint
run: pnpm run lint
- name: Type check
run: pnpm run type-check
- name: Unit tests
run: pnpm run test:unit
- name: Integration tests
run: pnpm run test:integration
env:
DATABASE_URL: postgresql://test:test@localhost:5432/test
REDIS_URL: redis://localhost:6379
- name: Upload coverage
uses: codecov/codecov-action@v3
with:
files: ./coverage/coverage-final.json
flags: unittests
name: codecov-umbrella
# Job 2: Security Scan
security:
name: Security Scan
runs-on: ubuntu-latest
timeout-minutes: 10
steps:
- name: Checkout code
uses: actions/checkout@v4
      - name: Setup pnpm
        uses: pnpm/action-setup@v2
        with:
          version: 8
      - name: Run dependency audit
        # 專案使用 pnpm lockfile,npm audit 需要 package-lock.json,因此改用 pnpm audit
        run: pnpm audit --prod --audit-level high
        continue-on-error: true
- name: Run Trivy vulnerability scanner
uses: aquasecurity/trivy-action@master
with:
scan-type: 'fs'
scan-ref: '.'
format: 'sarif'
output: 'trivy-results.sarif'
- name: Upload Trivy results to GitHub Security
uses: github/codeql-action/upload-sarif@v2
with:
sarif_file: 'trivy-results.sarif'
# Job 3: Build Docker Images
build:
name: Build & Push Docker Images
runs-on: ubuntu-latest
needs: [test, security]
if: github.event_name == 'push'
timeout-minutes: 20
    outputs:
      image-tag: ${{ steps.meta.outputs.tags }}
      image-digest: ${{ steps.build.outputs.digest }}
      # rollback 用:本次建置前 latest 所指向的映像標籤(由 Record previous image tag 步驟產生)
      previous-image-tag: ${{ steps.previous.outputs.tag }}
steps:
- name: Checkout code
uses: actions/checkout@v4
- name: Configure AWS credentials
uses: aws-actions/configure-aws-credentials@v4
with:
aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
aws-region: ${{ env.AWS_REGION }}
- name: Login to Amazon ECR
id: login-ecr
uses: aws-actions/amazon-ecr-login@v2
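      # 補充:下方 Task Definition 指定 ARM64(Fargate Graviton),
      # 在 x86 runner 上交叉建置 linux/arm64 映像需要 QEMU
      - name: Set up QEMU
        uses: docker/setup-qemu-action@v3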
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
- name: Extract metadata
id: meta
uses: docker/metadata-action@v5
with:
images: ${{ steps.login-ecr.outputs.registry }}/${{ env.ECR_REPOSITORY }}
tags: |
type=ref,event=branch
type=sha,prefix={{branch}}-
type=semver,pattern={{version}}
type=raw,value=latest,enable={{is_default_branch}}
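      # 補充(假設性步驟):推送新映像前,先記下目前 latest 指向的舊映像標籤,
      # 供 production 部署失敗時 rollback 使用;標籤挑選邏輯僅為示意
      - name: Record previous image tag
        id: previous
        continue-on-error: true
        run: |
          PREV=$(aws ecr describe-images \
            --repository-name "$ECR_REPOSITORY" \
            --image-ids imageTag=latest \
            --query 'imageDetails[0].imageTags' \
            --output json 2>/dev/null \
            | jq -r 'map(select(. != "latest"))[0] // "latest"') || true
          echo "tag=${PREV:-latest}" >> "$GITHUB_OUTPUT"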
- name: Build and push Docker image
id: build
uses: docker/build-push-action@v5
with:
          context: .
          file: ./apps/kyo-otp-service/Dockerfile
          # Task Definition 使用 ARM64,映像需建置為 linux/arm64
          platforms: linux/arm64
          push: true
tags: ${{ steps.meta.outputs.tags }}
labels: ${{ steps.meta.outputs.labels }}
cache-from: type=gha
cache-to: type=gha,mode=max
build-args: |
NODE_ENV=production
BUILD_DATE=${{ github.event.head_commit.timestamp }}
VCS_REF=${{ github.sha }}
- name: Scan image with Trivy
uses: aquasecurity/trivy-action@master
with:
          # 掃描 metadata-action 的主要 tag;github.sha 不在上面產生的 tag 清單中
          image-ref: ${{ steps.login-ecr.outputs.registry }}/${{ env.ECR_REPOSITORY }}:${{ steps.meta.outputs.version }}
format: 'sarif'
output: 'trivy-image-results.sarif'
- name: Upload Trivy scan results
uses: github/codeql-action/upload-sarif@v2
with:
sarif_file: 'trivy-image-results.sarif'
# Job 4: Database Migration
migrate:
name: Run Database Migrations
runs-on: ubuntu-latest
needs: [build]
if: github.event_name == 'push'
timeout-minutes: 10
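    # 注意:GitHub-hosted runner 需能連線到 RDS(例如改用 VPC 內的自架 runner、
    # bastion,或限定來源的臨時 Security Group 規則),否則此 Job 會逾時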
steps:
- name: Checkout code
uses: actions/checkout@v4
- name: Configure AWS credentials
uses: aws-actions/configure-aws-credentials@v4
with:
aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
aws-region: ${{ env.AWS_REGION }}
- name: Setup Node.js
uses: actions/setup-node@v4
with:
node-version: ${{ env.NODE_VERSION }}
- name: Setup pnpm
uses: pnpm/action-setup@v2
with:
version: 8
- name: Install dependencies
run: pnpm install --frozen-lockfile
- name: Get database credentials from Secrets Manager
id: db-secret
run: |
SECRET=$(aws secretsmanager get-secret-value \
--secret-id kyo-system/database/credentials \
--query SecretString --output text)
echo "::add-mask::$(echo $SECRET | jq -r .password)"
echo "DATABASE_URL=postgresql://$(echo $SECRET | jq -r .username):$(echo $SECRET | jq -r .password)@$(echo $SECRET | jq -r .host):$(echo $SECRET | jq -r .port)/$(echo $SECRET | jq -r .dbname)" >> $GITHUB_ENV
- name: Run migrations
run: pnpm --filter kyo-otp-service run migrate:up
env:
DATABASE_URL: ${{ env.DATABASE_URL }}
- name: Verify migration
run: pnpm --filter kyo-otp-service run migrate:status
env:
DATABASE_URL: ${{ env.DATABASE_URL }}
# Job 5: Deploy to Staging
deploy-staging:
name: Deploy to Staging
runs-on: ubuntu-latest
needs: [build, migrate]
if: github.ref == 'refs/heads/develop' && github.event_name == 'push'
environment:
name: staging
url: https://staging.kyo-system.com
timeout-minutes: 15
steps:
- name: Checkout code
uses: actions/checkout@v4
- name: Configure AWS credentials
uses: aws-actions/configure-aws-credentials@v4
with:
aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
aws-region: ${{ env.AWS_REGION }}
- name: Setup Node.js
uses: actions/setup-node@v4
with:
node-version: ${{ env.NODE_VERSION }}
- name: Install CDK
run: npm install -g aws-cdk
- name: Setup pnpm
uses: pnpm/action-setup@v2
with:
version: 8
- name: Install dependencies
run: pnpm install --frozen-lockfile
- name: Build CDK
run: pnpm --filter kyo-infrastructure run build
- name: CDK Synth
run: pnpm --filter kyo-infrastructure run cdk synth
env:
ENVIRONMENT: staging
IMAGE_TAG: ${{ github.sha }}
- name: CDK Deploy
run: pnpm --filter kyo-infrastructure run cdk deploy --all --require-approval never
env:
ENVIRONMENT: staging
IMAGE_TAG: ${{ github.sha }}
- name: Wait for deployment to stabilize
run: |
aws ecs wait services-stable \
--cluster kyo-staging-cluster \
--services kyo-otp-service
- name: Run smoke tests
run: pnpm run test:smoke
env:
API_URL: https://api.staging.kyo-system.com
# Job 6: Deploy to Production
deploy-production:
name: Deploy to Production
runs-on: ubuntu-latest
needs: [build, migrate]
if: github.ref == 'refs/heads/main' && github.event_name == 'push'
environment:
name: production
url: https://kyo-system.com
timeout-minutes: 20
steps:
- name: Checkout code
uses: actions/checkout@v4
- name: Configure AWS credentials
uses: aws-actions/configure-aws-credentials@v4
with:
aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
aws-region: ${{ env.AWS_REGION }}
- name: Setup Node.js
uses: actions/setup-node@v4
with:
node-version: ${{ env.NODE_VERSION }}
- name: Install CDK
run: npm install -g aws-cdk
- name: Setup pnpm
uses: pnpm/action-setup@v2
with:
version: 8
- name: Install dependencies
run: pnpm install --frozen-lockfile
- name: Build CDK
run: pnpm --filter kyo-infrastructure run build
- name: CDK Synth
run: pnpm --filter kyo-infrastructure run cdk synth
env:
ENVIRONMENT: production
IMAGE_TAG: ${{ github.sha }}
- name: CDK Deploy (Blue-Green)
run: pnpm --filter kyo-infrastructure run cdk deploy --all --require-approval never
env:
ENVIRONMENT: production
IMAGE_TAG: ${{ github.sha }}
DEPLOYMENT_TYPE: blue-green
- name: Wait for deployment to stabilize
run: |
aws ecs wait services-stable \
--cluster kyo-production-cluster \
--services kyo-otp-service
- name: Run smoke tests
id: smoke-tests
run: pnpm run test:smoke
env:
API_URL: https://api.kyo-system.com
continue-on-error: true
- name: Rollback on failure
if: steps.smoke-tests.outcome == 'failure'
run: |
echo "Smoke tests failed, initiating rollback..."
pnpm --filter kyo-infrastructure run cdk deploy --all --require-approval never
env:
ENVIRONMENT: production
IMAGE_TAG: ${{ needs.build.outputs.previous-image-tag }}
      - name: Notify deployment
        uses: 8398a7/action-slack@v3
        with:
          status: ${{ job.status }}
          text: 'Production deployment ${{ job.status }}'
        env:
          # action-slack 透過 SLACK_WEBHOOK_URL 環境變數讀取 webhook,而非 with 參數
          SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK }}
        if: always()
# Job 7: Performance Test
performance:
name: Performance Tests
runs-on: ubuntu-latest
needs: [deploy-staging]
if: github.ref == 'refs/heads/develop' && github.event_name == 'push'
timeout-minutes: 10
steps:
- name: Checkout code
uses: actions/checkout@v4
- name: Setup k6
run: |
sudo gpg -k
sudo gpg --no-default-keyring --keyring /usr/share/keyrings/k6-archive-keyring.gpg --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys C5AD17C747E3415A3642D57D77C6C491D6AC1D69
echo "deb [signed-by=/usr/share/keyrings/k6-archive-keyring.gpg] https://dl.k6.io/deb stable main" | sudo tee /etc/apt/sources.list.d/k6.list
sudo apt-get update
sudo apt-get install k6
- name: Run load tests
        # 以 --out json 輸出結果,供下一步 upload-artifact 上傳
        run: k6 run --out json=k6-results.json test/load/otp-load-test.js
env:
BASE_URL: https://api.staging.kyo-system.com
JWT_TOKEN: ${{ secrets.STAGING_JWT_TOKEN }}
- name: Upload results
uses: actions/upload-artifact@v3
with:
name: k6-results
path: k6-results.json
接著,建立優化的多階段 Docker 映像建置流程:
# apps/kyo-otp-service/Dockerfile
# Stage 1: Base
FROM node:20-alpine AS base
# Install pnpm
RUN corepack enable && corepack prepare pnpm@8.15.0 --activate
# Set working directory
WORKDIR /app
# Copy workspace configuration
COPY pnpm-workspace.yaml package.json pnpm-lock.yaml ./
# Stage 2: Dependencies
FROM base AS dependencies
# Copy package.json files for all workspaces
COPY packages/kyo-core/package.json ./packages/kyo-core/
COPY packages/kyo-types/package.json ./packages/kyo-types/
COPY packages/kyo-config/package.json ./packages/kyo-config/
COPY apps/kyo-otp-service/package.json ./apps/kyo-otp-service/
# Install dependencies (including devDependencies for build)
RUN pnpm install --frozen-lockfile
# Stage 3: Build
FROM dependencies AS build
# Copy source code
COPY packages/kyo-core ./packages/kyo-core
COPY packages/kyo-types ./packages/kyo-types
COPY packages/kyo-config ./packages/kyo-config
COPY apps/kyo-otp-service ./apps/kyo-otp-service
# Copy shared config files
COPY tsconfig.json ./
COPY turbo.json ./
# Build packages
RUN pnpm --filter @kyong/kyo-types run build
RUN pnpm --filter @kyong/kyo-config run build
RUN pnpm --filter @kyong/kyo-core run build
RUN pnpm --filter kyo-otp-service run build
# Stage 4: Production Dependencies
FROM base AS prod-dependencies
# Copy package files
COPY packages/kyo-core/package.json ./packages/kyo-core/
COPY packages/kyo-types/package.json ./packages/kyo-types/
COPY packages/kyo-config/package.json ./packages/kyo-config/
COPY apps/kyo-otp-service/package.json ./apps/kyo-otp-service/
# Install production dependencies only
RUN pnpm install --prod --frozen-lockfile
# Stage 5: Runtime
FROM node:20-alpine AS runtime
# Install dumb-init for proper signal handling
RUN apk add --no-cache dumb-init
# Create non-root user
RUN addgroup -g 1001 -S nodejs && \
adduser -S nodejs -u 1001
# Set working directory
WORKDIR /app
# Copy production dependencies
COPY --from=prod-dependencies --chown=nodejs:nodejs /app/node_modules ./node_modules
COPY --from=prod-dependencies --chown=nodejs:nodejs /app/packages ./packages
COPY --from=prod-dependencies --chown=nodejs:nodejs /app/apps/kyo-otp-service/node_modules ./apps/kyo-otp-service/node_modules
# Copy built application
COPY --from=build --chown=nodejs:nodejs /app/packages/kyo-core/dist ./packages/kyo-core/dist
COPY --from=build --chown=nodejs:nodejs /app/packages/kyo-types/dist ./packages/kyo-types/dist
COPY --from=build --chown=nodejs:nodejs /app/packages/kyo-config/dist ./packages/kyo-config/dist
COPY --from=build --chown=nodejs:nodejs /app/apps/kyo-otp-service/dist ./apps/kyo-otp-service/dist
# Copy package.json files
COPY --chown=nodejs:nodejs packages/kyo-core/package.json ./packages/kyo-core/
COPY --chown=nodejs:nodejs packages/kyo-types/package.json ./packages/kyo-types/
COPY --chown=nodejs:nodejs packages/kyo-config/package.json ./packages/kyo-config/
COPY --chown=nodejs:nodejs apps/kyo-otp-service/package.json ./apps/kyo-otp-service/
# Set environment
ENV NODE_ENV=production \
PORT=3000
# Switch to non-root user
USER nodejs
# Expose port
EXPOSE 3000
# Health check
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
CMD node -e "require('http').get('http://localhost:3000/health', (r) => {process.exit(r.statusCode === 200 ? 0 : 1)})"
# Use dumb-init to handle signals properly
ENTRYPOINT ["dumb-init", "--"]
# Start application
CMD ["node", "apps/kyo-otp-service/dist/index.js"]
接著以 AWS CDK 搭配 CodeDeploy,實作零停機的藍綠部署:
// infrastructure/lib/ecs-service-stack.ts
import * as cdk from 'aws-cdk-lib';
import * as ecs from 'aws-cdk-lib/aws-ecs';
import * as ec2 from 'aws-cdk-lib/aws-ec2';
import * as elbv2 from 'aws-cdk-lib/aws-elasticloadbalancingv2';
import * as logs from 'aws-cdk-lib/aws-logs';
import * as iam from 'aws-cdk-lib/aws-iam';
import * as codedeploy from 'aws-cdk-lib/aws-codedeploy';
import { Construct } from 'constructs';
export interface EcsServiceStackProps extends cdk.StackProps {
vpc: ec2.IVpc;
cluster: ecs.ICluster;
imageTag: string;
environment: 'staging' | 'production';
dbSecretArn: string;
redisEndpoint: string;
}
export class EcsServiceStack extends cdk.Stack {
public readonly service: ecs.FargateService;
constructor(scope: Construct, id: string, props: EcsServiceStackProps) {
super(scope, id, props);
const { vpc, cluster, imageTag, environment, dbSecretArn, redisEndpoint } = props;
// Task Definition
const taskDefinition = new ecs.FargateTaskDefinition(this, 'TaskDef', {
memoryLimitMiB: 1024,
cpu: 512,
runtimePlatform: {
cpuArchitecture: ecs.CpuArchitecture.ARM64,
operatingSystemFamily: ecs.OperatingSystemFamily.LINUX,
},
});
// Grant access to Secrets Manager
const dbSecret = cdk.aws_secretsmanager.Secret.fromSecretCompleteArn(
this,
'DBSecret',
dbSecretArn
);
dbSecret.grantRead(taskDefinition.taskRole);
// CloudWatch Log Group
const logGroup = new logs.LogGroup(this, 'ServiceLogGroup', {
logGroupName: `/ecs/kyo-otp-service-${environment}`,
retention: logs.RetentionDays.ONE_MONTH,
removalPolicy: cdk.RemovalPolicy.DESTROY,
});
// Container
const container = taskDefinition.addContainer('app', {
      // 改用 fromEcrRepository,CDK 會自動授予 execution role 拉取 ECR 映像的權限
      // (fromRegistry 搭配私有 ECR 需另外授權,容易在拉取映像時失敗)
      image: ecs.ContainerImage.fromEcrRepository(
        cdk.aws_ecr.Repository.fromRepositoryName(this, 'EcrRepo', 'kyo-otp-service'),
        imageTag
      ),
logging: ecs.LogDrivers.awsLogs({
streamPrefix: 'ecs',
logGroup,
}),
environment: {
NODE_ENV: 'production',
ENVIRONMENT: environment,
REDIS_URL: `redis://${redisEndpoint}:6379`,
PORT: '3000',
},
secrets: {
DATABASE_URL: ecs.Secret.fromSecretsManager(dbSecret, 'connectionString'),
JWT_SECRET: ecs.Secret.fromSecretsManager(dbSecret, 'jwtSecret'),
},
healthCheck: {
command: [
'CMD-SHELL',
'node -e "require(\'http\').get(\'http://localhost:3000/health\', (r) => {process.exit(r.statusCode === 200 ? 0 : 1)})"',
],
interval: cdk.Duration.seconds(30),
timeout: cdk.Duration.seconds(5),
retries: 3,
startPeriod: cdk.Duration.seconds(60),
},
});
container.addPortMappings({
containerPort: 3000,
protocol: ecs.Protocol.TCP,
});
// Application Load Balancer
const alb = new elbv2.ApplicationLoadBalancer(this, 'ALB', {
vpc,
internetFacing: true,
deletionProtection: environment === 'production',
});
// Target Group - Blue
const blueTargetGroup = new elbv2.ApplicationTargetGroup(this, 'BlueTargetGroup', {
vpc,
port: 3000,
protocol: elbv2.ApplicationProtocol.HTTP,
targetType: elbv2.TargetType.IP,
healthCheck: {
path: '/health',
interval: cdk.Duration.seconds(30),
timeout: cdk.Duration.seconds(5),
healthyThresholdCount: 2,
unhealthyThresholdCount: 3,
},
deregistrationDelay: cdk.Duration.seconds(30),
});
// Target Group - Green
const greenTargetGroup = new elbv2.ApplicationTargetGroup(this, 'GreenTargetGroup', {
vpc,
port: 3000,
protocol: elbv2.ApplicationProtocol.HTTP,
targetType: elbv2.TargetType.IP,
healthCheck: {
path: '/health',
interval: cdk.Duration.seconds(30),
timeout: cdk.Duration.seconds(5),
healthyThresholdCount: 2,
unhealthyThresholdCount: 3,
},
deregistrationDelay: cdk.Duration.seconds(30),
});
// Listener - Production Traffic
const prodListener = alb.addListener('ProdListener', {
port: 80,
protocol: elbv2.ApplicationProtocol.HTTP,
defaultTargetGroups: [blueTargetGroup],
});
// Listener - Test Traffic (for validation)
const testListener = alb.addListener('TestListener', {
port: 8080,
protocol: elbv2.ApplicationProtocol.HTTP,
defaultTargetGroups: [greenTargetGroup],
});
// ECS Service
this.service = new ecs.FargateService(this, 'Service', {
cluster,
taskDefinition,
desiredCount: environment === 'production' ? 2 : 1,
minHealthyPercent: 100,
maxHealthyPercent: 200,
      deploymentController: {
        type: ecs.DeploymentControllerType.CODE_DEPLOY,
      },
      // 注意:deployment circuit breaker 僅支援 ECS 原生部署控制器,
      // 與 CODE_DEPLOY 併用會在 synth 時報錯;回滾交由下方 CodeDeploy 的 autoRollback 處理
enableExecuteCommand: true, // For debugging
});
// Attach to blue target group initially
this.service.attachToApplicationTargetGroup(blueTargetGroup);
// Auto Scaling
const scaling = this.service.autoScaleTaskCount({
minCapacity: environment === 'production' ? 2 : 1,
maxCapacity: environment === 'production' ? 10 : 3,
});
scaling.scaleOnCpuUtilization('CpuScaling', {
targetUtilizationPercent: 70,
scaleInCooldown: cdk.Duration.seconds(60),
scaleOutCooldown: cdk.Duration.seconds(60),
});
scaling.scaleOnMemoryUtilization('MemoryScaling', {
targetUtilizationPercent: 80,
scaleInCooldown: cdk.Duration.seconds(60),
scaleOutCooldown: cdk.Duration.seconds(60),
});
// CodeDeploy Application
const application = new codedeploy.EcsApplication(this, 'Application');
// CodeDeploy Deployment Group
const deploymentGroup = new codedeploy.EcsDeploymentGroup(this, 'DeploymentGroup', {
application,
service: this.service,
blueGreenDeploymentConfig: {
blueTargetGroup,
greenTargetGroup,
listener: prodListener,
testListener,
terminationWaitTime: cdk.Duration.minutes(5),
},
deploymentConfig: environment === 'production'
? codedeploy.EcsDeploymentConfig.CANARY_10PERCENT_5MINUTES
: codedeploy.EcsDeploymentConfig.ALL_AT_ONCE,
autoRollback: {
failedDeployment: true,
stoppedDeployment: true,
deploymentInAlarm: true,
},
});
    // CloudWatch Alarm for Auto Rollback:目標回應時間(TargetResponseTime 的單位是「秒」)
    const errorAlarm = new cdk.aws_cloudwatch.Alarm(this, 'ErrorAlarm', {
      metric: alb.metricTargetResponseTime(),
      threshold: 1, // 1 second
      evaluationPeriods: 2,
      comparisonOperator: cdk.aws_cloudwatch.ComparisonOperator.GREATER_THAN_THRESHOLD,
    });
deploymentGroup.addAlarm(errorAlarm);
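    // 補充(假設性範例):對 target 5XX 次數也設定告警,一併納入自動回滾條件
    const http5xxAlarm = new cdk.aws_cloudwatch.Alarm(this, 'Http5xxAlarm', {
      metric: alb.metricHttpCodeTarget(elbv2.HttpCodeTarget.TARGET_5XX_COUNT, {
        period: cdk.Duration.minutes(1),
        statistic: 'Sum',
      }),
      threshold: 10, // 每分鐘超過 10 次 5XX 即告警(門檻值為示意)
      evaluationPeriods: 2,
      comparisonOperator: cdk.aws_cloudwatch.ComparisonOperator.GREATER_THAN_THRESHOLD,
    });
    deploymentGroup.addAlarm(http5xxAlarm);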
// Outputs
new cdk.CfnOutput(this, 'LoadBalancerDNS', {
value: alb.loadBalancerDnsName,
description: 'Load Balancer DNS Name',
});
new cdk.CfnOutput(this, 'ServiceName', {
value: this.service.serviceName,
description: 'ECS Service Name',
});
new cdk.CfnOutput(this, 'DeploymentGroupName', {
value: deploymentGroup.deploymentGroupName,
description: 'CodeDeploy Deployment Group Name',
});
}
}
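workflow 中的 CDK Synth / CDK Deploy 步驟是以 ENVIRONMENT 與 IMAGE_TAG 環境變數,把部署環境與映像標籤傳進 CDK。下面是一個假設性的 app 進入點草稿,示意這兩個變數如何對應到 EcsServiceStackProps;其中 NetworkStack、DataStack 與它們的屬性都只是示意名稱,實際請對應 Day 24-28 建立的 Stacks:

```typescript
// infrastructure/bin/app.ts(假設性草稿)
import * as cdk from 'aws-cdk-lib';
import { NetworkStack } from '../lib/network-stack'; // 假設:提供 vpc 與 cluster
import { DataStack } from '../lib/data-stack';       // 假設:提供 dbSecretArn 與 redisEndpoint
import { EcsServiceStack } from '../lib/ecs-service-stack';

const app = new cdk.App();

// CI/CD workflow 以環境變數傳入部署環境與映像標籤
const environment = (process.env.ENVIRONMENT ?? 'staging') as 'staging' | 'production';
const imageTag = process.env.IMAGE_TAG ?? 'latest';

const env = {
  account: process.env.CDK_DEFAULT_ACCOUNT,
  region: process.env.CDK_DEFAULT_REGION,
};

const network = new NetworkStack(app, `KyoNetwork-${environment}`, { env, environment });
const data = new DataStack(app, `KyoData-${environment}`, { env, environment, vpc: network.vpc });

new EcsServiceStack(app, `KyoEcsService-${environment}`, {
  env,
  environment,
  imageTag,
  vpc: network.vpc,
  cluster: network.cluster,
  dbSecretArn: data.dbSecretArn,
  redisEndpoint: data.redisEndpoint,
});
```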
實作自動化資料庫 Migration:
// apps/kyo-otp-service/src/db/migrate.ts
import { readdir, readFile } from 'fs/promises';
import { join } from 'path';
// 直接具名匯入 Pool(同時提供 value 與 type;const { Pool } = pg 的解構寫法無法當成型別使用)
import { Pool } from 'pg';
interface Migration {
version: number;
name: string;
sql: string;
}
class MigrationRunner {
private pool: Pool;
private migrationsDir: string;
constructor(databaseUrl: string, migrationsDir: string) {
this.pool = new Pool({
connectionString: databaseUrl,
ssl: process.env.NODE_ENV === 'production' ? { rejectUnauthorized: false } : undefined,
});
this.migrationsDir = migrationsDir;
}
/**
* Initialize migrations table
*/
async init(): Promise<void> {
await this.pool.query(`
CREATE TABLE IF NOT EXISTS migrations (
id SERIAL PRIMARY KEY,
version INTEGER UNIQUE NOT NULL,
name VARCHAR(255) NOT NULL,
executed_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
)
`);
console.log('[Migration] Initialized migrations table');
}
/**
* Get applied migrations
*/
async getAppliedMigrations(): Promise<number[]> {
const result = await this.pool.query<{ version: number }>(
'SELECT version FROM migrations ORDER BY version ASC'
);
return result.rows.map((row) => row.version);
}
/**
* Get pending migrations
*/
async getPendingMigrations(): Promise<Migration[]> {
const appliedVersions = await this.getAppliedMigrations();
const allMigrations = await this.loadMigrations();
return allMigrations.filter((m) => !appliedVersions.includes(m.version));
}
/**
* Load all migration files
*/
private async loadMigrations(): Promise<Migration[]> {
const files = await readdir(this.migrationsDir);
const migrations: Migration[] = [];
for (const file of files) {
if (!file.endsWith('.sql')) continue;
// Parse filename: 001_create_users_table.sql
const match = file.match(/^(\d+)_(.+)\.sql$/);
if (!match) continue;
const version = parseInt(match[1], 10);
const name = match[2];
const sql = await readFile(join(this.migrationsDir, file), 'utf-8');
migrations.push({ version, name, sql });
}
return migrations.sort((a, b) => a.version - b.version);
}
/**
* Run migration
*/
private async runMigration(migration: Migration): Promise<void> {
const client = await this.pool.connect();
try {
await client.query('BEGIN');
console.log(`[Migration] Running: ${migration.version}_${migration.name}`);
// Execute migration SQL
await client.query(migration.sql);
// Record migration
await client.query(
'INSERT INTO migrations (version, name) VALUES ($1, $2)',
[migration.version, migration.name]
);
await client.query('COMMIT');
console.log(`[Migration] Completed: ${migration.version}_${migration.name}`);
} catch (error) {
await client.query('ROLLBACK');
console.error(`[Migration] Failed: ${migration.version}_${migration.name}`, error);
throw error;
} finally {
client.release();
}
}
/**
* Run all pending migrations
*/
async up(): Promise<void> {
await this.init();
const pending = await this.getPendingMigrations();
if (pending.length === 0) {
console.log('[Migration] No pending migrations');
return;
}
console.log(`[Migration] Found ${pending.length} pending migrations`);
for (const migration of pending) {
await this.runMigration(migration);
}
console.log('[Migration] All migrations completed');
}
/**
* Show migration status
*/
async status(): Promise<void> {
await this.init();
const applied = await this.getAppliedMigrations();
const all = await this.loadMigrations();
const pending = all.filter((m) => !applied.includes(m.version));
console.log('\n=== Migration Status ===\n');
console.log(`Applied: ${applied.length}`);
console.log(`Pending: ${pending.length}`);
console.log(`Total: ${all.length}\n`);
if (applied.length > 0) {
console.log('Applied Migrations:');
for (const version of applied) {
const migration = all.find((m) => m.version === version);
console.log(` ✓ ${version}_${migration?.name || 'unknown'}`);
}
}
if (pending.length > 0) {
console.log('\nPending Migrations:');
for (const migration of pending) {
console.log(` ○ ${migration.version}_${migration.name}`);
}
}
console.log('');
}
/**
* Close database connection
*/
async close(): Promise<void> {
await this.pool.end();
}
}
// CLI
const command = process.argv[2];
const databaseUrl = process.env.DATABASE_URL;
const migrationsDir = join(__dirname, 'migrations');
if (!databaseUrl) {
console.error('DATABASE_URL environment variable is required');
process.exit(1);
}
const runner = new MigrationRunner(databaseUrl, migrationsDir);
(async () => {
try {
switch (command) {
case 'up':
await runner.up();
break;
case 'status':
await runner.status();
break;
default:
console.log('Usage: node migrate.js [up|status]');
process.exit(1);
}
} catch (error) {
console.error('[Migration] Error:', error);
process.exit(1);
} finally {
await runner.close();
}
})();
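補充一點:若同一個環境有多條 pipeline 同時執行,上面的 runner 可能會並行跑 migration。常見做法是用 PostgreSQL advisory lock 將 migration 序列化,以下是假設性的補充草稿(lock key 為任意固定整數,所有執行者需一致):

```typescript
// 假設性補充:以 advisory lock 確保同一時間只有一個執行者在跑 migration
import { Pool } from 'pg';

const MIGRATION_LOCK_KEY = 727450; // 任意固定整數

export async function withMigrationLock<T>(pool: Pool, fn: () => Promise<T>): Promise<T> {
  const client = await pool.connect();
  try {
    // 若另一個執行者持有鎖,這裡會阻塞等待
    await client.query('SELECT pg_advisory_lock($1)', [MIGRATION_LOCK_KEY]);
    return await fn();
  } finally {
    await client.query('SELECT pg_advisory_unlock($1)', [MIGRATION_LOCK_KEY]);
    client.release();
  }
}

// 使用方式(示意):在 MigrationRunner.up() 的最外層以 withMigrationLock 包住整個流程
```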
新增 Migration 腳本範例:
-- apps/kyo-otp-service/src/db/migrations/001_create_users_table.sql
CREATE TABLE IF NOT EXISTS users (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
email VARCHAR(255) UNIQUE NOT NULL,
password_hash VARCHAR(255) NOT NULL,
phone_number VARCHAR(20),
phone_verified BOOLEAN DEFAULT FALSE,
status VARCHAR(20) DEFAULT 'pending',
tenant_id UUID NOT NULL,
role VARCHAR(50) DEFAULT 'user',
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
CREATE INDEX idx_users_email ON users(email);
CREATE INDEX idx_users_tenant_id ON users(tenant_id);
CREATE INDEX idx_users_status ON users(status);
-- apps/kyo-otp-service/src/db/migrations/002_create_otp_logs_table.sql
CREATE TABLE IF NOT EXISTS otp_logs (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
request_id VARCHAR(255) UNIQUE NOT NULL,
user_id UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE,
phone_number VARCHAR(20) NOT NULL,
otp_code VARCHAR(10) NOT NULL,
status VARCHAR(20) DEFAULT 'pending',
attempts INTEGER DEFAULT 0,
expires_at TIMESTAMP NOT NULL,
verified_at TIMESTAMP,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
CREATE INDEX idx_otp_logs_request_id ON otp_logs(request_id);
CREATE INDEX idx_otp_logs_user_id ON otp_logs(user_id);
CREATE INDEX idx_otp_logs_phone_number ON otp_logs(phone_number);
CREATE INDEX idx_otp_logs_status ON otp_logs(status);
CREATE INDEX idx_otp_logs_expires_at ON otp_logs(expires_at);
建立部署後的煙霧測試:
// test/smoke/smoke.test.ts
import axios from 'axios';
import { test } from 'node:test';
import assert from 'node:assert';
const API_URL = process.env.API_URL || 'http://localhost:3000';
const TIMEOUT = 10000;
test('Smoke Tests', { timeout: 30000 }, async (t) => {
await t.test('Health check should return 200', async () => {
const response = await axios.get(`${API_URL}/health`, { timeout: TIMEOUT });
assert.strictEqual(response.status, 200);
assert.ok(response.data.status);
});
await t.test('API should be accessible', async () => {
const response = await axios.get(`${API_URL}/api`, { timeout: TIMEOUT });
assert.strictEqual(response.status, 200);
});
await t.test('Database connection should be healthy', async () => {
const response = await axios.get(`${API_URL}/health/db`, { timeout: TIMEOUT });
assert.strictEqual(response.status, 200);
assert.strictEqual(response.data.database, 'connected');
});
await t.test('Redis connection should be healthy', async () => {
const response = await axios.get(`${API_URL}/health/redis`, { timeout: TIMEOUT });
assert.strictEqual(response.status, 200);
assert.strictEqual(response.data.redis, 'connected');
});
await t.test('Can login with valid credentials', async () => {
const response = await axios.post(
`${API_URL}/api/auth/login`,
{
email: process.env.TEST_USER_EMAIL || 'test@example.com',
password: process.env.TEST_USER_PASSWORD || 'test123',
},
{ timeout: TIMEOUT }
);
assert.strictEqual(response.status, 200);
assert.ok(response.data.token);
});
await t.test('Rate limiting is working', async () => {
const token = 'test-token';
// Make multiple requests rapidly
const requests = Array(10)
.fill(null)
.map(() =>
axios.post(
`${API_URL}/api/otp/send`,
{ phoneNumber: '0912345678' },
{
headers: { Authorization: `Bearer ${token}` },
timeout: TIMEOUT,
validateStatus: () => true, // Don't throw on any status
}
)
);
const responses = await Promise.all(requests);
const rateLimited = responses.some((r) => r.status === 429);
assert.ok(rateLimited, 'Rate limiting should trigger');
});
await t.test('Metrics endpoint is accessible', async () => {
const response = await axios.get(`${API_URL}/metrics`, { timeout: TIMEOUT });
assert.strictEqual(response.status, 200);
assert.ok(response.data.includes('http_requests_total'));
});
});
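上面的煙霧測試假設服務提供 /health、/health/db、/health/redis 三個端點,回應格式也和 Dockerfile 的 HEALTHCHECK、ECS 與 ALB 的健康檢查一致。以下是一個假設性的最小實作草稿(未使用任何框架,實際端點請以服務程式碼為準):

```typescript
// 假設性的最小 health endpoint 草稿,僅示意檢查邏輯與回應格式
import http from 'node:http';
import { Pool } from 'pg';
import { createClient } from 'redis';

const pool = new Pool({ connectionString: process.env.DATABASE_URL });
const redis = createClient({ url: process.env.REDIS_URL });
redis.connect().catch(() => {}); // Redis 未就緒時不讓程序崩潰,由 /health/redis 回報

const json = (res: http.ServerResponse, status: number, body: unknown) => {
  res.writeHead(status, { 'Content-Type': 'application/json' });
  res.end(JSON.stringify(body));
};

http
  .createServer(async (req, res) => {
    try {
      if (req.url === '/health') return json(res, 200, { status: 'ok' });
      if (req.url === '/health/db') {
        await pool.query('SELECT 1'); // 資料庫連線檢查
        return json(res, 200, { database: 'connected' });
      }
      if (req.url === '/health/redis') {
        await redis.ping(); // Redis 連線檢查
        return json(res, 200, { redis: 'connected' });
      }
      json(res, 404, { error: 'not found' });
    } catch {
      json(res, 503, { status: 'unhealthy' });
    }
  })
  .listen(Number(process.env.PORT) || 3000);
```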
## AWS 生產部署檢查清單
### Infrastructure (CDK)
- [ ] CDK 程式碼已通過 synth
- [ ] 所有 Stacks 已部署成功
- [ ] VPC 與 Subnet 設定正確
- [ ] Security Groups 規則最小化
- [ ] NAT Gateway 已設定 (Multi-AZ)
- [ ] Load Balancer 健康檢查正常
### ECS/Fargate
- [ ] Task Definition 資源配置適當
- [ ] Container health check 設定
- [ ] Auto Scaling 策略已測試
- [ ] Service Discovery 正常運作
- [ ] 藍綠部署設定正確
- [ ] Deployment Circuit Breaker 啟用
### Database (RDS)
- [ ] Multi-AZ 部署已啟用
- [ ] 自動備份已設定 (7-35天)
- [ ] 加密已啟用 (at-rest, in-transit)
- [ ] Parameter Group 優化
- [ ] 效能監控已設定
- [ ] Migration 腳本已測試
- [ ] Rollback 策略已準備
### Cache (ElastiCache Redis)
- [ ] Cluster mode 已啟用 (production)
- [ ] Multi-AZ 複寫已設定
- [ ] 自動備份已設定
- [ ] 記憶體閾值告警已設定
- [ ] 連線池設定正確
### Secrets Management
- [ ] Secrets Manager 中存放所有敏感資料
- [ ] IAM 權限正確設定
- [ ] 金鑰輪替策略已設定
- [ ] 無硬編碼的 secrets
### CI/CD Pipeline
- [ ] GitHub Actions workflow 測試通過
- [ ] Docker 映像建置成功
- [ ] ECR 映像推送正常
- [ ] 自動化測試全部通過
- [ ] 部署腳本已驗證
- [ ] Rollback 機制已測試
### Monitoring & Logging
- [ ] CloudWatch Logs 收集正常
- [ ] Metrics Dashboard 已建立
- [ ] 告警規則已設定
- [ ] SNS 通知已測試
- [ ] X-Ray 追蹤已啟用
- [ ] Log retention 已設定
### Security
- [ ] WAF 規則已設定
- [ ] SSL/TLS 憑證已安裝
- [ ] Security Groups 最小化
- [ ] IAM Roles 遵循最小權限原則
- [ ] VPC Flow Logs 已啟用
- [ ] GuardDuty 已啟用
### Cost Optimization
- [ ] 資源 tagging 完整
- [ ] Cost Explorer 已設定
- [ ] Budget alerts 已設定
- [ ] Spot instances 考慮使用 (non-prod)
- [ ] S3 lifecycle policies 設定
### Performance
- [ ] Load testing 完成 (k6)
- [ ] API 回應時間 < 500ms (p95)
- [ ] Database queries 優化
- [ ] CDN 設定 (CloudFront)
- [ ] Gzip/Brotli 壓縮啟用
### Documentation
- [ ] Architecture diagram 更新
- [ ] Runbook 完整
- [ ] API documentation 更新
- [ ] Deployment guide 完成
- [ ] Troubleshooting guide 準備
### Post-Deployment
- [ ] Smoke tests 通過
- [ ] Health checks 正常
- [ ] Metrics 收集正常
- [ ] Logs 可查詢
- [ ] Alarms 沒有誤報
- [ ] 團隊通知已發送
今天我們完成了 Kyo System 在 AWS 上的完整 CI/CD Pipeline:

- 完整的 GitHub Actions Workflow(測試、安全掃描、建置、部署)
- 優化的多階段 Docker 映像建置
- 以 CodeDeploy 實現的 ECS 藍綠部署
- 自動化資料庫 Migration
- 生產就緒檢查清單