iT邦幫忙

2025 iThome Ironman Contest

Build on AWS

Series: Deploying a Studio's SaaS Product in 30 Days, Part 9

Day 9: Deploying a SaaS Product to AWS in 30 Days: Application Performance Monitoring (APM) and Observability

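Recap

On Day 8 we finished the system's security hardening and the monitoring foundation. Today we build application performance monitoring (APM). Building on that security monitoring, we now aim for full observability: application performance, error tracking, and user experience monitoring.

With a complete APM system we can spot performance bottlenecks as they happen, trace errors back to their cause, and make decisions based on real data.

The Key Value of APM in an OTP System

Why does an OTP system need in-depth monitoring?

An OTP system is a critical business flow, and any delay or error directly affects the user experience:

  1. Real-time requirements: an OTP that takes more than 30 seconds to arrive already hurts the user experience
  2. Reliability: 99.9% availability is the baseline requirement
  3. Bottleneck identification: performance monitoring for Redis, the database, and the SMS API
  4. Root-cause analysis: locate problems quickly and fix them
  5. Business metrics: track key indicators such as send success rate and verification success rate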
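
The business metrics in the list above can be derived from raw OTP attempt records. The following is a minimal sketch, not the project's actual code: the `OtpAttempt` shape, its field names, and `summarizeOtpAttempts` are assumptions made for illustration, and the 30-second budget is taken from item 1.

```typescript
// Hypothetical record of one OTP attempt; all field names are assumptions.
interface OtpAttempt {
  requestedAtMs: number;        // when the user requested the code
  deliveredAtMs: number | null; // when delivery was confirmed (null = never delivered)
  verified: boolean;            // whether the user later entered the correct code
}

interface OtpSummary {
  sendSuccessRate: number;   // delivered / requested
  verifySuccessRate: number; // verified / delivered
  slowDeliveries: number;    // delivered, but later than the latency budget
}

const OTP_LATENCY_BUDGET_MS = 30000; // the 30-second threshold from the list

function summarizeOtpAttempts(attempts: OtpAttempt[]): OtpSummary {
  const delivered = attempts.filter((a) => a.deliveredAtMs !== null);
  const verified = delivered.filter((a) => a.verified);
  const slow = delivered.filter(
    (a) => (a.deliveredAtMs as number) - a.requestedAtMs > OTP_LATENCY_BUDGET_MS
  );
  return {
    // Rates are reported as 0 when there is no data to divide by.
    sendSuccessRate: attempts.length === 0 ? 0 : delivered.length / attempts.length,
    verifySuccessRate: delivered.length === 0 ? 0 : verified.length / delivered.length,
    slowDeliveries: slow.length,
  };
}
```

In practice these numbers would be published as custom metrics; the sketch only shows how the three indicators relate to each other.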

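Monitoring Architecture Evolution

Day 8 architecture (basic monitoring):

CloudTrail → CloudWatch → SNS → Notification
                ↑
    Infrastructure-level monitoring

Day 9 architecture (full APM):

  Application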
      ↓
  X-Ray Tracing → CloudWatch Insights
      ↓              ↓
  Custom Metrics → Dashboard
      ↓              ↓
  Real User → Business Intelligence
  Monitoring

X-Ray Distributed Tracing Setup

1. Enable X-Ray Tracing

# infrastructure/xray-monitoring.yml
AWSTemplateFormatVersion: '2010-09-09'
Description: 'X-Ray Tracing for Kyo OTP System'

Parameters:
  Environment:
    Type: String
    Default: prod
    AllowedValues: [dev, staging, prod]

  ApplicationName:
    Type: String
    Description: Application name for tracing
    Default: kyo-otp-system

Resources:
  # X-Ray Service Map
  XRayServiceMap:
    Type: AWS::XRay::Group
    Properties:
      GroupName: !Sub '${ApplicationName}-${Environment}'
      FilterExpression: 'service("kyo-otp-service") OR service("kyo-dashboard")'
      InsightsConfiguration:
        InsightsEnabled: true
        NotificationsEnabled: true

  # X-Ray Sampling Rule for OTP operations
  OtpSamplingRule:
    Type: AWS::XRay::SamplingRule
    Properties:
      SamplingRule:
        RuleName: !Sub 'kyo-otp-${Environment}'
        Priority: 1000
        FixedRate: 0.1  # 10% sampling
        ReservoirSize: 1
        ResourceARN: '*'  # required; match any resource
        ServiceName: 'kyo-otp-service'
        ServiceType: '*'
        Host: '*'
        HTTPMethod: 'POST'
        URLPath: '/api/otp/*'
        Version: 1

  # CloudWatch Dashboard for Application Metrics
  ApplicationDashboard:
    Type: AWS::CloudWatch::Dashboard
    Properties:
      DashboardName: !Sub '${ApplicationName}-APM-${Environment}'
      DashboardBody: !Sub |
        {
          "widgets": [
            {
              "type": "metric",
              "x": 0, "y": 0, "width": 12, "height": 6,
              "properties": {
                "metrics": [
                  ["AWS/X-Ray", "ResponseTime", "ServiceName", "kyo-otp-service"],
                  [".", "ErrorRate", ".", "."],
                  [".", "ThrottleRate", ".", "."]
                ],
                "period": 300,
                "stat": "Average",
                "region": "${AWS::Region}",
                "title": "OTP Service Performance"
              }
            }
          ]
        }
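
To make the sampling rule above concrete: `ReservoirSize: 1` means roughly one matching request per second is always traced, and `FixedRate: 0.1` then samples 10% of the remaining requests. The class below is a simplified sketch of that decision, not the X-Ray SDK's implementation; `SamplingRuleSketch` and the injectable `random` function are assumptions added so the behavior can be tested deterministically.

```typescript
// Simplified model of a reservoir + fixed-rate sampling decision.
class SamplingRuleSketch {
  private reservoirSize: number;
  private fixedRate: number;
  private random: () => number;
  private currentSecond = -1;
  private usedThisSecond = 0;

  constructor(reservoirSize: number, fixedRate: number, random: () => number = Math.random) {
    this.reservoirSize = reservoirSize;
    this.fixedRate = fixedRate;
    this.random = random;
  }

  shouldSample(nowMs: number): boolean {
    const second = Math.floor(nowMs / 1000);
    if (second !== this.currentSecond) {
      // A new one-second window starts: refill the reservoir.
      this.currentSecond = second;
      this.usedThisSecond = 0;
    }
    if (this.usedThisSecond < this.reservoirSize) {
      this.usedThisSecond += 1;
      return true; // the reservoir guarantees a minimum number of traces per second
    }
    return this.random() < this.fixedRate; // beyond the reservoir, sample a fixed fraction
  }
}
```

With these settings a low-traffic endpoint is traced almost completely, while a busy one settles near the 10% rate.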

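# infrastructure/cloudfront.yml (excerpt from the distribution's DistributionConfig)
        # Cache Behaviors (evaluated in list order; the first matching PathPattern wins)
        CacheBehaviors:
          # Templates API - short cache (listed before '/api/*' so it is not shadowed)
          - PathPattern: '/api/templates'
            TargetOriginId: api-origin
            ViewerProtocolPolicy: redirect-to-https
            CachePolicyId: 658327ea-f89d-4fab-a63d-7e88639e58f6  # CachingOptimized
            OriginRequestPolicyId: 2e54312d-136d-493c-8eb9-b001f22f67d2  # ElementalMediaTailorPersonalizedManifests
            Compress: true
            # TTLs come from the cache policy referenced above. The intended 5-minute default
            # and 1-hour maximum (DefaultTTL: 300, MaxTTL: 3600) would have to be defined in a
            # custom AWS::CloudFront::CachePolicy; a nested TTL block is not a valid property.

          # API requests - caching disabled
          - PathPattern: '/api/*'
            TargetOriginId: api-origin
            ViewerProtocolPolicy: redirect-to-https
            CachePolicyId: 4135ea2d-6df8-44a3-9df3-4b5a84be39ad  # CachingDisabled
            OriginRequestPolicyId: 88a5eaf4-2fd4-4709-b370-b4c650ea3fcf  # CORS-S3Origin
            Compress: true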

        # Default Cache Behavior (for static assets)
        DefaultCacheBehavior:
          TargetOriginId: assets-origin
          ViewerProtocolPolicy: redirect-to-https
          CachePolicyId: 658327ea-f89d-4fab-a63d-7e88639e58f6  # CachingOptimized
          Compress: true

        # Custom Error Pages
        CustomErrorResponses:
          - ErrorCode: 404
            ResponseCode: 200
            ResponsePagePath: '/index.html'
            ErrorCachingMinTTL: 300
          - ErrorCode: 403
            ResponseCode: 200
            ResponsePagePath: '/index.html'
            ErrorCachingMinTTL: 300

        # Logging
        Logging:
          Bucket: !Sub '${LoggingBucket}.s3.amazonaws.com'
          Prefix: 'cloudfront-logs/'
          IncludeCookies: false

        # Security headers
        # Note: CloudFormation models this as a separate AWS::CloudFront::ResponseHeadersPolicy
        # resource that cache behaviors reference through ResponseHeadersPolicyId.
        ResponseHeadersPolicy:
          ResponseHeadersPolicyConfig:
            Name: !Sub 'kyo-security-headers-${Environment}'
            SecurityHeadersConfig:
              StrictTransportSecurity:
                AccessControlMaxAgeSec: 31536000
                IncludeSubdomains: true
                Override: true
              ContentTypeOptions:
                Override: true
              FrameOptions:
                FrameOption: DENY
                Override: true
              ReferrerPolicy:
                ReferrerPolicy: strict-origin-when-cross-origin
                Override: true

Outputs:
  DistributionId:
    Description: CloudFront Distribution ID
    Value: !Ref CloudFrontDistribution
    Export:
      Name: !Sub '${AWS::StackName}-DistributionId'

  DistributionDomainName:
    Description: CloudFront Distribution Domain Name
    Value: !GetAtt CloudFrontDistribution.DomainName
    Export:
      Name: !Sub '${AWS::StackName}-DomainName'
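
Because CloudFront applies the first cache behavior whose `PathPattern` matches, the order of the entries in `CacheBehaviors` matters as much as their contents. The sketch below models that lookup so the effect of ordering can be checked; `resolveBehavior`, `patternToRegExp`, and the labels are hypothetical helpers, and only the `*` and `?` wildcards are modeled.

```typescript
interface BehaviorSketch {
  pathPattern: string;
  label: string;
}

// Convert a path pattern ('*' = any run of characters, '?' = exactly one) to a RegExp.
function patternToRegExp(pattern: string): RegExp {
  const escaped = pattern.replace(/[.+^${}()|[\]\\]/g, '\\$&');
  const source = escaped.replace(/\*/g, '.*').replace(/\?/g, '.');
  return new RegExp(`^${source}$`);
}

// Return the label of the first matching behavior, or the default behavior's label.
function resolveBehavior(path: string, ordered: BehaviorSketch[], fallback: string): string {
  for (const behavior of ordered) {
    if (patternToRegExp(behavior.pathPattern).test(path)) {
      return behavior.label;
    }
  }
  return fallback;
}

const broadFirst: BehaviorSketch[] = [
  { pathPattern: '/api/*', label: 'caching-disabled' },
  { pathPattern: '/api/templates', label: 'short-cache' },
];
const specificFirst: BehaviorSketch[] = [broadFirst[1], broadFirst[0]];

// With the broad pattern first, the templates behavior is never reached.
console.log(resolveBehavior('/api/templates', broadFirst, 'default'));    // caching-disabled
console.log(resolveBehavior('/api/templates', specificFirst, 'default')); // short-cache
```

Listing the most specific pattern first is what lets `/api/templates` keep its own cache settings while the rest of `/api/*` stays uncached.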

2. S3 Bucket for Static Assets

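# infrastructure/s3-assets.yml
Parameters:
  Environment:
    Type: String
    Default: dev
    AllowedValues: [dev, staging, prod]

Resources:
  S3AssetsBucket:
    Type: AWS::S3::Bucket
    Properties:
      BucketName: !Sub 'kyo-assets-${Environment}-${AWS::AccountId}'
      PublicAccessBlockConfiguration:
        BlockPublicAcls: true
        BlockPublicPolicy: true
        IgnorePublicAcls: true
        RestrictPublicBuckets: true
      VersioningConfiguration:
        Status: Enabled
      BucketEncryption:
        ServerSideEncryptionConfiguration:
          - ServerSideEncryptionByDefault:
              SSEAlgorithm: AES256

  # CloudFront access policy
  S3BucketPolicy:
    Type: AWS::S3::BucketPolicy
    Properties:
      Bucket: !Ref S3AssetsBucket
      PolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Effect: Allow
            # This is the legacy Origin Access Identity (OAI) principal format, so
            # ${OriginAccessControl} must resolve to an OAI ID, which this template does not define.
            # With Origin Access Control (OAC), use the cloudfront.amazonaws.com service
            # principal with an AWS:SourceArn condition instead.
            Principal:
              AWS: !Sub 'arn:aws:iam::cloudfront:user/CloudFront Origin Access Identity ${OriginAccessControl}'
            Action: 's3:GetObject'
            # !Ref on a bucket returns its name, so the ARN prefix is added explicitly.
            Resource: !Sub 'arn:aws:s3:::${S3AssetsBucket}/*'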
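
If the distribution uses Origin Access Control (OAC) rather than a legacy Origin Access Identity, the bucket policy grants read access to the CloudFront service principal and restricts it to one distribution. The helper below is a sketch of that policy document; `buildOacBucketPolicy` and the example bucket, account, and distribution values are made up for illustration.

```typescript
// Build an S3 bucket policy that lets a single CloudFront distribution read objects via OAC.
function buildOacBucketPolicy(bucketName: string, accountId: string, distributionId: string) {
  return {
    Version: '2012-10-17',
    Statement: [
      {
        Sid: 'AllowCloudFrontServicePrincipalReadOnly',
        Effect: 'Allow',
        Principal: { Service: 'cloudfront.amazonaws.com' },
        Action: 's3:GetObject',
        Resource: `arn:aws:s3:::${bucketName}/*`,
        Condition: {
          StringEquals: {
            // Only requests made on behalf of this distribution are allowed.
            'AWS:SourceArn': `arn:aws:cloudfront::${accountId}:distribution/${distributionId}`,
          },
        },
      },
    ],
  };
}

// Example with placeholder identifiers (not real resources).
const examplePolicy = buildOacBucketPolicy('kyo-assets-dev-123456789012', '123456789012', 'EXAMPLEDISTID');
console.log(JSON.stringify(examplePolicy, null, 2));
```

The condition on `AWS:SourceArn` is what keeps other CloudFront distributions, including ones in other accounts, from reading the bucket.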

Cache Strategy Optimization

1. Tiered Cache Design

// lambda/cache-strategy.ts
export interface CacheConfig {
  path: string;
  ttl: number;
  cacheKey: string[];
  varyHeaders?: string[];
}

export const cacheStrategies: CacheConfig[] = [
  // Static assets - long-term cache
  {
    path: '/static/*',
    ttl: 31536000, // 1 year
    cacheKey: ['$request_uri'],
    varyHeaders: ['Accept-Encoding']
  },

  // Templates API - medium-term cache
  {
    path: '/api/templates',
    ttl: 300, // 5 minutes
    cacheKey: ['$request_uri', '$http_authorization'],
    varyHeaders: ['Authorization', 'Accept']
  },

  // OTP sending - never cached
  {
    path: '/api/otp/send',
    ttl: 0,
    cacheKey: [],
    varyHeaders: ['*']
  }
];

// CloudFront Function for the viewer-response event: it sets Cache-Control on the
// response, because a Cache-Control header added to the request does not affect caching.
// This controls browser caching; edge caching is governed by the cache policy.
export const dynamicCacheFunction = `
function handler(event) {
    var response = event.response;
    var uri = event.request.uri;

    // Add cache headers based on path
    if (uri.startsWith('/api/templates')) {
        response.headers['cache-control'] = {
            value: 'public, max-age=300, s-maxage=300'
        };
    } else if (uri.startsWith('/api/otp')) {
        response.headers['cache-control'] = {
            value: 'no-cache, no-store, must-revalidate'
        };
    }

    return response;
}
`;
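
The strategy table and the hard-coded header strings can be kept in sync by deriving the `Cache-Control` value from the table itself. This is a sketch under assumptions: `CacheRule`, `ruleMatches`, and `cacheControlFor` are hypothetical names, only the `path` and `ttl` fields are modeled, and unknown paths default to "do not cache".

```typescript
interface CacheRule {
  path: string; // exact path, or a prefix ending in '/*'
  ttl: number;  // seconds; 0 means "never cache"
}

const cacheRules: CacheRule[] = [
  { path: '/static/*', ttl: 31536000 },
  { path: '/api/templates', ttl: 300 },
  { path: '/api/otp/send', ttl: 0 },
];

function ruleMatches(rulePath: string, uri: string): boolean {
  return rulePath.endsWith('/*')
    ? uri.startsWith(rulePath.slice(0, -1)) // '/static/*' matches anything under '/static/'
    : uri === rulePath;
}

function cacheControlFor(uri: string): string {
  const rule = cacheRules.find((r) => ruleMatches(r.path, uri));
  if (!rule || rule.ttl === 0) {
    // Assumption: anything without an explicit rule is treated as uncacheable.
    return 'no-cache, no-store, must-revalidate';
  }
  return `public, max-age=${rule.ttl}, s-maxage=${rule.ttl}`;
}

console.log(cacheControlFor('/static/js/main.js')); // public, max-age=31536000, s-maxage=31536000
console.log(cacheControlFor('/api/otp/send'));      // no-cache, no-store, must-revalidate
```

Defaulting to "no cache" is the conservative choice for an OTP service, since accidentally caching a per-user response is worse than an extra origin request.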

2. Smart Cache Invalidation

// lambda/cache-invalidation.ts
import { CloudFrontClient, CreateInvalidationCommand } from '@aws-sdk/client-cloudfront';

export class CloudFrontCacheManager {
  private cloudfront: CloudFrontClient;
  private distributionId: string;

  constructor(distributionId: string) {
    this.cloudfront = new CloudFrontClient({ region: 'us-east-1' });
    this.distributionId = distributionId;
  }

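  async invalidateTemplateCache(templateId?: string): Promise<void> {
    // '/api/templates/*' does not cover the bare '/api/templates' list endpoint,
    // so the list path is always included.
    const paths = templateId
      ? [`/api/templates/${templateId}`, '/api/templates']
      : ['/api/templates', '/api/templates/*'];

    await this.createInvalidation(paths);
  }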

  async invalidateAssets(paths: string[]): Promise<void> {
    await this.createInvalidation(paths);
  }

  private async createInvalidation(paths: string[]): Promise<void> {
    const command = new CreateInvalidationCommand({
      DistributionId: this.distributionId,
      InvalidationBatch: {
        Paths: {
          Quantity: paths.length,
          Items: paths
        },
        CallerReference: `invalidation-${Date.now()}`
      }
    });

    try {
      const result = await this.cloudfront.send(command);
      console.log('Cache invalidation created:', result.Invalidation?.Id);
    } catch (error) {
      console.error('Failed to create invalidation:', error);
      throw error;
    }
  }
}

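// Usage example: automatically invalidate the cache after a template is updated
// (templateService is assumed to be imported from elsewhere in the project)
export const updateTemplate = async (templateId: string, data: any) => {
  // Update the database
  await templateService.update(templateId, data);

  // Invalidate the related cache entries
  const cacheManager = new CloudFrontCacheManager(process.env.CLOUDFRONT_DISTRIBUTION_ID!);
  await cacheManager.invalidateTemplateCache(templateId);
};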
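
Invalidation requests are billed per path beyond the monthly free allowance, and a wildcard path counts as a single path, so it can pay to clean up the path list before calling `createInvalidation`. The function below is a sketch of one way to do that; `normalizeInvalidationPaths` is a hypothetical helper, not part of the AWS SDK.

```typescript
// Normalize invalidation paths: ensure a leading slash, drop duplicates,
// and drop any path already covered by a trailing-wildcard path in the same batch.
function normalizeInvalidationPaths(paths: string[]): string[] {
  const unique = Array.from(
    new Set(paths.map((p) => (p.startsWith('/') ? p : `/${p}`)))
  );
  const wildcardPrefixes = unique
    .filter((p) => p.endsWith('*'))
    .map((p) => p.slice(0, -1)); // '/api/templates/*' -> '/api/templates/'

  return unique.filter(
    (p) => !wildcardPrefixes.some((prefix) => p !== `${prefix}*` && p.startsWith(prefix))
  );
}

console.log(
  normalizeInvalidationPaths([
    '/api/templates/42',
    '/api/templates/*',
    'api/templates/42',
    '/api/templates',
  ])
); // [ '/api/templates/*', '/api/templates' ]
```

The bare `/api/templates` path survives because the wildcard only covers paths beneath `/api/templates/`.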

Performance Monitoring and Analysis

1. CloudWatch Metrics Setup

# monitoring/cloudfront-monitoring.yml
Resources:
  CloudFrontDashboard:
    Type: AWS::CloudWatch::Dashboard
    Properties:
      DashboardName: !Sub 'KyoOTP-CloudFront-${Environment}'
      DashboardBody: !Sub |
        {
          "widgets": [
            {
              "type": "metric",
              "x": 0, "y": 0,
              "width": 12, "height": 6,
              "properties": {
                "metrics": [
                  ["AWS/CloudFront", "Requests", "Region", "Global", "DistributionId", "${CloudFrontDistribution}"],
                  [".", "BytesDownloaded", ".", ".", ".", "."],
                  [".", "BytesUploaded", ".", ".", ".", "."]
                ],
                "period": 300,
                "stat": "Sum",
                "region": "us-east-1",
                "title": "CloudFront Traffic"
              }
            },
            {
              "type": "metric",
              "x": 0, "y": 6,
              "width": 12, "height": 6,
              "properties": {
                "metrics": [
                  ["AWS/CloudFront", "CacheHitRate", "Region", "Global", "DistributionId", "${CloudFrontDistribution}"],
                  [".", "OriginLatency", ".", ".", ".", "."]
                ],
                "period": 300,
                "stat": "Average",
                "region": "us-east-1",
                "title": "Cache Performance"
              }
            }
          ]
        }

  # Cache Hit Rate Alarm
  # CacheHitRate is one of CloudFront's additional metrics and must be enabled on the
  # distribution. CloudFront metrics are published in us-east-1 with the Region=Global dimension.
  LowCacheHitRateAlarm:
    Type: AWS::CloudWatch::Alarm
    Properties:
      AlarmName: !Sub 'KyoOTP-LowCacheHitRate-${Environment}'
      AlarmDescription: 'CloudFront cache hit rate is low'
      MetricName: CacheHitRate
      Namespace: AWS/CloudFront
      Statistic: Average
      Period: 300
      EvaluationPeriods: 3
      Threshold: 80
      ComparisonOperator: LessThanThreshold
      Dimensions:
        - Name: DistributionId
          Value: !Ref CloudFrontDistribution
        - Name: Region
          Value: Global
      AlarmActions:
        - !Ref SNSAlarmTopic
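
The alarm above fires only when the cache hit rate stays below 80 for three consecutive 5-minute periods, so a single bad datapoint does not page anyone. The function below is a simplified model of that evaluation; `isAlarmBreaching` is a hypothetical helper, and it ignores CloudWatch details such as missing-data handling.

```typescript
// Returns true when the most recent `evaluationPeriods` datapoints are all below the threshold
// (the LessThanThreshold case used by the cache hit rate alarm).
function isAlarmBreaching(
  datapoints: number[],       // oldest first, one value per period
  threshold: number,
  evaluationPeriods: number
): boolean {
  if (datapoints.length < evaluationPeriods) {
    return false; // simplification: not enough data is treated as "not breaching"
  }
  const recent = datapoints.slice(-evaluationPeriods);
  return recent.every((value) => value < threshold);
}

console.log(isAlarmBreaching([90, 75, 70, 60], 80, 3)); // true: the last three are all below 80
console.log(isAlarmBreaching([70, 85, 60], 80, 3));     // false: one period recovered to 85
```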

2. Real User Monitoring (RUM)

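// frontend/src/monitoring/rum.ts
export class CloudFrontRUM {
  private metrics: Array<{
    name: string;
    value: number;
    timestamp: number;
    tags: Record<string, string>;
  }> = [];

  trackPageLoad(): void {
    const record = (): void => {
      const navigation = performance.getEntriesByType('navigation')[0] as
        | PerformanceNavigationTiming
        | undefined;
      if (!navigation) return;

      this.addMetric('page_load_time', navigation.loadEventEnd - navigation.fetchStart, {
        page: window.location.pathname,
        cache_status: this.getCacheStatus(),
        edge_location: this.getEdgeLocation()
      });
    };

    // loadEventEnd stays 0 until the load event has finished, so defer the measurement.
    if (document.readyState === 'complete') {
      setTimeout(record, 0);
    } else {
      window.addEventListener('load', () => setTimeout(record, 0), { once: true });
    }
  }

  trackAPICall(endpoint: string, duration: number, success: boolean): void {
    this.addMetric('api_call_duration', duration, {
      endpoint,
      success: success.toString(),
      cache_status: this.getCacheStatus()
    });
  }

  private getCacheStatus(): string {
    // Page scripts cannot read the document's own response headers, so this assumes
    // the server injects the cache status into a <meta name="cf-cache-status"> tag.
    const cacheStatus = document.querySelector('meta[name="cf-cache-status"]')?.getAttribute('content');
    return cacheStatus || 'unknown';
  }

  private getEdgeLocation(): string {
    // Likewise assumes a server-injected <meta name="cf-ray"> tag in "<id>-<location>" form.
    const edgeLocation = document.querySelector('meta[name="cf-ray"]')?.getAttribute('content');
    return edgeLocation?.split('-')[1] || 'unknown';
  }

  private addMetric(name: string, value: number, tags: Record<string, string>): void {
    this.metrics.push({
      name,
      value,
      timestamp: Date.now(),
      tags
    });

    // Send to the metrics endpoint once the batch is full
    if (this.metrics.length >= 10) {
      void this.sendMetrics();
    }
  }

  private async sendMetrics(): Promise<void> {
    // Take the current batch first so metrics recorded during the request are not lost.
    const batch = this.metrics;
    this.metrics = [];
    try {
      const response = await fetch('/api/metrics', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ metrics: batch })
      });
      // fetch() only rejects on network errors, so check the HTTP status as well.
      if (!response.ok) throw new Error(`HTTP ${response.status}`);
    } catch (error) {
      console.error('Failed to send metrics:', error);
      // Put the batch back so it is retried with the next flush.
      this.metrics = batch.concat(this.metrics);
    }
  }
}

// Initialize in App.tsx
const rum = new CloudFrontRUM();
rum.trackPageLoad();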
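
Once RUM samples arrive at the backend, percentiles are usually more informative than averages, because a few very slow page loads can distort the mean. The sketch below uses the nearest-rank method; `percentile` is a hypothetical helper and the sample values are made up.

```typescript
// Nearest-rank percentile: the smallest value such that at least p% of samples are <= it.
function percentile(values: number[], p: number): number {
  if (values.length === 0) {
    throw new Error('percentile of an empty array is undefined');
  }
  const sorted = values.slice().sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// Made-up page load times in milliseconds, including one extreme outlier.
const pageLoadsMs = [120, 80, 300, 95, 2000, 110, 105, 90, 100, 130];
const meanMs = pageLoadsMs.reduce((sum, v) => sum + v, 0) / pageLoadsMs.length;

console.log(meanMs);                      // 313
console.log(percentile(pageLoadsMs, 50)); // 105
console.log(percentile(pageLoadsMs, 95)); // 2000
```

Here the mean (313 ms) describes almost no real user: the typical load is about 105 ms, while the slowest users wait 2 seconds.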

Deployment and Testing

1. Deployment Script

#!/bin/bash
# scripts/deploy-cloudfront.sh

# Exit on errors, unset variables, and failures inside pipelines
set -euo pipefail

ENVIRONMENT="${1:-dev}"
REGION="${2:-us-east-1}"

echo "Deploying CloudFront for environment: $ENVIRONMENT"

# Deploy S3 assets bucket
aws cloudformation deploy \
  --template-file infrastructure/s3-assets.yml \
  --stack-name "kyo-s3-assets-$ENVIRONMENT" \
  --parameter-overrides "Environment=$ENVIRONMENT" \
  --region "$REGION"

# Get ALB domain name
ALB_DOMAIN=$(aws cloudformation describe-stacks \
  --stack-name "kyo-infrastructure-$ENVIRONMENT" \
  --query 'Stacks[0].Outputs[?OutputKey==`LoadBalancerDNS`].OutputValue' \
  --output text \
  --region "$REGION")

# Deploy CloudFront
aws cloudformation deploy \
  --template-file infrastructure/cloudfront.yml \
  --stack-name "kyo-cloudfront-$ENVIRONMENT" \
  --parameter-overrides \
    "Environment=$ENVIRONMENT" \
    "OriginDomainName=$ALB_DOMAIN" \
  --capabilities CAPABILITY_IAM \
  --region "$REGION"

# Get CloudFront domain
CLOUDFRONT_DOMAIN=$(aws cloudformation describe-stacks \
  --stack-name "kyo-cloudfront-$ENVIRONMENT" \
  --query 'Stacks[0].Outputs[?OutputKey==`DistributionDomainName`].OutputValue' \
  --output text \
  --region "$REGION")

echo "CloudFront deployed successfully!"
echo "Domain: https://$CLOUDFRONT_DOMAIN"

# Upload static assets
ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
ASSETS_BUCKET="kyo-assets-${ENVIRONMENT}-${ACCOUNT_ID}"

echo "Uploading static assets..."
# A one-year max-age is only safe for fingerprinted files (assumed here), so index.html
# is uploaded separately with no-cache to let new releases show up immediately.
aws s3 sync ./dist "s3://${ASSETS_BUCKET}/" \
  --delete \
  --exclude "index.html" \
  --cache-control "public, max-age=31536000" \
  --region "$REGION"

aws s3 cp ./dist/index.html "s3://${ASSETS_BUCKET}/index.html" \
  --cache-control "no-cache" \
  --region "$REGION"

echo "Deployment complete!"

2. Performance Testing

// tests/performance/cloudfront.test.ts
import { describe, expect, test } from '@jest/globals';

describe('CloudFront Performance Tests', () => {
  const cloudfrontDomain = process.env.CLOUDFRONT_DOMAIN;
  const originDomain = process.env.ORIGIN_DOMAIN;

  test('CDN should reduce response time', async () => {
    // Test direct origin
    const originStart = Date.now();
    await fetch(`https://${originDomain}/api/templates`);
    const originTime = Date.now() - originStart;

    // Test via CloudFront
    const cdnStart = Date.now();
    await fetch(`https://${cloudfrontDomain}/api/templates`);
    const cdnTime = Date.now() - cdnStart;

    // Wait and test cache hit
    await new Promise(resolve => setTimeout(resolve, 1000));
    const cacheStart = Date.now();
    const cacheResponse = await fetch(`https://${cloudfrontDomain}/api/templates`);
    const cacheTime = Date.now() - cacheStart;

    // Verify cache headers
    expect(cacheResponse.headers.get('x-cache')).toContain('Hit');
    expect(cacheTime).toBeLessThan(originTime * 0.5); // Should be 50% faster
  });

  test('Static assets should have long cache TTL', async () => {
    const response = await fetch(`https://${cloudfrontDomain}/static/js/main.js`);

    expect(response.headers.get('cache-control')).toContain('max-age=31536000');
    expect(response.headers.get('x-cache')).toBeDefined();
  });

  test('API responses should have appropriate cache headers', async () => {
    const templatesResponse = await fetch(`https://${cloudfrontDomain}/api/templates`);
    expect(templatesResponse.headers.get('cache-control')).toContain('max-age=300');

    const otpResponse = await fetch(`https://${cloudfrontDomain}/api/otp/send`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ phone: '0987654321' })
    });
    expect(otpResponse.headers.get('cache-control')).toContain('no-cache');
  });
});
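
Single-shot timing comparisons like the one in the first test tend to be noisy, because one slow DNS lookup or TLS handshake can flip the result. One hedge is to time several runs and compare medians. The helpers below are a sketch; `median` and `medianDurationMs` are hypothetical names, and the operation to time is passed in as a callback so the helpers stay independent of `fetch`.

```typescript
// Median of a list of numbers (average of the two middle values for even lengths).
function median(values: number[]): number {
  if (values.length === 0) {
    throw new Error('median of an empty array is undefined');
  }
  const sorted = values.slice().sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Run an async operation `samples` times and return the median wall-clock duration in ms.
async function medianDurationMs(run: () => Promise<unknown>, samples: number): Promise<number> {
  const durations: number[] = [];
  for (let i = 0; i < samples; i++) {
    const start = Date.now();
    await run();
    durations.push(Date.now() - start);
  }
  return median(durations);
}
```

In the test, this could replace the single measurements, for example `await medianDurationMs(() => fetch(url), 5)` for both the origin and the CloudFront URL before comparing them.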

Cost and Benefit Analysis

CloudFront Usage Cost

Assumed monthly traffic: 10 GB, 1 million requests

Item                                     Cost (USD)
Data transfer (first 10 TB tier)         $0.85
HTTP/HTTPS requests (first 10 million)   $0.75
Total                                    $1.60/month
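
The totals above follow from simple per-unit arithmetic. The sketch below reproduces them; the unit prices are the ones implied by the table for 10 GB and 1 million requests ($0.085 per GB and $0.0075 per 10,000 requests), treated here as assumptions rather than current AWS pricing, and `estimateMonthlyCostUsd` is a hypothetical helper.

```typescript
// Unit prices implied by the table above; check the current price list before relying on them.
const ASSUMED_PRICE_PER_GB_USD = 0.085;
const ASSUMED_PRICE_PER_10K_REQUESTS_USD = 0.0075;

function estimateMonthlyCostUsd(transferGb: number, requests: number) {
  const transferCost = transferGb * ASSUMED_PRICE_PER_GB_USD;
  const requestCost = (requests / 10000) * ASSUMED_PRICE_PER_10K_REQUESTS_USD;
  return { transferCost, requestCost, total: transferCost + requestCost };
}

const estimate = estimateMonthlyCostUsd(10, 1000000);
console.log(estimate.total.toFixed(2)); // 1.60
```

Because both terms scale linearly, ten times the traffic (100 GB, 10 million requests) would come to roughly $16 per month under the same assumed prices.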

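Performance Improvement Metrics

Expected improvements:

  • Global average latency: 40-60% lower
  • Server load: 70% lower (static assets)
  • Availability: raised to 99.9%+
  • Bandwidth cost: 50% lower

Today's results:
✅ CloudFront CDN deployment and configuration
✅ Tiered cache strategy design
✅ Smart cache invalidation mechanism
✅ Performance monitoring and RUM integration
✅ Cost-benefit analysis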

