Today we're deploying our AI assistant to the cloud! You'll learn how to run your assistant on the internet so that more people can use it. We'll explore containerization, choosing a cloud platform, and the actual deployment workflow.
Deploying an AI assistant to the cloud brings many advantages, and there are several common approaches:
Containers: the most flexible and portable option, suitable for virtually any cloud platform.
Serverless: suited to lightweight applications, scales automatically, lowest cost.
Virtual machines: traditional but stable, best when you need full control.
Today we focus on Docker containerization plus cloud platform deployment.
ai_assistant_deploy/
├── app/
│   ├── __init__.py
│   ├── main.py            # FastAPI application entry point
│   ├── assistant.py       # AI assistant core
│   └── config.py          # configuration management
├── requirements.txt       # Python dependencies
├── Dockerfile             # Docker image definition
├── docker-compose.yml     # Docker Compose configuration
├── .env.example           # example environment variables
├── .dockerignore          # files excluded from the Docker build
└── nginx/
    └── nginx.conf         # Nginx configuration
from fastapi import FastAPI, HTTPException
from fastapi.middleware.cors import CORSMiddleware
from pydantic import BaseModel
import google.generativeai as genai
import os
import json
import uuid
import logging
from datetime import datetime
from typing import Optional, List

# Configure logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
# Initialize FastAPI
app = FastAPI(
    title="AI Assistant API",
    description="Intelligent AI assistant API service",
    version="1.0.0"
)

# CORS settings
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],  # restrict allowed origins in production
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)
# Initialize Gemini
genai.configure(api_key=os.getenv('GEMINI_API_KEY'))
class ChatRequest(BaseModel):
    message: str
    session_id: Optional[str] = None
    context: Optional[dict] = None

class ChatResponse(BaseModel):
    response: str
    session_id: str
    timestamp: str

class HealthResponse(BaseModel):
    status: str
    version: str
    model: str

# Session store (simplified; use Redis or another external store in production)
sessions = {}
@app.get("/", response_model=dict)
async def root():
"""根端點"""
return {
"message": "AI Assistant API is running",
"docs": "/docs",
"health": "/health"
}
@app.get("/health", response_model=HealthResponse)
async def health_check():
"""健康檢查端點"""
return HealthResponse(
status="healthy",
version="1.0.0",
model="gemini-2.0-flash-exp"
)
@app.post("/chat", response_model=ChatResponse)
async def chat(request: ChatRequest):
"""聊天端點"""
try:
# 生成或使用現有 session_id
session_id = request.session_id or str(uuid.uuid4())
# 獲取對話歷史
history = sessions.get(session_id, [])
# 構建提示
context_str = ""
if request.context:
context_str = f"\n背景資訊:{json.dumps(request.context, ensure_ascii=False)}"
prompt = f"{context_str}\n使用者:{request.message}"
# 呼叫 Gemini
model = genai.GenerativeModel('gemini-2.0-flash-exp')
response = model.generate_content(prompt)
# 更新對話歷史
history.append({
'user': request.message,
'assistant': response.text,
'timestamp': datetime.now().isoformat()
})
sessions[session_id] = history[-10:] # 保留最近 10 輪
logger.info(f"Chat request processed for session: {session_id}")
return ChatResponse(
response=response.text,
session_id=session_id,
timestamp=datetime.now().isoformat()
)
except Exception as e:
logger.error(f"Chat error: {str(e)}")
raise HTTPException(status_code=500, detail=str(e))
@app.post("/analyze")
async def analyze_text(text: str):
"""文字分析端點"""
try:
model = genai.GenerativeModel('gemini-2.0-flash-exp')
prompt = f"""
請分析以下文字:
{text}
提供:
1. 主題摘要
2. 關鍵詞提取
3. 情感分析
"""
response = model.generate_content(prompt)
return {
"analysis": response.text,
"timestamp": datetime.now().isoformat()
}
except Exception as e:
logger.error(f"Analysis error: {str(e)}")
raise HTTPException(status_code=500, detail=str(e))
@app.get("/sessions/{session_id}")
async def get_session(session_id: str):
"""獲取會話歷史"""
if session_id not in sessions:
raise HTTPException(status_code=404, detail="Session not found")
return {
"session_id": session_id,
"history": sessions[session_id]
}
if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=8000)
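The in-memory `sessions` dict above is lost on every restart and is not shared across instances, which is why the comment suggests Redis for production. A minimal sketch of a Redis-backed store, assuming a reachable Redis instance and the `redis` package (neither is part of the requirements.txt above, and the module name is hypothetical):

```python
# app/session_store.py -- hypothetical helper module, not part of the tree above
import json
import os

import redis  # assumes `redis` has been added to requirements.txt

# Assumes REDIS_URL is provided via the environment, e.g. redis://localhost:6379/0
_client = redis.Redis.from_url(os.getenv("REDIS_URL", "redis://localhost:6379/0"))

SESSION_TTL_SECONDS = 60 * 60  # expire idle sessions after an hour

def get_history(session_id: str) -> list:
    """Return the stored conversation history, or an empty list."""
    raw = _client.get(f"session:{session_id}")
    return json.loads(raw) if raw else []

def save_history(session_id: str, history: list, max_turns: int = 10) -> None:
    """Persist only the most recent turns and refresh the TTL."""
    _client.set(
        f"session:{session_id}",
        json.dumps(history[-max_turns:], ensure_ascii=False),
        ex=SESSION_TTL_SECONDS,
    )
```

In the chat endpoint, `sessions.get(...)` and the `sessions[session_id] = ...` assignment would then be replaced by `get_history()` and `save_history()`.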
# Use the official Python image
FROM python:3.11-slim

# Set the working directory
WORKDIR /app

# Install system dependencies
RUN apt-get update && apt-get install -y \
    gcc \
    && rm -rf /var/lib/apt/lists/*

# Copy the requirements file
COPY requirements.txt .

# Install Python dependencies
RUN pip install --no-cache-dir -r requirements.txt

# Copy the application code
COPY ./app /app

# Create a non-root user
RUN useradd -m -u 1000 appuser && chown -R appuser:appuser /app
USER appuser

# Expose the port
EXPOSE 8000

# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
    CMD python -c "import requests; requests.get('http://localhost:8000/health')"

# Startup command
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
fastapi==0.104.1
uvicorn[standard]==0.24.0
google-generativeai==0.3.1
pydantic==2.5.0
python-dotenv==1.0.0
requests==2.31.0
version: '3.8'
services:
ai-assistant:
build: .
ports:
- "8000:8000"
environment:
- GEMINI_API_KEY=${GEMINI_API_KEY}
- LOG_LEVEL=info
volumes:
- ./app:/app
restart: unless-stopped
    healthcheck:
      # curl is not installed in python:3.11-slim, so reuse the requests-based check
      test: ["CMD", "python", "-c", "import requests; requests.get('http://localhost:8000/health')"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s
nginx:
image: nginx:alpine
ports:
- "80:80"
- "443:443"
volumes:
- ./nginx/nginx.conf:/etc/nginx/nginx.conf:ro
depends_on:
- ai-assistant
restart: unless-stopped
volumes:
app-data:
events {
worker_connections 1024;
}
http {
upstream ai_assistant {
server ai-assistant:8000;
}
server {
listen 80;
server_name _;
        # Request size limit
client_max_body_size 10M;
        # Logging
access_log /var/log/nginx/access.log;
error_log /var/log/nginx/error.log;
        # API endpoints
location / {
proxy_pass http://ai_assistant;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
            # Timeouts
proxy_connect_timeout 60s;
proxy_send_timeout 60s;
proxy_read_timeout 60s;
}
        # Health check endpoint
location /health {
proxy_pass http://ai_assistant/health;
access_log off;
}
}
}
Google Cloud Run
Advantages: fully managed, scales to zero, generous free tier (see the cost estimates below).
Deployment steps:
# 1. Install the Google Cloud SDK
#    See: https://cloud.google.com/sdk/docs/install

# 2. Log in and set your project
gcloud auth login
gcloud config set project YOUR_PROJECT_ID

# 3. Enable the required APIs
gcloud services enable run.googleapis.com
gcloud services enable cloudbuild.googleapis.com

# 4. Build and push the image to Container Registry
gcloud builds submit --tag gcr.io/YOUR_PROJECT_ID/ai-assistant

# 5. Deploy to Cloud Run
gcloud run deploy ai-assistant \
  --image gcr.io/YOUR_PROJECT_ID/ai-assistant \
  --platform managed \
  --region asia-east1 \
  --allow-unauthenticated \
  --set-env-vars GEMINI_API_KEY=your_api_key_here \
  --memory 512Mi \
  --cpu 1 \
  --max-instances 10 \
  --min-instances 0

# 6. Get the service URL
gcloud run services describe ai-assistant \
  --platform managed \
  --region asia-east1 \
  --format 'value(status.url)'
AWS (ECS / App Runner)
Advantages: deep integration with the AWS ecosystem and fine-grained control over networking and scaling.
Deployment steps:
# 1. Install the AWS CLI
#    See: https://aws.amazon.com/cli/

# 2. Configure credentials
aws configure

# 3. Create an ECR repository
aws ecr create-repository --repository-name ai-assistant

# 4. Log in to ECR
aws ecr get-login-password --region ap-northeast-1 | \
  docker login --username AWS --password-stdin \
  YOUR_ACCOUNT_ID.dkr.ecr.ap-northeast-1.amazonaws.com

# 5. Tag and push the image
docker tag ai-assistant:latest \
  YOUR_ACCOUNT_ID.dkr.ecr.ap-northeast-1.amazonaws.com/ai-assistant:latest
docker push \
  YOUR_ACCOUNT_ID.dkr.ecr.ap-northeast-1.amazonaws.com/ai-assistant:latest

# 6. Deploy with ECS or App Runner
#    Via the AWS Console or Infrastructure as Code (e.g. Terraform)
Heroku
Advantages: the simplest workflow, deploying straight from git with minimal configuration.
Deployment steps:
# 1. Install the Heroku CLI
#    See: https://devcenter.heroku.com/articles/heroku-cli

# 2. Log in
heroku login

# 3. Create the app
heroku create your-ai-assistant

# 4. Set environment variables
heroku config:set GEMINI_API_KEY=your_api_key_here

# 5. Deploy
git push heroku main

# 6. Open the app
heroku open
Railway
Advantages: modern developer experience with quick setup from the CLI or GitHub.
Deployment steps:
# 1. Install the Railway CLI
npm i -g @railway/cli

# 2. Log in
railway login

# 3. Initialize the project
railway init

# 4. Link the project
railway link

# 5. Set environment variables
railway variables set GEMINI_API_KEY=your_api_key_here

# 6. Deploy
railway up
Before deploying, test everything locally first:
# 1. Create the .env file
cat > .env << EOF
GEMINI_API_KEY=your_api_key_here
LOG_LEVEL=info
EOF

# 2. Start with Docker Compose
docker-compose up --build

# 3. Test the API
curl http://localhost/health

# 4. Test the chat endpoint
curl -X POST http://localhost/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "你好!"}'

# 5. Tail the logs
docker-compose logs -f ai-assistant
# app/monitoring.py
# Assumes `prometheus_client` has been added to requirements.txt and that the
# FastAPI `app` created in main.py is in scope (e.g. import it here, or place
# this code directly in main.py).
from prometheus_client import Counter, Histogram, generate_latest
from fastapi import Request, Response
import time

# Define metrics
request_count = Counter(
    'http_requests_total',
    'Total HTTP requests',
    ['method', 'endpoint', 'status']
)
request_duration = Histogram(
    'http_request_duration_seconds',
    'HTTP request duration',
    ['method', 'endpoint']
)

@app.middleware("http")
async def monitor_requests(request: Request, call_next):
    """Monitoring middleware"""
    start_time = time.time()
    response = await call_next(request)
    duration = time.time() - start_time

    # Record metrics
    request_count.labels(
        method=request.method,
        endpoint=request.url.path,
        status=response.status_code
    ).inc()
    request_duration.labels(
        method=request.method,
        endpoint=request.url.path
    ).observe(duration)

    return response

@app.get("/metrics")
async def metrics():
    """Prometheus metrics endpoint"""
    return Response(
        content=generate_latest(),
        media_type="text/plain"
    )
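Once this module is loaded, the counters can be spot-checked by hitting the new endpoint. A quick sketch against the local docker-compose setup above (the port mapping is the one defined in docker-compose.yml):

```python
import requests

# Generate some traffic, then read the Prometheus exposition output
requests.get("http://localhost:8000/health", timeout=10)
metrics = requests.get("http://localhost:8000/metrics", timeout=10)
print("\n".join(metrics.text.splitlines()[:10]))  # first few metric lines
```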
# app/config.py
# Assumes `pydantic-settings` has been added to requirements.txt.
from pydantic_settings import BaseSettings
from typing import Optional

class Settings(BaseSettings):
    """Application settings"""
    # API keys
    gemini_api_key: str

    # Application settings
    app_name: str = "AI Assistant"
    debug: bool = False
    log_level: str = "INFO"

    # Security settings
    allowed_origins: list = ["*"]
    rate_limit: int = 100  # requests per minute

    # Database (optional)
    database_url: Optional[str] = None

    class Config:
        env_file = ".env"
        case_sensitive = False

settings = Settings()
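A minimal sketch of wiring these settings into main.py, replacing the direct `os.getenv` call and the hard-coded CORS origins. (The `rate_limit` field is declared but not enforced by anything shown here; enforcing it would need its own middleware or a rate-limiting library.)

```python
# In app/main.py (sketch) -- read configuration from Settings instead of os.getenv
from app.config import settings

genai.configure(api_key=settings.gemini_api_key)

app.add_middleware(
    CORSMiddleware,
    allow_origins=settings.allowed_origins,  # e.g. set ALLOWED_ORIGINS in .env
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)
```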
from fastapi import Depends, Security, HTTPException
from fastapi.security import APIKeyHeader

from app.config import settings

API_KEY_HEADER = APIKeyHeader(name="X-API-Key")

async def verify_api_key(api_key: str = Security(API_KEY_HEADER)):
    """Verify the API key (assumes an `api_key` field has been added to Settings)"""
    if api_key != settings.api_key:
        raise HTTPException(
            status_code=403,
            detail="Invalid API key"
        )
    return api_key

# Use it on an endpoint
@app.post("/chat", dependencies=[Depends(verify_api_key)])
async def chat(request: ChatRequest):
    ...
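With the dependency in place, clients must send the key in the `X-API-Key` header. A quick check from Python (the URL and key value below are placeholders):

```python
import requests

# Placeholder URL and key -- substitute your deployed service URL and real key
resp = requests.post(
    "http://localhost:8000/chat",
    headers={"X-API-Key": "your_api_key_here"},
    json={"message": "Hello!"},
    timeout=30,
)
print(resp.status_code)
print(resp.json())
```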
Google Cloud Run free tier:
- 2 million requests per month
- 360,000 GB-seconds of memory
- 180,000 vCPU-seconds
Estimated Cloud Run cost (beyond the free tier):
- Small app (1,000 req/day): ~$0-5/month
- Medium app (10,000 req/day): ~$10-30/month
- Large app (100,000 req/day): ~$100-300/month
Estimated AWS cost:
- Fargate: ~$30-100/month (running continuously)
- EC2: ~$10-50/month (on demand)
Heroku:
- Free plan: available but limited
- Hobby plan: $7/month
- Professional: $25-50/month
Confirm your setup before deploying:
# Full deployment flow (using Google Cloud Run as the example)

# 1. Clone the project
git clone your-repo
cd ai_assistant_deploy

# 2. Set environment variables
cp .env.example .env
# Edit .env and fill in your API key

# 3. Test locally
docker-compose up --build

# 4. Test the API (in a new terminal)
curl http://localhost/health
curl -X POST http://localhost/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "Hello!"}'

# 5. Deploy to the cloud
gcloud builds submit --tag gcr.io/YOUR_PROJECT/ai-assistant
gcloud run deploy ai-assistant \
  --image gcr.io/YOUR_PROJECT/ai-assistant \
  --platform managed \
  --region asia-east1 \
  --allow-unauthenticated \
  --set-env-vars GEMINI_API_KEY=your_key

# 6. Get the URL and test
SERVICE_URL=$(gcloud run services describe ai-assistant \
  --format 'value(status.url)')
curl $SERVICE_URL/health
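Once the service URL is known, a small Python smoke test can exercise both endpoints end to end. This is a sketch: it assumes `SERVICE_URL` is exported in the environment with the value printed in step 6, and that `requests` is installed locally.

```python
# smoke_test.py -- quick end-to-end check against the deployed service (sketch)
import os

import requests

# Expects SERVICE_URL, e.g. the value printed by `gcloud run services describe`
base_url = os.environ["SERVICE_URL"].rstrip("/")

# 1. The health check should return 200 with status "healthy"
health = requests.get(f"{base_url}/health", timeout=10)
health.raise_for_status()
assert health.json()["status"] == "healthy", health.text

# 2. The chat endpoint should answer and return a session_id we can reuse
chat = requests.post(f"{base_url}/chat", json={"message": "Hello!"}, timeout=60)
chat.raise_for_status()
body = chat.json()
print("session_id:", body["session_id"])
print("response:", body["response"][:200])
```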
Today we walked through the full cloud deployment workflow:
✅ Containerization: packaging the application with Docker
✅ FastAPI service: building REST API endpoints
✅ Multi-platform deployment: Cloud Run, AWS, Heroku, Railway
✅ Monitoring and logging: Prometheus metrics and log management
✅ Security: environment variables and API key protection
Tomorrow we'll look at performance monitoring and log analysis to keep the system running reliably!
Common issues:
- Container build fails
- Service unreachable after deployment
- API key errors
- Out-of-memory errors
Your AI assistant is now running in the cloud! 🎉