DAY 49 Amazon SageMaker

2025 iThome Ironman Contest

Self-Challenge Group

17th Ironman Contest

Amazon SageMaker AI

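End-to-end ML service: collect and prepare data, train models, deploy them and monitor their prediction performance, then improve the data based on how the model performs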

Built-in algorithms

Automatic Model Tuning (AMT)

Deployment and inference

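Hosting your own model: deployment takes a single click, and once the model is deployed SageMaker handles auto scaling, so you do not manage any servers yourself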
real-time inference
Handles one prediction per request
1. Create a real-time endpoint
2. The app sends a payload
3. Configure the CPU or GPU that the model uses to run inference
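The real-time steps above can be sketched with the request parameters that boto3's `create_endpoint_config` and `invoke_endpoint` calls accept. This is a minimal sketch rather than a full deployment: the names `demo-model`, `demo-realtime-config`, and `demo-realtime-endpoint`, the helper functions, and the `ml.m5.large` instance type are all assumptions for illustration, and the actual AWS calls are left as comments because they need credentials and an existing model.

```python
import json


def realtime_endpoint_config(config_name, model_name,
                             instance_type="ml.m5.large", instance_count=1):
    """Build keyword arguments for the SageMaker create_endpoint_config call."""
    return {
        "EndpointConfigName": config_name,
        "ProductionVariants": [{
            "VariantName": "AllTraffic",
            "ModelName": model_name,          # hypothetical, must already exist
            "InstanceType": instance_type,    # CPU or GPU instance chosen here
            "InitialInstanceCount": instance_count,
        }],
    }


def realtime_invoke_args(endpoint_name, record):
    """Build keyword arguments for invoke_endpoint (one payload per request)."""
    return {
        "EndpointName": endpoint_name,
        "ContentType": "application/json",
        "Body": json.dumps(record),
    }


config = realtime_endpoint_config("demo-realtime-config", "demo-model")
request = realtime_invoke_args("demo-realtime-endpoint",
                               {"features": [1.0, 2.0, 3.0]})

# With AWS credentials configured, the dictionaries would be used like this:
#   import boto3
#   sm = boto3.client("sagemaker")
#   sm.create_endpoint_config(**config)
#   sm.create_endpoint(EndpointName="demo-realtime-endpoint",
#                      EndpointConfigName=config["EndpointConfigName"])
#   runtime = boto3.client("sagemaker-runtime")
#   prediction = runtime.invoke_endpoint(**request)["Body"].read()
```

Keeping the parameters in plain dictionaries makes it easy to inspect or version the configuration before anything is created in the account.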
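serverless inference
The endpoint can scale down to zero while there is no traffic, at the cost of cold starts: when a request arrives after a long idle period, that first request has somewhat higher latency
1. Create a serverless endpoint
2. The app sends a payload
3. Configure the amount of RAM the model needs to run inference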
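For the serverless case, the memory setting from step 3 goes into a `ServerlessConfig` block instead of an instance type. The sketch below builds that configuration; the helper name, the `demo-*` names, and the default values are assumptions for illustration. The allowed memory sizes reflect the 1 GB increments (1024 MB to 6144 MB) that SageMaker documents for `MemorySizeInMB`.

```python
# Memory sizes accepted by ServerlessConfig.MemorySizeInMB (1 GB increments).
ALLOWED_MEMORY_MB = (1024, 2048, 3072, 4096, 5120, 6144)


def serverless_endpoint_config(config_name, model_name,
                               memory_mb=2048, max_concurrency=5):
    """Build keyword arguments for create_endpoint_config (serverless variant)."""
    if memory_mb not in ALLOWED_MEMORY_MB:
        raise ValueError(f"memory_mb must be one of {ALLOWED_MEMORY_MB}")
    return {
        "EndpointConfigName": config_name,
        "ProductionVariants": [{
            "VariantName": "AllTraffic",
            "ModelName": model_name,  # hypothetical, must already exist
            # No InstanceType here: capacity is described by memory only.
            "ServerlessConfig": {
                "MemorySizeInMB": memory_mb,
                "MaxConcurrency": max_concurrency,
            },
        }],
    }


serverless_config = serverless_endpoint_config("demo-serverless-config",
                                               "demo-model", memory_mb=4096)

# With AWS credentials configured:
#   import boto3
#   sm = boto3.client("sagemaker")
#   sm.create_endpoint_config(**serverless_config)
#   sm.create_endpoint(EndpointName="demo-serverless-endpoint",
#                      EndpointConfigName="demo-serverless-config")
```

Invoking a serverless endpoint uses the same `invoke_endpoint` call as a real-time endpoint, so the client code does not change.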
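asynchronous inference
For very large input payloads that also take longer to process
Near-real-time inference
1. Upload the input data to a staging bucket in Amazon S3
2. The app then notifies the asynchronous endpoint, which places the request in a queue for processing
3. The inference result is written to another S3 staging bucket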
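The asynchronous flow above can be sketched in two parts: an endpoint configuration whose `AsyncInferenceConfig` names the output location, and an `invoke_endpoint_async` request that points at the uploaded input in S3 instead of carrying the payload itself. The bucket names, object keys, and helper functions below are hypothetical.

```python
def async_endpoint_config(config_name, model_name, output_s3_uri,
                          instance_type="ml.m5.large"):
    """Build keyword arguments for create_endpoint_config (async variant)."""
    return {
        "EndpointConfigName": config_name,
        "ProductionVariants": [{
            "VariantName": "AllTraffic",
            "ModelName": model_name,  # hypothetical, must already exist
            "InstanceType": instance_type,
            "InitialInstanceCount": 1,
        }],
        # Step 3: results are written to this S3 location.
        "AsyncInferenceConfig": {
            "OutputConfig": {"S3OutputPath": output_s3_uri},
        },
    }


def async_invoke_args(endpoint_name, input_s3_uri):
    """Build keyword arguments for invoke_endpoint_async (step 2)."""
    return {
        "EndpointName": endpoint_name,
        "InputLocation": input_s3_uri,  # step 1: input already uploaded to S3
        "ContentType": "application/json",
    }


async_config = async_endpoint_config(
    "demo-async-config", "demo-model", "s3://demo-output-bucket/results/")
async_request = async_invoke_args(
    "demo-async-endpoint", "s3://demo-input-bucket/requests/input-001.json")

# With AWS credentials configured:
#   import boto3
#   boto3.client("sagemaker").create_endpoint_config(**async_config)
#   runtime = boto3.client("sagemaker-runtime")
#   response = runtime.invoke_endpoint_async(**async_request)
#   print(response["OutputLocation"])  # where the result will appear
```

The call returns as soon as the request is queued; the caller reads the result from the reported output location once processing finishes.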
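batch transform
Runs predictions over an entire dataset
1. Upload the input data to a staging bucket in Amazon S3
2. The app then starts a batch transform job, which processes the dataset (batch transform runs as a job, not behind a persistent endpoint)
3. The inference results are written to another S3 staging bucket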
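The batch transform steps above map onto a single `create_transform_job` request, which names the model, the S3 input prefix, the S3 output path, and the instances to use for the duration of the job. The sketch below assumes CSV input with one record per line; the job name, bucket names, and helper function are hypothetical.

```python
def batch_transform_job(job_name, model_name, input_s3_uri, output_s3_uri,
                        instance_type="ml.m5.large", instance_count=1):
    """Build keyword arguments for the SageMaker create_transform_job call."""
    return {
        "TransformJobName": job_name,
        "ModelName": model_name,  # hypothetical, must already exist
        "TransformInput": {
            "DataSource": {
                "S3DataSource": {
                    "S3DataType": "S3Prefix",  # every object under the prefix
                    "S3Uri": input_s3_uri,
                },
            },
            "ContentType": "text/csv",  # assumption: CSV input
            "SplitType": "Line",        # treat each line as one record
        },
        "TransformOutput": {
            "S3OutputPath": output_s3_uri,
            "AssembleWith": "Line",     # one prediction per output line
        },
        "TransformResources": {
            "InstanceType": instance_type,
            "InstanceCount": instance_count,  # more instances, more parallelism
        },
    }


job = batch_transform_job(
    "demo-batch-job", "demo-model",
    "s3://demo-input-bucket/dataset/", "s3://demo-output-bucket/predictions/",
    instance_count=2)

# With AWS credentials configured:
#   import boto3
#   boto3.client("sagemaker").create_transform_job(**job)
```

The instances exist only while the job runs, which is why this option suits offline scoring of a whole dataset rather than live traffic.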
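Real-time and serverless inference both serve live applications and suit small, immediate prediction tasks; the difference is that serverless leaves no infrastructure for you to manage. Asynchronous inference is the near-real-time option: a single inference can run for up to 1 hour, which suits long-running work on one payload at a time. Batch transform has high latency, but it processes many records at once and in parallel, with up to 1 hour per invocation.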
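The comparison above can be condensed into a small decision helper. It is only a rule-of-thumb sketch of this summary, not an official selection guide; the function name and its flags are hypothetical.

```python
def choose_inference_option(whole_dataset, long_running_single_request,
                            wants_no_infrastructure):
    """Pick a SageMaker inference option from the criteria in the summary."""
    if whole_dataset:
        # Many records at once, high latency is acceptable.
        return "batch transform"
    if long_running_single_request:
        # Large payload or long processing, near-real-time is acceptable.
        return "asynchronous inference"
    if wants_no_infrastructure:
        # Small, immediate predictions without managing instances.
        return "serverless inference"
    # Small, immediate predictions on instances you configure.
    return "real-time inference"


print(choose_inference_option(False, False, True))
print(choose_inference_option(True, False, False))
```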