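On Day 12 we moved the image pipeline into a Web Worker, so the main thread only handles the UI and all computation runs in the Worker without blocking it. But a single Worker has a problem: if you drag the brightness slider quickly, the main thread keeps posting tasks to the Worker, and the Worker can only finish them one at a time. The results of the earlier tasks are already stale by the time they finish, so that computation is wasted and the UI ends up lagging.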
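Today's upgrade is multiple Workers plus a queue. That way, rapid-fire UI input no longer piles up behind earlier work, and on a multi-core machine results can also come back faster.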
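The stale-result problem described above can also be handled on the main thread with a small "latest wins" guard. The following is a minimal sketch, not part of the project code: `latestOnly` is a hypothetical helper that tags each call with an id and drops any result that has been superseded by a newer call. Note that it only prevents stale results from being rendered; it does not cancel work already running in a Worker.

```typescript
// Hypothetical helper (not in the original project): wraps an async function so
// that only the result of the most recent call reaches onResult.
export function latestOnly<A extends unknown[], R>(
  fn: (...args: A) => Promise<R>,
  onResult: (result: R) => void,
): (...args: A) => Promise<void> {
  let latestId = 0 // id of the newest call issued so far
  return async (...args: A) => {
    const id = ++latestId
    const result = await fn(...args)
    if (id !== latestId) return // a newer call was issued meanwhile: drop this result
    onResult(result)
  }
}
```

In a slider handler this would wrap the function that submits work to the pool, with `onResult` doing the canvas update.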
// demo/src/workerPool.ts
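type Job = {
  id: number; w: number; h: number; ops: unknown[]; bytes: Uint8Array
  resolve: (out: Uint8Array) => void
  reject: (err: unknown) => void
}

export class WorkerPool {
  private workers: Worker[]
  private idle: Worker[] = []
  private queue: Job[] = []
  // The job each busy worker is currently processing, so a reply can be
  // matched to the job that was actually sent to that worker
  private running = new Map<Worker, Job>()
  private lastId = 0

  constructor(size: number) {
    this.workers = Array.from({ length: size }, () => new Worker(new URL('./worker.ts', import.meta.url), { type: 'module' }))
    this.idle = [...this.workers]
    // Attach listeners to every worker
    this.workers.forEach(w => {
      w.onmessage = (ev: MessageEvent<any>) => {
        const data = ev.data
        this.finish(w, job => data?.ok ? job.resolve(data.bytes as Uint8Array) : job.reject(data?.error))
      }
      // An uncaught error in the worker would otherwise leave the promise pending forever
      w.onerror = (ev: ErrorEvent) => this.finish(w, job => job.reject(ev.message))
      w.postMessage({ type: 'init' })
    })
  }

  run(w: number, h: number, ops: unknown[], bytes: Uint8Array) {
    return new Promise<Uint8Array>((resolve, reject) => {
      const id = ++this.lastId
      this.queue.push({ id, w, h, ops, bytes, resolve, reject })
      this.schedule()
    })
  }

  // Settle the job that worker `w` was running, then put the worker back to work
  private finish(w: Worker, settle: (job: Job) => void) {
    const job = this.running.get(w)
    if (!job) return // not a job result (for example a reply to 'init')
    this.running.delete(w)
    settle(job)
    this.idle.push(w)
    this.schedule() // dispatch the next queued job, if any
  }

  private schedule() {
    // Pair idle workers with queued jobs until one side runs out
    while (this.idle.length > 0 && this.queue.length > 0) {
      const wkr = this.idle.pop()!
      const job = this.queue.shift()! // oldest job first (FIFO); removed so it is dispatched only once
      this.running.set(wkr, job)
      wkr.postMessage({ type: 'run', w: job.w, h: job.h, ops: job.ops, bytes: job.bytes })
    }
  }
}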
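For reference, here is a sketch of the worker-side half of the protocol the pool relies on. The message shapes (`init`, `run`, and a reply carrying `ok` with either `bytes` or `error`) are inferred from the pool code above; the real `worker.ts` from Day 12 may differ. `handleRequest` and the `Pipeline` type are hypothetical names, and the sketch assumes `init` needs no reply.

```typescript
// Message shapes inferred from the pool code; the real worker.ts may differ.
type PoolRequest =
  | { type: 'init' }
  | { type: 'run'; w: number; h: number; ops: unknown[]; bytes: Uint8Array }
type PoolReply =
  | { ok: true; bytes: Uint8Array }
  | { ok: false; error: string }

// Hypothetical stand-in for the real pipeline entry point inside the worker.
type Pipeline = (w: number, h: number, ops: unknown[], bytes: Uint8Array) => Uint8Array

// Returns the reply to post back, or null when no reply is needed.
export function handleRequest(msg: PoolRequest, pipeline: Pipeline): PoolReply | null {
  if (msg.type === 'init') return null // assumption: init needs no reply
  try {
    return { ok: true, bytes: pipeline(msg.w, msg.h, msg.ops, msg.bytes) }
  } catch (e) {
    // Report failures as data so the pool can reject the matching promise
    return { ok: false, error: e instanceof Error ? e.message : String(e) }
  }
}

// Wiring inside the worker (kept as a comment so the sketch runs anywhere):
// self.onmessage = (ev: MessageEvent<PoolRequest>) => {
//   const reply = handleRequest(ev.data, realPipeline)
//   if (reply) self.postMessage(reply)
// }
```

Keeping the handler a pure function makes the protocol easy to test without spinning up a real Worker.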
// demo/src/main.ts
import { WorkerPool } from './workerPool'
const pool = new WorkerPool(2)  // start 2 workers
async function runPipeline(ops: unknown[]) {
  if (!w || !h) return
  const img = ctx.getImageData(0, 0, w, h)
  const input = new Uint8Array(img.data.buffer)
  try {
    const out = await pool.run(w, h, ops, input)
    img.data.set(out)
    ctx.putImageData(img, 0, 0)
  } catch (e) {
    showWasmError(e)
  }
}
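The pool size is hard-coded to 2 above. As a rough sketch, it could instead be derived from the core count the browser reports via `navigator.hardwareConcurrency`. `pickPoolSize` is a hypothetical helper, and both the "leave one core for the main thread" rule and the default cap of 2 are assumptions rather than tuned values.

```typescript
// Hypothetical helper: derive a pool size from the reported logical core count.
// `cores` may be undefined where the browser does not expose the value.
export function pickPoolSize(cores: number | undefined, cap = 2): number {
  const usable = Math.max(1, (cores ?? 1) - 1) // leave one core for the main thread
  return Math.min(cap, usable)                 // never exceed the configured cap
}

// Usage in the browser:
// const pool = new WorkerPool(pickPoolSize(navigator.hardwareConcurrency))
```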
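I used the same large image and the same pipeline { grayscale → bc(40,60) → blur(r=3) } for two sets of measurements: submitting 8 jobs and submitting 20 jobs, each with pool size = 1 / 2 / 4. The metrics are avg_ms (average time per job) and p95_ms (95th percentile):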
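jobs=8
size=1: avg ≈ 3.0–4.8 s, p95 ≈ 4.7–6.6 s
size=2: avg drops to ≈ 2.9–3.1 s, and p95 drops slightly (≈ 4.5–4.8 s)
size=4: sometimes slower instead (avg can spike to 8 s+, p95 to 10 s+)
My guess is that going from 1 to 2 Workers helps, but at 4 Workers the scheduling and data-transfer overhead cancels out the gain from parallelism.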
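jobs=20
size=1/2/4: all three run at almost the same speed, with avg ≈ 6.5 s and p95 ≈ 11.6 s
A rough estimate is that this workload is memory-bound (every pass scans the whole image), so the bottleneck is memory bandwidth and data movement, and adding Workers does little for total time. What the pool mainly buys is a main thread that stays unblocked and more stable tail latency, not linear speedup.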
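Figures like avg_ms and p95_ms can be produced with a small harness such as the sketch below. It is one possible approach, not necessarily the one behind the numbers above: it assumes all jobs are submitted at once, that each duration runs from submission to completion (so queueing time is included), and that p95 uses the nearest-rank definition. `timeJobs`, `summarize`, and `percentile` are hypothetical names.

```typescript
export interface Summary { avg_ms: number; p95_ms: number }

// Nearest-rank percentile over an ascending-sorted array.
export function percentile(sorted: number[], p: number): number {
  if (sorted.length === 0) return NaN
  const rank = Math.ceil((p * sorted.length) / 100) // 1-based rank
  return sorted[Math.max(0, rank - 1)]!
}

// Average and 95th percentile of a list of durations in milliseconds.
export function summarize(durations: number[]): Summary {
  const sorted = [...durations].sort((a, b) => a - b)
  const avg = sorted.reduce((sum, d) => sum + d, 0) / sorted.length
  return { avg_ms: avg, p95_ms: percentile(sorted, 95) }
}

// Submits `jobs` tasks at once and records each one's submit-to-finish time.
export async function timeJobs(jobs: number, runOne: () => Promise<unknown>): Promise<number[]> {
  const tasks = Array.from({ length: jobs }, async () => {
    const t0 = performance.now()
    await runOne()
    return performance.now() - t0
  })
  return Promise.all(tasks)
}

// Usage with the pool from this post (w, h, ops, input as in main.ts):
// const stats = summarize(await timeJobs(8, () => pool.run(w, h, ops, input)))
```

Because every job is timed from the moment it is submitted, later jobs in a batch include the time spent waiting in the queue, which is why p95 can sit well above the average.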
