How to Use Zod and the OpenAI API to Fix Unstable Generation

16th Ironman Contest

Ray 貓

2024-09-08 14:57:21

Stable Output: Structured Output with the OpenAI API

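Every software engineer wants things to be under control. The worst case is the one where you have no idea whether it will blow up some day, so you live on edge, ready to run at any moment.

In AI application development, though, the LLM is arguably the scariest and least stable variable of all. You ask the LLM for a clean JSON response, and what comes back is riddled with errors or missing required structure. GPT's output sometimes drifts from what we expect: it may be partial JSON, or it may not match the expected format at all, and then users start complaining that things broke.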

Here is an example. Say you want to build an AI that extracts a user's name and age.
You expect the AI's output to look like this:

{
  "name": "Ray",
  "age": 20
}

But GPT sometimes generates something like this:

{
  name: Ray,
  age: thirty
}

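Come on! This JSON cannot even be parsed: the keys and the string value are unquoted, and age is a word instead of a number. Output like this not only crashes your application, it also makes development painfully complicated.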
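To make the failure concrete, here is a minimal sketch that feeds both outputs to JSON.parse. The helper name tryParse is hypothetical and exists only for this illustration.

```typescript
// Hypothetical helper: returns the parsed value, or null if the text is not valid JSON.
function tryParse(text: string): unknown {
    try {
        return JSON.parse(text);
    } catch {
        return null;
    }
}

// The output we hoped for.
const good = '{ "name": "Ray", "age": 20 }';
// The output we sometimes get: unquoted keys and bare words are not valid JSON.
const bad = '{ name: Ray, age: thirty }';

console.log(tryParse(good)); // the parsed object
console.log(tryParse(bad));  // null
```

Without the try/catch, the second call would throw a SyntaxError, which is the crash users end up seeing.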

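To address this, early last month (August 2024) OpenAI launched Structured Outputs: response_format can now take a schema and force the LLM's output to follow it. It is a very new but very important update.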
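As a rough picture of what this looks like without any helper library, the sketch below builds a Chat Completions request body by hand, with response_format set to a JSON Schema. Nothing is sent here. The variable names, the user message, and the schema name "person" are assumptions made for this example. As far as I know, strict mode expects every property to be listed in required and additionalProperties to be false.

```typescript
// JSON Schema describing the { name, age } object from the example above.
const personSchema = {
    type: "object",
    properties: {
        name: { type: "string" },
        age: { type: "number" },
    },
    required: ["name", "age"],
    additionalProperties: false,
};

// A request body sketch; it is only printed, not sent to the API.
const requestBody = {
    model: "gpt-4o-2024-08-06",
    messages: [
        { role: "user", content: "My name is Ray and I am 20 years old." },
    ],
    response_format: {
        type: "json_schema",
        json_schema: {
            name: "person", // hypothetical schema name
            strict: true,   // ask the model to follow the schema exactly
            schema: personSchema,
        },
    },
};

console.log(JSON.stringify(requestBody.response_format, null, 2));
```

Writing JSON Schema by hand gets tedious quickly, which is where a schema library becomes useful.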

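What is Zod? Why use Zod?

In JavaScript or TypeScript, validating types is a common and important need. This is where Zod comes in: it is a schema validation library designed for JavaScript/TypeScript. Zod's main job is to define the shape of your data through a schema and verify at runtime that the data matches that shape.

The benefits of Zod include:

Runtime validation: TypeScript's type checking exists only at compile time, but Zod can verify that data is correct while the program is running.
Concise syntax: Zod's syntax is intuitive and lets you define complex structures quickly, so validation takes very little code.
Highly extensible: Zod makes it easy to define nested objects and custom error messages, which matters most for complex data structures.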
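The points above can be sketched in a few lines. The example below nests one schema inside another, attaches a custom error message, and derives a TypeScript type from the schema with z.infer. The User and Address schemas and their sample values are made up for illustration.

```typescript
import { z } from 'zod';

// Nested schema: an address object that lives inside a user.
const Address = z.object({
    city: z.string(),
    zip: z.string().min(3, 'zip must have at least 3 characters'), // custom message
});

const User = z.object({
    name: z.string(),
    address: Address,
});

// The static TypeScript type is derived from the runtime schema.
type UserData = z.infer<typeof User>;

const valid: UserData = { name: 'Ray', address: { city: 'Taipei', zip: '100' } };
console.log(User.safeParse(valid).success); // true

// Same shape, but the zip is too short.
const result = User.safeParse({
    name: 'Ray',
    address: { city: 'Taipei', zip: '1' },
});

if (!result.success) {
    for (const issue of result.error.issues) {
        // path points at the failing field inside the nested object
        console.log(issue.path.join('.'), '->', issue.message);
    }
}
```

The loop prints the path of the failing field (address.zip) together with the custom message, so one schema gives you both the compile-time type and the runtime check.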

For example, to validate an object containing a name and an age, you can write:

import { z } from 'zod';

// Declare a validator: the input must be an object with two properties,
// a name that must be a string and an age that must be a number
const Person = z.object({
    name: z.string(),
    age: z.number(),
});

// The actual data we want to check
const rayPerson = { name: 'Ray', age: 30 };

// Use the validator to check whether the data matches the format
const result = Person.safeParse(rayPerson);

if (!result.success) {
    console.error('Validation failed', result.error);
} else {
    console.log('Data is valid', result.data);
}

Here, Zod helps us make sure the data structure is correct and gives readable error reports.
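The example above only exercises the happy path. As a sketch of the failure path, the code below takes raw text, parses it, and then validates it against the same Person schema. The helper name parsePerson and its return shape are assumptions made for this illustration.

```typescript
import { z } from 'zod';

const Person = z.object({
    name: z.string(),
    age: z.number(),
});

// Valid JSON this time, but age is a string instead of a number.
const raw = '{ "name": "Ray", "age": "thirty" }';

// Hypothetical helper: parse the text first, then validate its shape.
function parsePerson(text: string) {
    let data: unknown;
    try {
        data = JSON.parse(text);
    } catch {
        return { ok: false as const, reason: 'not valid JSON' };
    }
    const result = Person.safeParse(data);
    if (!result.success) {
        // Collect the names of the fields that failed validation.
        const fields = result.error.issues.map((issue) => issue.path.join('.'));
        return { ok: false as const, reason: `invalid fields: ${fields.join(', ')}` };
    }
    return { ok: true as const, person: result.data };
}

console.log(parsePerson(raw));             // { ok: false, reason: 'invalid fields: age' }
console.log(parsePerson('{ name: Ray }')); // { ok: false, reason: 'not valid JSON' }
```

Both kinds of bad output are caught before they reach the rest of the application, but the model can still produce them in the first place.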

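How does it combine with the OpenAI API?

With Zod, we can force OpenAI's response to match the data format we defined. OpenAI provides the response_format parameter, and the SDK's zodResponseFormat helper lets us pass a Zod schema to it, so the generated result is constrained to that schema and parsed against it.

Code example and walkthrough

The following example shows how to use the OpenAI API together with Zod to solve a math problem and produce a structured response: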

import OpenAI from 'openai';
import { zodResponseFormat } from 'openai/helpers/zod';
import { z } from 'zod';

// Structure of a single math step
const Step = z.object({
    explanation: z.string(),  // explanation of this step
    output: z.string(),       // result of this step's calculation
});

// Structure of the final math response
const MathResponse = z.object({
    steps: z.array(Step),      // the steps of the solution
    final_answer: z.string(),  // the final answer
});

const client = new OpenAI();

async function solveMathProblem() {
    // Send the request to the OpenAI API and pass the Zod schema through response_format
    const completion = await client.beta.chat.completions.parse({
        model: 'gpt-4o-2024-08-06',
        messages: [
            {
                "role": "system",
                "content": "You are a helpful math tutor. Only use the schema for math responses.", // sets the AI's role
            },
            { "role": "user", "content": "solve 8x + 3 = 21" }, // the user's math problem
        ],
        // Constrain the model's output to the structure defined with Zod
        response_format: zodResponseFormat(MathResponse, 'mathResponse'),
    });

    // Print the parsed result
    const result = completion.choices[0]?.message?.parsed;
    console.log(result);
}

// Log request errors instead of leaving the rejection unhandled
solveMathProblem().catch(console.error);

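Code breakdown:

Zod Schema: defines the format GPT must use when it replies to us.
response_format: this is the key part. We use zodResponseFormat, a helper provided by the OpenAI SDK, to apply our Zod schema to the OpenAI API response and make sure its format is correct.
result: the generated object ends up in completion.choices[0]?.message?.parsed and can be used directly.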
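The sample response further down carries a refusal field next to parsed, so parsed should not be assumed to be present. Below is a defensive sketch that assumes a message shape with only those two fields. ParsedMessage and unwrapParsed are hypothetical names, not part of the OpenAI SDK.

```typescript
// Minimal shape of the fields used here; the real SDK type has more.
interface ParsedMessage<T> {
    parsed?: T | null;
    refusal?: string | null;
}

// Return the parsed object, or throw with a readable reason.
function unwrapParsed<T>(message: ParsedMessage<T> | undefined): T {
    if (!message) {
        throw new Error('No choice was returned');
    }
    if (message.refusal) {
        throw new Error(`Model refused: ${message.refusal}`);
    }
    if (message.parsed == null) {
        throw new Error('Response contained no parsed object');
    }
    return message.parsed;
}

// Usage with a hand-written sample message.
const ok = unwrapParsed({ parsed: { final_answer: 'x = 9/4' }, refusal: null });
console.log(ok.final_answer); // x = 9/4
```

In the example above, this would be called as unwrapParsed(completion.choices[0]?.message) in place of reading parsed directly.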

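Output

Of course, GPT replies with a very large JSON that records the model used, the timestamp, and the full details. All you need to know is that the data GPT generated is stored at
completion.choices[0]?.message?.parsed;
and it is even parsed for you into an object you can use directly.
For example: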

{
  "id": "chatcmpl-A563chZUorZs2Xhq5mO1QT2qoEWHu",
  "object": "chat.completion",
  "created": 1725778288,
  "model": "gpt-4o-2024-08-06",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "{\"steps\":[{\"explanation\":\"Start with the equation and move the constant on the left to the right side.\",\"output\":\"8x + 3 = 21\"},{\"explanation\":\"Subtract 3 from both sides to isolate the term with x.\",\"output\":\"8x = 21 - 3\"},{\"explanation\":\"Calculate the right side to simplify the equation.\",\"output\":\"8x = 18\"},{\"explanation\":\"Divide both sides by 8 to solve for x.\",\"output\":\"x = 18 / 8\"},{\"explanation\":\"Simplify the fraction by dividing both the numerator and the denominator by their greatest common divisor, which is 2.\",\"output\":\"x = 9 / 4\"}],\"final_answer\":\"x = 9/4\"}",
        "refusal": null,
        "tool_calls": [],
        "parsed": {
          "steps": [
            {
              "explanation": "Start with the equation and move the constant on the left to the right side.",
              "output": "8x + 3 = 21"
            },
            {
              "explanation": "Subtract 3 from both sides to isolate the term with x.",
              "output": "8x = 21 - 3"
            },
            {
              "explanation": "Calculate the right side to simplify the equation.",
              "output": "8x = 18"
            },
            {
              "explanation": "Divide both sides by 8 to solve for x.",
              "output": "x = 18 / 8"
            },
            {
              "explanation": "Simplify the fraction by dividing both the numerator and the denominator by their greatest common divisor, which is 2.",
              "output": "x = 9 / 4"
            }
          ],
          "final_answer": "x = 9/4"
        }
      },
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 108,
    "completion_tokens": 152,
    "total_tokens": 260
  },
  "system_fingerprint": "fp_8e1177b306"
}
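In the response above, content holds the answer as a JSON string, while parsed holds the same data as a ready-to-use object. The sketch below shows that relationship with a trimmed copy of content (one step only). The MathResponse interface is hand-written here to mirror the schema and is an assumption of this example.

```typescript
// Shape of the structured answer, mirroring the MathResponse schema.
interface MathResponse {
    steps: { explanation: string; output: string }[];
    final_answer: string;
}

// A trimmed copy of message.content from the response above (one step only).
const content =
    '{"steps":[{"explanation":"Divide both sides by 8 to solve for x.","output":"x = 18 / 8"}],"final_answer":"x = 9/4"}';

// content is still a string; parsing it yields the same data that message.parsed holds.
const parsed = JSON.parse(content) as MathResponse;

parsed.steps.forEach((step, i) => {
    console.log(`${i + 1}. ${step.explanation} => ${step.output}`);
});
console.log('Answer:', parsed.final_answer);
```

With the parse helper you do not need to do this step yourself, which is why reading message.parsed is enough.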