Giving the AI Short-Term Memory: Implementing Conversation Context

2025 iThome Ironman Contest

DAY 17

Modern Web

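AI Application Development in Practice for Frontend Engineers: 30 Days from Prompt to Production, Building an AI Frontend Interviewer as the Example (Part 17 of the series)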

Tags: 17th Ironman Contest, next.js, rag, gemini

windate3411

2025-10-01 23:26:09


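Introduction

Welcome to Day 17! Yesterday we added a "typewriter effect" and a "cancel" feature to the streaming experience, which made interacting with the AI interviewer feel smoother and more controllable. The whole communication pipeline is now quite solid.
However, our AI interviewer has a fatal weakness: it has the memory of a goldfish. Every time you press "Submit", it treats the request as a brand-new, independent conversation. This amnesia prevents the interviewer from holding a meaningful multi-turn conversation. It is not a big problem right now, because we only ask once and stop, but it will certainly become one as the feature set grows. After all, who wants to be interviewed by a goldfish-brained interviewer? Today we will fix this by giving it short-term memory.
This is also a great opportunity to refactor and unify the prompts we previously designed separately for concept questions and coding questions into a single instruction template that is more powerful and easier to extend.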

Today's Goals

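Upgrade the frontend request: in handleSubmit, package the past conversation history (chatHistory) and send it to the backend along with the answer.
Upgrade the prompt engineering: fold the conversation history, the RAG content, and the Judge0 result into one unified prompt, and instruct the AI to adjust its output to the question type so that the conversation stays coherent and context-aware.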

Step 1: Frontend - Packaging and Sending the Memory

The first step is straightforward: in the frontend request, we pass chatHistory, the current conversation state, to the backend.

// interview/[sessionId]/page.tsx

const handleSubmit = async () => {
  // ...
  try {
    abortControllerRef.current = new AbortController();

    const response = await fetch('/api/interview/evaluate', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        questionId: currentQuestion.id,
        answer: answer,
        userId: 'anonymous-user',
        // [Key addition] Pass along the current conversation history
        history: chatHistory,
      }),
      signal: abortControllerRef.current.signal,
    });

    // ... the streaming handling logic that follows stays the same ...
  } catch (error) {
    // ...
  } 
};

Code summary

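We added a history key to the JSON object sent to the backend; its value is the chatHistory state. This step is very simple, but it is the foundation of memory, and it is the same strategy that typical LLM chatbots use: part of the chat log is sent with each subsequent request to make the answers more accurate.
Services differ in the details, but most of them package the most recent few messages into the next request, as described above. To save tokens, some compact (summarize) the history before sending it, which lowers cost at the expense of some accuracy. Others track the total token count of an array of messages and, once it exceeds a limit, drop the oldest message first (a queue mechanism).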
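The queue strategy mentioned above can be sketched roughly as follows. This is a minimal illustration, not the project's code: the ChatMessage shape, the estimateTokens heuristic (about 4 characters per token, which is a loose assumption and will be further off for Chinese text), and the trimHistoryToBudget helper are all hypothetical names introduced here. A real tokenizer or the model provider's token-counting facility would give more accurate numbers.

```typescript
// Hypothetical minimal shape; the project's real ChatMessage type lives in '@/types'.
interface ChatMessage {
  role: string;
  content: string;
}

// Rough token estimate (assumption): about 4 characters per token.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Keep the newest messages whose combined estimate fits within maxTokens.
// Older messages are dropped first, like dequeuing from the front of a queue.
function trimHistoryToBudget(history: ChatMessage[], maxTokens: number): ChatMessage[] {
  const queue = [...history]; // copy so the caller's array is not mutated
  let total = queue.reduce((sum, msg) => sum + estimateTokens(msg.content), 0);
  while (queue.length > 0 && total > maxTokens) {
    const oldest = queue.shift()!;
    total -= estimateTokens(oldest.content);
  }
  return queue;
}

const sample: ChatMessage[] = [
  { role: 'user', content: 'a'.repeat(40) }, // estimated at 10 tokens
  { role: 'ai', content: 'b'.repeat(40) },   // estimated at 10 tokens
  { role: 'user', content: 'c'.repeat(40) }, // estimated at 10 tokens
];

// With a budget of 20 estimated tokens, only the oldest message is dropped.
const trimmed = trimHistoryToBudget(sample, 20);
console.log(trimmed.length); // 2
```

Compared with keeping a fixed number of messages, a budget like this adapts to message length: a few long answers are trimmed sooner than many short ones.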

Step 2: Backend - Building a Unified Memory and Evaluation Hub

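This is one of today's core pieces. We will refactor the evaluate API so that one unified prompt handles every question type. In an article a few days ago we split the prompt into two versions, one for concept questions and one for coding questions, mainly because the two types need different tools and we wanted to avoid pointlessly invoking tools that were not needed.
That approach brought its own problems. For a change like today's, where chatHistory becomes part of the prompt, we would have to modify two places, even though the two prompts share much of their core logic. I had not thought that through at the time, so let's take this chance to consolidate. Go back to the evaluate API: open app/api/interview/evaluate/route.ts and apply the following changes.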


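// ... imports ...
import { ChatMessage } from '@/types';

// Helper: formats ChatMessage[] as plain text, keeping only the four most recent messages
function formatChatHistory(history: ChatMessage[] | undefined): string {
  // history comes from the request body, so guard against a missing or malformed value
  if (!Array.isArray(history) || history.length === 0) {
    return 'No previous conversation.';
  }
  // Keep only the 4 most recent messages (about 2 rounds) so the prompt does not grow too long
  const recentHistory = history.slice(-4);
  return recentHistory
    .map((msg) => {
      const prefix = msg.role === 'user' ? 'User' : 'AI';
      // Only the message text matters here; the evaluation object is ignored
      return `${prefix}: ${msg.content}`;
    })
    .join('\n');
}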

// Unified prompt template after the refactor
const unifiedPromptTemplate = `<role>
You are a world-class senior frontend technical interviewer providing a comprehensive evaluation.
</role>
<task>
Carefully analyze the user's answer based on the provided context. Your evaluation must be grounded in the evidence given.

- **If the question is conceptual (i.e., <judge0_result> contains 'not applicable for this question')**:
  - Base your evaluation on how well the <user_answer> aligns with the key points in <rag_context>.
  - The \`grounded_evidence\` field in your JSON response MUST be \`null\`.

- **If the question is a coding challenge (i.e., <rag_context> contains 'not applicable for this question')**:
  - Base your evaluation strictly on the objective <judge0_result> and an analysis of the <user_answer> (which is the user's code).
  - The \`grounded_evidence\` field in your JSON response MUST be populated with data from the execution results.

Always refer to the <conversation_history> for dialogue context.
Your response MUST be a single, valid JSON object following the schema. Answer in Traditional Chinese.
</task>
<json_schema>
{
  "summary": "string",
  "score": "number (1-5)",
  "grounded_evidence": { "tests_passed": "number|null", "tests_failed": "number|null", "stderr_excerpt": "string|null" } | null,
  "pros": ["string"],
  "cons": ["string"],
  "next_practice": ["string"]
}
</json_schema>
<conversation_history>
\${formattedHistory}
</conversation_history>
<question>
\${question}
</question>
<rag_context>
\${ragContext}
</rag_context>
<judge0_result>
\${judge0Result}
</judge0_result>
<user_answer>
\${userAnswer}
</user_answer>`;

export async function POST(request: Request) {
    // 1. Destructure history from the request body
    const { questionId, answer, history } = await request.json();

    // ... (code that looks up the question) ...

    // 2. Prepare every context variable, each with a default value
    const formattedHistory = formatChatHistory(history);
    let ragContext = 'not applicable for this question';
    let judge0ResultText = 'not applicable for this question';

    // 3. Fill in the content that matches the question type
    if (question.type === 'concept') {
        // ... run the RAG search and assign the result to ragContext ...
        ragContext =
            !ragError && ragData?.length > 0
                ? ragData.map((d: { content: string }) => `- ${d.content}`).join('\n')
                : 'No relevant context found.';
    } else if (question.type === 'code') {
        // ... call the Judge0 API and assign the result to judge0ResultText ...
        const judge0Result = await judge0Response.json();
        judge0ResultText = `Status: ${judge0Result.status.description}\nStdout: ${
            judge0Result.stdout || 'N/A'
        }\nStderr: ${judge0Result.stderr || 'N/A'}`;
    }

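    // 4. Fill the unified prompt template.
    // Function replacers insert each value verbatim, so sequences such as "$&" or "$1"
    // inside the user's code are not interpreted as special replacement patterns.
    const finalPrompt = unifiedPromptTemplate
      .replace(/\$\{formattedHistory\}/g, () => formattedHistory)
      .replace(/\$\{question\}/g, () => question.question)
      .replace(/\$\{ragContext\}/g, () => ragContext)
      .replace(/\$\{judge0Result\}/g, () => judge0ResultText)
      .replace(/\$\{userAnswer\}/g, () => answer);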

    // ... the code that calls Gemini and returns the stream follows ...
}

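Code summary

formatChatHistory helper: this function converts the chatHistory array of objects into a concise plain-text transcript, and uses .slice(-4) to keep only the two most recent rounds of conversation as a basic token management mechanism.
unifiedPromptTemplate: this is our new "brain". It contains every possible context block (conversation_history, rag_context, judge0_result), so it can handle both question types while sharing the core settings.
Preparing the context variables: in the POST function we first initialize ragContext and judge0ResultText to the placeholder string 'not applicable for this question'. Then, depending on the question type (concept or code), we fill in the matching content. This ensures the AI always receives explicit information for every field, even when a field is irrelevant to the current case.
Filling the template: finally, a series of .replace() calls fills every variable into the unified template, producing the finalPrompt that is sent to Gemini.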
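To make the flow in this summary concrete, here is a small end-to-end sketch: format a history, then fill a cut-down template. Several things here are assumptions for illustration only: the ChatMessage shape, the shortened demoTemplate, and the fillTemplate helper are hypothetical and do not appear in the project. The sketch also swaps the chained .replace() calls for a single pass with a function replacer. A function replacer inserts values verbatim, so text such as $& in the user's code is not treated as a replacement pattern, and a single pass means inserted content is never scanned again for placeholders.

```typescript
// Hypothetical minimal shape; the project's real ChatMessage type lives in '@/types'.
interface ChatMessage {
  role: string;
  content: string;
}

// Same idea as the helper in this article: keep the 4 most recent messages as plain text.
function formatChatHistory(history: ChatMessage[]): string {
  if (!Array.isArray(history) || history.length === 0) {
    return 'No previous conversation.';
  }
  return history
    .slice(-4)
    .map((msg) => `${msg.role === 'user' ? 'User' : 'AI'}: ${msg.content}`)
    .join('\n');
}

// Cut-down stand-in for the unified template (illustration only).
const demoTemplate = `<conversation_history>
\${formattedHistory}
</conversation_history>
<user_answer>
\${userAnswer}
</user_answer>`;

// Replace every ${name} placeholder in one pass. Unknown names are left untouched.
function fillTemplate(template: string, values: Record<string, string>): string {
  return template.replace(/\$\{(\w+)\}/g, (match: string, key: string) => {
    const value = Object.prototype.hasOwnProperty.call(values, key)
      ? values[key]
      : undefined;
    return value ?? match;
  });
}

const history: ChatMessage[] = [
  { role: 'user', content: 'What is a closure?' },
  { role: 'ai', content: 'Can you give an example?' },
];

const prompt = fillTemplate(demoTemplate, {
  formattedHistory: formatChatHistory(history),
  // "$&" would be expanded by a plain string replacement; here it survives as written.
  userAnswer: "const price = '$&';",
});

console.log(prompt);
```

The printed prompt contains the two history lines prefixed with User: and AI:, followed by the user's code exactly as submitted.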

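This refactor makes our backend logic clearer and more flexible, and it will make later changes or extensions to the interviewer's abilities a little easier.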

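Today's Recap

The content of the last few days has been fairly light, but each piece is a real improvement to our application. Today we not only gave the AI short-term memory across turns, we also refactored its core reasoning framework.

✅ Frontend upgrade: we learned how to attach conversation context to a request.
✅ Backend refactor: we merged two separate prompts into one unified template that is more powerful and easier to maintain.
✅ Prompt engineering upgrade: we combined all of the context (history, RAG, Judge0) and instructed the AI to adjust its output to the question type, so the conversation is coherent and grounded in evidence.
✅ Basic token management: we implemented the simplest form of context length control with .slice().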