⚡ *The AI Knowledge System Build Log* — this is not a purely technical article but an engineer's magical adventure. Programs are spells, pipelines are magic circles, and error messages are dark curses. Ready your wand (keyboard): today we step into the academy's foundational magic class and build a stable, scalable AI knowledge system.
Yesterday we finished the Daily Knowledge Digest subscription service, wiring paper fetching, summary generation, translation, HTML formatting, and automated sending into one complete flow, on a clean three-layer architecture:
Pipeline → Services → Storage
Today we dig into the most central part of this magic system: how to get the LLM to generate accurate, readable summaries and email content. This is the key spell that makes the Daily Knowledge Digest genuinely useful to its readers.
Inside the Pipeline, the core question is how to ask the LLM for a suitable summary and email body —
that is, the `generate_summary` mentioned in Day 9 | Email Pipeline Deep Dive (Part 2) – Building the Subscription System.
After going live, we ran into a few problems: smaller models (e.g. `llama3.2:3b`) would simply go on strike and tell me "I can't fulfill this request."
So we arrived at today's solution: split the flow into three stages:
1. Summarization (`summarize_paper`)
2. Translation (`llm_translate`)
3. HTML formatting (`format_html`)
This is more robust than stuffing one giant prompt into a single call, and it reduces the risk of token blow-ups.
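Before digging into each stage, the three-stage flow can be sketched as plain function composition. The stage bodies below are stand-in stubs, not the real implementations:

```python
# Minimal sketch of the three-stage flow; each stage body is a stub.
def summarize_stage(paper: dict) -> str:
    return f"Summary of {paper['title']}"          # Stage 1: summarize


def translate_stage(user: dict, summary: str) -> str:
    if not user.get("translate", False):           # Stage 2: optional translation
        return summary
    return f"[{user.get('user_language', 'English')}] {summary}"


def format_stage(summary: str) -> str:
    return f"<p>{summary}</p>"                     # Stage 3: HTML formatting


def run_three_stage(paper: dict, user: dict) -> str:
    summary = summarize_stage(paper)
    summary = translate_stage(user, summary)
    return format_stage(summary)


print(run_three_stage({"title": "Demo Paper"}, {"translate": False}))
# → <p>Summary of Demo Paper</p>
```

Each stage takes a string in and hands a string out, so a failing stage can fall back without wrecking the others.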
summarize_paper
def summarize_paper(paper_info: dict, user: dict, retries: int = 3) -> str:
    """Call the LLM to generate a summary; log a warning if it stays too short."""
    summary = ""
    while len(summary) < 100 and retries > 0:
        summary = llm_summary(paper=paper_info, user=user, max_words=1000)
        if summary and isinstance(summary, str) and len(summary) >= 100:
            break
        retries -= 1
    if len(summary) < 100:
        logger.warning(f"Summary seems too short: {summary}")
    return summary
llm_summary
import pathlib
from typing import Dict

from langchain_ollama import ChatOllama  # pip install langchain-ollama

PROMPT_FILE = pathlib.Path(__file__).parent / "prompt_template.txt"


def llm_summary(paper: Dict, user: dict, max_words: int = 300) -> str:
    if not paper:
        return "No paper provided."
    temperature = min(0.2, user.get("temperature", 0.5))  # cap temperature at 0.2
    title = paper.get("title", "No Title")
    # Join the author list, doubling literal braces so str.format cannot
    # mistake them for placeholders
    authors = paper.get("authors") or []
    authors_str = ", ".join(a.replace("{", "{{").replace("}", "}}") for a in authors)
    content = paper.get("raw_content") or paper.get("abstract", "")
    content_type = "Full Content" if paper.get("raw_content") else "Abstract"
    # Load the prompt template
    template_text = PROMPT_FILE.read_text(encoding="utf-8")
    # gpt-oss context limit is roughly 8192 tokens:
    #   fixed prompt part ≈ 250, title + authors ≈ 40
    MAX_CONTENT_TOKENS = 6000  # computed maximum tokens for the paper body
    AVG_TOKEN_LEN = 4.5  # average characters per token
    max_content_chars = int(MAX_CONTENT_TOKENS * AVG_TOKEN_LEN)
    content = content[:max_content_chars]
    prompt_template = template_text.format(
        max_words=max_words,
        content_type=content_type,
        title=title,
        authors=authors_str,
        content=content,
    )
    chat_model = ChatOllama(
        model=settings.SUMMARY_MODEL_NAME,  # `settings` is the project config module
        temperature=temperature,
        base_url=settings.OLLAMA_API_URL,
        request_kwargs={"timeout": 300},  # timeout in seconds
        reset_context=True,  # ⚡ clear the session on every call
    )
    try:
        resp = chat_model.invoke(prompt_template)
        summary = resp.content.strip()
        # Drop blank lines
        summary = "\n".join(line for line in summary.splitlines() if line.strip())
        return summary
    except Exception as e:
        return f"<p><strong>Summary generation failed:</strong> {e}</p>"
Summary prompt
You are a professional research assistant.
Summarize the following paper concisely, in no more than {max_words} words.
Keep it readable for an email newsletter.
(Note: the text provided is the paper's {content_type})
Instructions:
1. Base your answer STRICTLY on the provided paper excerpts.
2. Maintain academic accuracy and precision.
3. Structure your answer logically with clear paragraphs when appropriate.
4. DO NOT include any introductory paragraphs about the authors, affiliations, or background. Focus ONLY on the paper's content, key findings, methods, and important points.
Remember:
- Do NOT make up information not present in the excerpts.
- Do NOT use knowledge beyond what's provided in the paper excerpts.
- Always acknowledge uncertainty when the excerpts are ambiguous or incomplete.
- Prioritize relevance and clarity in your response.
Paper:
Title: {title}
Authors: {authors}
Content:
{content}
Include key findings, methods, and any important points as bullet points or numbered lists.
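An aside on the `{{`/`}}` doubling used when joining author names: `str.format` treats literal braces as placeholders, so if an author string ever goes through a format pass, unescaped braces blow up. A tiny standalone demo (the author name is made up):

```python
template = "Authors: {authors}\nTask: summarize in {max_words} words"

# If the author string is spliced into the template and the template is
# formatted afterwards, literal braces get parsed as placeholders:
authors = "Alice {MLLab}"
broken = template.replace("{authors}", authors)
try:
    broken.format(max_words=300)
except KeyError as e:
    print("format choked on:", e)  # the brace content is read as a field name

# Doubling the braces first makes the format pass safe:
escaped = authors.replace("{", "{{").replace("}", "}}")
safe = template.replace("{authors}", escaped)
print(safe.format(max_words=300))
# → Authors: Alice {MLLab}
#   Task: summarize in 300 words
```

Values passed *as arguments* to `.format()` are safe either way; the doubling only matters once a string becomes part of a template that gets formatted.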
llm_translate
def llm_translate(user: dict, summary: str) -> str:
    is_translate = user.get("translate", False)
    if not is_translate:
        return summary
    user_language = user.get("user_language", "English")
    temperature = min(user.get("temperature", 0.5), 0.2)  # cap temperature at 0.2
    translation_instruction = (
        f"SUMMARIZE AND TRANSLATE THE FOLLOWING PAPER INTO {user_language.upper()} ONLY. "
        "Do NOT output English under any circumstances.\n\n"
        f"{summary}"
    )
    chat_model = ChatOllama(
        model="gpt-oss:20b",
        temperature=temperature,
        base_url="http://ollama:11434",
        request_kwargs={"timeout": 300},  # timeout in seconds
        reset_context=True,  # ⚡ clear the session on every call
    )
    try:
        resp = chat_model.invoke(translation_instruction)
        trans_summary = resp.content.strip()
        trans_summary = "\n".join(
            line for line in trans_summary.splitlines() if line.strip()
        )
        # fallback check: the model refused to translate
        if trans_summary.lower().startswith("i can't"):
            return f"[Fallback] Could not translate; keeping the original:\n{summary}"
        return trans_summary
    except Exception as e:
        return f"[Fallback] Translation failed: {e}\nOriginal:\n{summary}"
Translation prompt
translation_instruction = (
    f"SUMMARIZE AND TRANSLATE THE FOLLOWING PAPER INTO {user_language.upper()} ONLY. "
    "Do NOT output English under any circumstances.\n\n"
    f"{summary}"
)
fallback
except Exception as e:
    return f"[Fallback] Translation failed: {e}\nOriginal:\n{summary}"
format_html
def format_html(
    paper_info: dict,
    idx: int,
    summary: str,
) -> str:
    pdf_url = paper_info.get("pdf_url")
    pdf_link_html = (
        f'<a href="{pdf_url}" target="_blank">Preview PDF</a>' if pdf_url else "N/A"
    )
    summary = llm_html_format(summary)
    return f"""
    <div class="paper-summary">
        <div class="paper-title">{idx}. {paper_info["title"]}</div>
        <div class="paper-meta">
            <strong>Authors:</strong> {", ".join(paper_info.get("authors", []))} <br>
            <strong>Published:</strong> {paper_info.get("published_date", "N/A")} <br>
            <strong>PDF:</strong> {pdf_link_html}
        </div>
        <div class="paper-abstract">
            {summary}
        </div>
    </div>
    """
llm_html_format
def llm_html_format(summary: str) -> str:
    if not summary or not isinstance(summary, str):
        return "Summary not available."
    chat_model = ChatOllama(
        model=settings.SUMMARY_MODEL_NAME,
        temperature=0.0,
        base_url=settings.OLLAMA_API_URL,
        request_kwargs={"timeout": 300},  # timeout in seconds
        reset_context=True,  # ⚡ clear the session on every call
    )
    html_instruction = (
        "Please convert the following summary into a well-structured HTML format suitable for email newsletters. "
        "Use appropriate HTML tags such as <p>, <strong>, <em>, and <ul>/<li> for lists. "
        "Ensure the HTML is clean and free of any unnecessary tags or attributes. "
        "Do not include any CSS or JavaScript. Only provide the HTML content.\n\n"
        f"{summary}"
    )
    try:
        resp = chat_model.invoke(html_instruction)
        return resp.content.strip()
    except Exception as e:
        return f"<p><strong>HTML formatting failed:</strong> {e}</p>"
HTML formatting prompt
html_instruction = (
    "Please convert the following summary into a well-structured HTML format suitable for email newsletters. "
    "Use appropriate HTML tags such as <p>, <strong>, <em>, and <ul>/<li> for lists. "
    "Ensure the HTML is clean and free of any unnecessary tags or attributes. "
    "Do not include any CSS or JavaScript. Only provide the HTML content.\n\n"
    f"{summary}"
)
import time
from string import Template

from prefect import get_run_logger


def generate_summary(
    papers_and_content: tuple[list[dict], dict[str, str]], user: dict
) -> str:
    """Generate an LLM summary for each paper and assemble the final HTML."""
    logger = get_run_logger()
    start = time.time()
    papers, content_map = papers_and_content
    if not papers:
        logger.info("No papers to summarize.")
        return "<p>No new papers today.</p>"
    logger.info(f"Generating summary for {len(papers)} papers...")
    papers_html = ""
    for idx, p in enumerate(papers, start=1):
        paper_info = fetch_paper_info(p, content_map)
        # Stage 1: summarize
        summary = summarize_paper(paper_info, user)
        # Stage 2: translate
        summary = llm_translate(user, summary)
        # Stage 3: HTML formatting
        papers_html += format_html(paper_info, idx, summary)
    # The beauty potion: template.html
    template_path = pathlib.Path(__file__).parent / "template.html"
    template_text = template_path.read_text(encoding="utf-8")
    final_html = Template(template_text).substitute(papers_html=papers_html)
    return final_html
Sometimes the LLM replies with something too short, or gets stuck, so we added a retry:
def summarize_paper(paper, user, retries=3):
    summary = ""
    while len(summary) < 100 and retries > 0:
        summary = llm_summary(paper=paper, user=user)
        retries -= 1
    return summary
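One optional refinement, not in the original code: sleep between attempts so the model server gets a breather. A sketch, with the summarizer passed in as a function:

```python
import time


def summarize_with_retry(summarize_fn, paper, user,
                         retries: int = 3, min_len: int = 100,
                         delay: float = 1.0) -> str:
    """Retry until the summary looks long enough; linear backoff between tries."""
    summary = ""
    for attempt in range(retries):
        summary = summarize_fn(paper=paper, user=user)
        if isinstance(summary, str) and len(summary) >= min_len:
            return summary
        time.sleep(delay * (attempt + 1))  # wait a bit longer each retry
    return summary  # may still be short; the caller can log a warning


# Example with a flaky stand-in summarizer that succeeds on the second try:
attempts = []
def flaky(paper, user):
    attempts.append(1)
    return "short" if len(attempts) < 2 else "x" * 120


print(len(summarize_with_retry(flaky, {}, {}, delay=0.0)))  # → 120
```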
Sometimes the Ollama log shows:
time=2025-09-19T02:14:50.027Z level=WARN source=runner.go:160 msg="truncating input prompt" limit=8192 prompt=15823 keep=4 new=8192
Taking `gpt-oss:20b` as an example, the context window is about 8192 tokens, so overly long papers get silently truncated.
Solution: estimate the token budget up front and cut the paper body down to `MAX_CONTENT_TOKENS` before building the prompt.
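The budget arithmetic behind that truncation, as a standalone worked example (4.5 characters per token is a rough assumption, not a measured value):

```python
# Context window ≈ 8192 tokens; reserve ~250 for the fixed prompt part and
# ~40 for title/authors, then keep a safety margin → 6000 tokens for the body.
MAX_CONTENT_TOKENS = 6000
AVG_TOKEN_LEN = 4.5  # assumed average characters per token

max_content_chars = int(MAX_CONTENT_TOKENS * AVG_TOKEN_LEN)
print(max_content_chars)  # → 27000

paper_body = "x" * 50_000              # a paper longer than the budget
paper_body = paper_body[:max_content_chars]
print(len(paper_body))                 # → 27000, comfortably under the limit
```

Truncating by characters is coarser than counting real tokens, but it avoids an extra tokenizer dependency and leaves headroom against the 8192-token limit from the log above.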
Some models (e.g. `llama3.2:3b`) flatly reply:
"I can't fulfill this request..."
Solution: switch to a bigger model (`gpt-oss:20b` just hits the spot), plus a fallback:
try:
    resp = chat_model.invoke(translation_instruction)
    trans_summary = resp.content.strip()
    trans_summary = "\n".join(
        line for line in trans_summary.splitlines() if line.strip()
    )
    # fallback check
    if trans_summary.lower().startswith("i can't"):
        return f"[Fallback] Could not translate; keeping the original:\n{summary}"
    return trans_summary
except Exception as e:
    return f"[Fallback] Translation failed: {e}\nOriginal:\n{summary}"
Pipeline (pipeline.py)
├── Services (fetch papers / summarize / send mail)
├── Storage (MinIO, Qdrant)
└── Config & Utils (logger, Firebase)
With that, the Daily Knowledge Digest is essentially complete:
Next up, we could add: