Day 7: A First Look at the Perplexity API and Testing the fact_checker Example

17th Ironman Contest

poyuanchih

2025-09-22 17:23:50


Situation:

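We want paired (context, question, answer) triples, assembled into a dataset for validating RAG methods later on.
An earlier post (day5) completed topic 1.1: generating (question, answer, context) pairs starting from a context.
The current idea is to start from the question instead, and call an LLM that can do web search to produce the context and answer for us.
- Yesterday (day6: pdf2txt) we explored getting exam questions (question) out of a PDF parser
- Today we look at LLMs that can do web search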

Task:

When it comes to LLMs that can search the web, the first one that comes to mind is Perplexity, so today we explore the Perplexity API.

Action:

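A Perplexity API key can be requested on the official site, but it requires a credit card, and so far I have not seen any free-credit option, ~~so to produce this post I topped up NTD 100~~
- Once you have the API key, add PPLX_API_KEY=xxxxx to the project's .env
The llama_index examples include sample usage for Perplexity.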

setup

import os
from dotenv import find_dotenv, load_dotenv
_ = load_dotenv(find_dotenv())

PPLX_API_KEY = os.getenv("PPLX_API_KEY")

from llama_index.llms.perplexity import Perplexity

llm = Perplexity(api_key=PPLX_API_KEY, model="sonar-pro", temperature=0.0)

First we read the Perplexity API key from .env.
Then we initialize the llm with LlamaIndex's Perplexity class, using the sonar-pro model.
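Because os.getenv returns None when a variable is missing, a small guard can fail early, before any API call is attempted. The following is a minimal sketch only: require_env is a hypothetical helper name, not part of llama_index or python-dotenv, and it assumes load_dotenv has already been called as in the setup above.

```python
import os


def require_env(name: str) -> str:
    """Return the value of environment variable `name`.

    Raises RuntimeError if the variable is unset or empty.
    `require_env` is a hypothetical helper for illustration only.
    """
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"{name} is not set; add it to .env or export it first")
    return value


# Usage: report whether the key is available without printing the key itself.
try:
    key = require_env("PPLX_API_KEY")
    print("PPLX_API_KEY loaded, length:", len(key))
except RuntimeError as err:
    print(err)
```

Failing at this point gives a clearer message than an authentication error returned later by the API.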

ChatMessage

# Import the ChatMessage class from the llama_index library.
from llama_index.core.llms import ChatMessage

# Create a list of dictionaries where each dictionary represents a chat message.
# Each dictionary contains a 'role' key (e.g., system or user) and a 'content' key with the corresponding message.
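messages_dict = [
    # System prompt: "Reply to the user in Traditional Chinese"
    {"role": "system", "content": "使用繁體中文回復使用者"},
    {
        "role": "user",
        # User prompt: "Can you tell me Tesla's stock price?"
        "content": "可以告訴我特斯拉的股價嗎?",
    },
]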

# Convert each dictionary in the list to a ChatMessage object using unpacking (**msg) in a list comprehension.
messages = [ChatMessage(**msg) for msg in messages_dict]

# Print the list of ChatMessage objects to verify the conversion.
print(messages)

We built a list of LlamaIndex ChatMessage objects containing the system prompt and the user prompt.

[ChatMessage(role=<MessageRole.SYSTEM: 'system'>, additional_kwargs={}, blocks=[TextBlock(block_type='text', text='使用繁體中文回復使用者')]), ChatMessage(role=<MessageRole.USER: 'user'>, additional_kwargs={}, blocks=[TextBlock(block_type='text', text='可以告訴我特斯拉的股價嗎?')])]

call

response = llm.chat(messages)
print(f"response type: {type(response)}")
print(response)

Here is the response:

response type: <class 'llama_index.core.base.llms.types.ChatResponse'>

assistant: 截至美東時間2025年9月19日收盤，**特斯拉（TSLA）股價為426.07美元**，單日漲幅約2.21%[1]。

補充資訊：
- 近期特斯拉股價波動較大，2025年初至今已較2024年末高點下跌約40%，主要原因包括歐洲和中國銷量下滑、馬斯克個人爭議及競爭加劇[4]。
- 特斯拉目前市值約為**1.417萬億美元**[1]。
- 投資人關注特斯拉未來自動駕駛出租車（Robotaxi）等新業務，市場仍期待其技術創新帶來新成長[4]。

如需即時股價，建議查詢美股交易平台或金融新聞網站，因股價會隨市場即時波動。

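The return value is a llama_index ChatResponse.
The reply gives a TSLA close of 426.07 USD on 2025-09-19 (US Eastern time), up about 2.21% on the day. I checked the reply by hand and it is accurate.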

ChatResponse.message

def get_response_message(response):
    """Collect the role and content blocks of a ChatResponse message into a dict."""
    message = response.message
    blocks = [
        {'block_type': block.block_type, 'text': block.text}
        for block in message.blocks
    ]
    return {'role': message.role, 'blocks': blocks}

def get_response_raw(response):
    """Pull the model name, citation URLs and search results out of ChatResponse.raw."""
    rv = {}

    raw = response.raw  # dictionary
    rv['model'] = raw['model']
    rv['num_urls'] = len(raw['citations'])
    # Each search result has 'title', 'url', 'date', 'last_updated' and 'snippet'.
    rv['search_results'] = raw['search_results']
    rv['urls'] = raw['citations']
    return rv

rv = get_response_message(response)
rv['blocks'][0]['text'].replace('\n', '').split('。')

['截至2025年9月21日，特斯拉（TSLA）股價為426.070美元，總市值約為1.417萬億美元[3]',
'根據最新財報，特斯拉今年上半年總收入為418億美元，較去年同期下滑11%，歸母淨利潤同比下跌30%[3]',
'近期特斯拉股價波動明顯，9月19日收盤時漲超2%，但在9月18日則跌超2%[1]',
'特斯拉的市盈率目前高達180.7倍，遠高於納斯達克100科技指數和輝達等科技巨頭[2]',
'部分分析認為，若盈利持續萎縮，股價可能面臨較大下行壓力[2]',
'如需即時查詢特斯拉股價，建議參考美股交易平台或主流財經網站',
'']

A ChatResponse mainly consists of message and raw.
The returned text carries citation markers such as [1].
- A result like this could probably be reproduced with LlamaIndex's citation_query_engine
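To tie the citation markers in the text back to URLs, a small regex pass is enough. The sketch below assumes that marker [n] refers to the n-th entry (1-based) of the citations list; that matches the output above, where [3] lines up with the third search result, but it is an assumption rather than documented behavior. cited_urls is a hypothetical helper name and the URLs are placeholders.

```python
import re

# Matches bracketed citation markers such as [1] or [12].
CITATION_PATTERN = re.compile(r"\[(\d+)\]")


def cited_urls(text: str, urls: list[str]) -> dict[int, str]:
    """Map each citation number found in `text` to its URL.

    Assumes marker [n] refers to urls[n - 1]. Markers outside that
    range are skipped rather than guessed.
    """
    mapping = {}
    for match in CITATION_PATTERN.finditer(text):
        number = int(match.group(1))
        if 1 <= number <= len(urls):
            mapping[number] = urls[number - 1]
    return mapping


# Usage with placeholder data (not real API output).
sample_text = "TSLA closed higher[1]. Its P/E ratio is elevated[3]."
sample_urls = [
    "https://example.com/a",
    "https://example.com/b",
    "https://example.com/c",
]
print(cited_urls(sample_text, sample_urls))
# {1: 'https://example.com/a', 3: 'https://example.com/c'}
```

With a real response, the inputs would be rv['blocks'][0]['text'] from get_response_message and the 'urls' entry from get_response_raw.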

ChatResponse.raw

get_response_raw(response)

{'model': 'sonar-pro',
'num_urls': 10,
'search_results': [{'title': '2025年9月22日美股上市公司最新市值前十排名榜',
'url': 'https://usstock.cngold.org/c/2025-09-22/c10089255.html',
'date': '2025-09-22',
'last_updated': '2025-09-22',
'snippet': '2025年9月22日美股上市公司最新市值前十排名榜 ... 美东时间上周五（9月19日）收盘，美股市值排名前十公司来看，苹果涨超3%，特斯拉涨超2%，微软、谷歌-C、谷歌-A ...'},
{'title': '為什麼說特斯拉的股價有可能會暴跌70%？ | 美股 - 鉅亨號',
'url': 'https://hao.cnyes.com/post/184164',
'date': '2025-07-28',
'last_updated': '2025-09-12',
'snippet': '特斯拉第二季度的每股收益（EPS）同比下降了18%，第一季度下降了71%。其過去12個月的每股收益目前為1.67美元，使其股票的市盈率達到令人瞠目的180.7倍。這 ...'},
{'title': '特斯拉，大消息！ - 每日经济新闻',
'url': 'https://www.nbd.com.cn/articles/2025-09-22/4068955.html',
'date': '2025-09-22',
'last_updated': '2025-09-22',
'snippet': '据红星资本局，因FSD（完全自动驾驶）功能未兑现，多名车主起诉特斯拉（TSLA，股价：426.070美元；总市值：1.417万亿美元）欺诈，要求赔款事件有新进展。 9月21日， ...'},
{'title': '特斯拉股價10年大漲百倍,Tesla股票值得買嗎？怎麼買特斯拉？',
'url': 'https://www.mitrade.com/zh/insights/shares/us-stock-recommendation/buy-tesla',
'date': '2025-03-06',
'last_updated': '2025-09-22',
'snippet': '當前TSLA股價約為275美元，即特斯拉股票一張9028新台幣。 Mitrade平台提供特斯拉股票（TSLA.US），最低交易量為0.5手（0.5股），最高槓桿為10倍 ...'},

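raw mainly holds search_results, each with url, title, date, last_updated and snippet.
The snippets here do not seem to contain all the information in the response, so Perplexity must have read more than these snippets in order to answer correctly; it is also possible that the snippets returned here are truncated.
Some of the URLs mostly hold stale information (for example, one page quotes TSLA at about 275 USD), but the model did not rely on them.
I ran the search once in Chinese and once in English; subjectively, the English run returned more links to stock-quote sites rather than news articles.
So it looks like we cannot use Perplexity directly to build the dataset we want, unless we find a way to fetch the page content from the URLs it returns, in which case there seems to be little reason to use Perplexity specifically.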
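To make the limitation concrete, here is a rough sketch of what building a record straight from the snippets would look like. build_record is a hypothetical helper, the not_before cutoff is an assumed way of dropping stale pages, and the sample data is placeholder text. The resulting context is just the truncated snippets joined together, which is why it falls short as a RAG context.

```python
from datetime import date


def build_record(question, answer, search_results, not_before):
    """Assemble a candidate (question, answer, context) record from snippets.

    Keeps only results dated on or after `not_before`. Results without a
    parsable ISO 'date' field are dropped. Hypothetical helper for illustration.
    """
    fresh = []
    for result in search_results:
        try:
            published = date.fromisoformat(result.get("date") or "")
        except ValueError:
            continue  # missing or malformed date
        if published >= not_before:
            fresh.append(result)
    return {
        "question": question,
        "answer": answer,
        "context": "\n".join(r["snippet"] for r in fresh),
        "urls": [r["url"] for r in fresh],
    }


# Usage with placeholder search results (not real API output).
results = [
    {"url": "https://example.com/new", "date": "2025-09-22", "snippet": "recent snippet ..."},
    {"url": "https://example.com/old", "date": "2025-03-06", "snippet": "stale snippet ..."},
]
record = build_record("What is the price?", "answer text[1]", results, date(2025, 9, 1))
print(record["urls"])
# ['https://example.com/new']
```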

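Next, let's look at another of Perplexity's official examples
- This one: fact_checker.py
- We will reimplement it using the Perplexity class packaged by llama-index

First the ChatMessage inputs: the example's system prompt, a user prompt, and ~~our hopes for the future~~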

system_prompt = (
    "You are a professional fact-checker with extensive research capabilities. "
    "Your task is to evaluate claims or articles for factual accuracy. "
    "Focus on identifying false, misleading, or unsubstantiated claims."
)

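# Still fake news for now; who knows in five years.
# The text says: "Tesla has risen to 1700 USD, so investors holding 280 shares are financially free!"
text = (
    "特斯拉已經漲到1700美元，手握 280 股的投資人財富自由啦！"
)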

user_prompt = f"Fact check the following text and identify any false or misleading claims:\n\n{text}"

call it!

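# Rebuild the message list from the new prompts; otherwise the earlier
# stock-price messages would be sent again.
messages = [
    ChatMessage(role="system", content=system_prompt),
    ChatMessage(role="user", content=user_prompt),
]
response = llm.chat(messages)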
print(f"response type: {type(response)}")
print(response)

Here is the response:

assistant: 這段話「特斯拉已經漲到1700美元，手握 280 股的投資人財富自由啦！」包含兩個主要可查證的事實陳述：

1. **特斯拉已經漲到1700美元**
2. **手握280股的投資人財富自由**

### 1. 特斯拉已經漲到1700美元

這一說法**不正確**。根據2025年9月22日的最新美股資訊，特斯拉近期雖然有上漲，但**並未達到1700美元**。主流金融媒體和即時市值榜單均未顯示特斯拉股價接近或突破1700美元[1][3]。目前券商如派珀·桑德勒（Piper Sandler）對特斯拉的目標價僅上調至500美元[3]，遠低於1700美元。

此外，關於「1700美元」的數字，來源多為**未來預測**，例如方舟投資（Ark Invest）等機構對2029年或2030年的樂觀預估[2]，而非現實股價。因此，將預測數字誤當作現價，是**誤導性陳述**。

### 2. 手握280股的投資人財富自由

這一說法**屬於誇張或主觀判斷**，並非事實陳述。根據媒體報導，這種說法來自YouTuber或投資社群的討論，假設特斯拉未來能達到1700美元，280股價值約1428萬新台幣（約45萬美元），然後全數投入美股ETF，假設年化報酬率8%，即可「財富自由」[2]。

但這裡有幾個問題：
- **1700美元是預測值，不是現價**[2]。
- **財富自由的定義因人而異**，45萬美元是否足夠取決於個人生活標準、地區、支出等。
- 這種說法**忽略了投資風險**，特斯拉股價波動極大，過去幾年曾多次腰斬[2]。

### 結論

- **「特斯拉已經漲到1700美元」為錯誤陳述**，目前股價遠低於此數字[1][3]。
- **「手握280股的投資人財富自由」為誇大或主觀說法**，且建立在不確定的未來預測和個人定義之上[2]。

這段話屬於**誤導性或不實資訊**，不應作為投資決策依據。

Check the raw info:

get_response_raw(response)

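I found the result quite impressive. The model split the text into two claims: it rated "Tesla has risen to 1700 USD" as incorrect, noting that 1700 USD is a forecast for 2029 or 2030 rather than a current price, and it rated "holders of 280 shares are financially free" as exaggerated or subjective. A plain web search for this text really does turn up posts making these statements, so the model can evidently tell a forecast from a fact.

Finally, we run it once more in JSON mode, mainly following the official example.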

from pydantic import BaseModel, Field
from typing import List

class Claim(BaseModel):
    """Model for representing a single claim and its fact check."""
    claim: str = Field(description="The specific claim extracted from the text")
    rating: str = Field(description="Rating of the claim: TRUE, FALSE, MISLEADING, or UNVERIFIABLE")
    explanation: str = Field(description="Detailed explanation with supporting evidence")
    sources: List[str] = Field(description="List of sources used to verify the claim")


class FactCheckResult(BaseModel):
    """Model for the complete fact check result."""
    overall_rating: str = Field(description="Overall rating: MOSTLY_TRUE, MIXED, or MOSTLY_FALSE")
    summary: str = Field(description="Brief summary of the overall findings")
    claims: List[Claim] = Field(description="List of specific claims and their fact checks")

llm = Perplexity(
    api_key=PPLX_API_KEY,
    model="sonar-pro",
    temperature=0.0,
    additional_kwargs={
        "response_format": {
            "type": "json_schema",
            "json_schema": {"schema": FactCheckResult.model_json_schema()},
        }
    },
)

response = llm.chat(
    messages=messages,
)

from pprint import pprint
pprint(response.message.blocks[0].text)

('{"overall_rating":"MOSTLY_FALSE","summary":"The claim that Tesla has already '
 'reached $1,700 per share as of September 22, 2025 is false. The assertion '
 'that holding 280 shares guarantees financial freedom is misleading, as it is '
 'based on speculative future price targets and ignores investment '
 'risks.","claims":[{"claim":"特斯拉已經漲到1700美元 (Tesla has already risen to $1,700 '
 'per share)","rating":"FALSE","explanation":"There is no evidence that '
 "Tesla's stock price has reached $1,700 as of September 22, 2025. Recent "
 "reports indicate Tesla's stock price increased by about 2% in the past week, "
 'but analyst price targets are much lower, with Piper Sandler raising its '
 'target to $500. Forecasts of $1,700 or higher are speculative and refer to '
 'possible future scenarios, not current '
 'prices.","sources":["[1]","[3]"]},{"claim":"手握 280 股的投資人財富自由啦！ (Holding 280 '
 'shares of Tesla means financial '
 'freedom!)","rating":"MISLEADING","explanation":"The idea that owning 280 '
 'shares of Tesla guarantees financial freedom is based on speculative future '
 'price targets and assumptions about future returns. While some analysts and '
 'AI models predict Tesla could reach $1,700 or more in several years, these '
 'are not guarantees and ignore the risks and volatility associated with Tesla '
 'stock. The claim oversimplifies the path to financial freedom and is not '
 'supported by current facts.","sources":["[2]"]}]}')

It works out on its own how many claims the text under review contains, then summarizes the verification result.
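Since the JSON-mode reply arrives as a plain string, it still has to be parsed before the claims can be used. The sketch below does this with the standard library only; summarize_fact_check is a hypothetical helper, and the sample string is a shortened stand-in shaped like the output above, not real API output. With pydantic v2, FactCheckResult.model_validate_json(text) would be an alternative that also validates the fields.

```python
import json
from collections import Counter


def summarize_fact_check(json_text: str) -> dict:
    """Parse a FactCheckResult-shaped JSON string and tally claim ratings."""
    result = json.loads(json_text)
    ratings = Counter(claim["rating"] for claim in result["claims"])
    return {
        "overall_rating": result["overall_rating"],
        "num_claims": len(result["claims"]),
        "ratings": dict(ratings),
    }


# Usage with a shortened placeholder payload in the same shape as the reply.
sample = json.dumps({
    "overall_rating": "MOSTLY_FALSE",
    "summary": "placeholder summary",
    "claims": [
        {"claim": "claim A", "rating": "FALSE", "explanation": "...", "sources": ["[1]"]},
        {"claim": "claim B", "rating": "MISLEADING", "explanation": "...", "sources": ["[2]"]},
    ],
})
print(summarize_fact_check(sample))
# {'overall_rating': 'MOSTLY_FALSE', 'num_claims': 2, 'ratings': {'FALSE': 1, 'MISLEADING': 1}}
```

With a real response, the input would be response.message.blocks[0].text.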

Wrapping it as a llama-index tool

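This is covered in the llama-index Perplexity documentation: in short, wrap the Perplexity call in a function named query_perplexity, then turn it into a tool with FunctionTool.from_defaults. We will probably do the same later on.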
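As a rough illustration of the wrapping step, the sketch below builds a single-argument query_perplexity function around an injected chat callable, so it can be tried without an API key. make_query_perplexity, the chat_fn shape, and the system prompt text are all assumptions made for this sketch; they are not taken from the llama-index documentation.

```python
def make_query_perplexity(chat_fn):
    """Build a single-argument query function around a chat callable.

    `chat_fn` is assumed to take a list of {"role", "content"} dicts and
    return an object whose string form is the reply. This shape is an
    assumption for the sketch, not the llama_index interface.
    """

    def query_perplexity(query: str) -> str:
        """Answer a question using a web-search-capable LLM."""
        messages = [
            {"role": "system", "content": "Answer concisely and cite sources."},
            {"role": "user", "content": query},
        ]
        return str(chat_fn(messages))

    return query_perplexity


# Usage with a fake chat function that echoes the user message.
def fake_chat(messages):
    return "echo: " + messages[-1]["content"]


query_perplexity = make_query_perplexity(fake_chat)
print(query_perplexity("What is the TSLA price?"))
# echo: What is the TSLA price?
```

With the real llm, chat_fn could be something like lambda msgs: llm.chat([ChatMessage(**m) for m in msgs]), and the resulting function could then be passed to FunctionTool.from_defaults(fn=query_perplexity).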

summary

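Today we tried calling the Perplexity API through llama-index
- The return is mainly a response with citation markers, plus the corresponding citations (url, title, snippet)
- The returned snippets do not feel complete enough to serve directly as context, so we still need another way to build the data
Next we explored Perplexity's fact_checker example
- The results struck me as quite impressive
- Finally we tested Perplexity's JSON mode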

其他

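LlamaIndex's official documentation site has been acting up these past few days, so when citing references from now on we will try to link to the official GitHub repository instead, hoping that proves more stable.
Today's haphazard calls cost roughly NTD 3.33 in total, including assorted trial and error along the way, so we should be able to afford a large batch of fact-check calls later.
Perplexity's official examples include other interesting applications, such as the Disease Information App: you query a disease and it returns related information (overview, causes, treatments, citations).
Today we only used the ordinary search model; there are also Reasoning and Research models waiting to be explored...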