Yesterday we set up Langfuse so that we could see, on Langfuse, what actually happens under the hood after a LlamaIndex QueryEngine.query call.
The results of today's example on Langfuse are here.
Today we'll dig one step further into how this complex TracingTree gets built.
from llama_index.core.instrumentation import get_dispatcher
from llama_index.core.instrumentation.span_handlers import SimpleSpanHandler
# root dispatcher
root_dispatcher = get_dispatcher()
# register span handler
simple_span_handler = SimpleSpanHandler()
root_dispatcher.add_span_handler(simple_span_handler)
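Besides span handlers, the same dispatcher also accepts event handlers (the Event/EventHandler pair described in the sample documents below). As a minimal sketch, assuming the BaseEventHandler interface from llama_index.core.instrumentation (the name PrintEventHandler is made up for illustration):
from llama_index.core.instrumentation.event_handlers import BaseEventHandler

class PrintEventHandler(BaseEventHandler):
    """Hypothetical handler: print every event the dispatcher emits."""

    @classmethod
    def class_name(cls) -> str:
        return "PrintEventHandler"

    def handle(self, event) -> None:
        # every event carries at least a class name and an id_
        print(event.class_name(), event.id_)

root_dispatcher.add_event_handler(PrintEventHandler())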
from dotenv import find_dotenv, load_dotenv
from langfuse import get_client
_ = load_dotenv(find_dotenv())
langfuse = get_client()
from openinference.instrumentation.llama_index import LlamaIndexInstrumentor
# Initialize LlamaIndex instrumentation
LlamaIndexInstrumentor().instrument()
from llama_index.core import Document
text_list = [
'Langfuse is an open source LLM engineering platform to help teams collaboratively debug, analyze and iterate on their LLM Applications. '
'With the Langfuse integration, you can track and monitor performance, traces, and metrics of your LlamaIndex application. '
'Detailed traces of the context augmentation and the LLM querying processes are captured and can be inspected directly in the Langfuse UI.',
'Langfuse 真香',
'The instrumentation module (available in llama-index v0.10.20 and later) is meant to replace the legacy callbacks module.',
'Listed below are the core classes as well as their brief description of the instrumentation module: '
'Event — represents a single moment in time that a certain occurrence took place within the execution of the application’s code. '
'EventHandler — listen to the occurrences of Event’s and execute code logic at these moments in time. '
'Span — represents the execution flow of a particular part in the application’s code and thus contains Event’s. '
'SpanHandler — is responsible for the entering, exiting, and dropping (i.e., early exiting due to error) of Span’s. '
'Dispatcher — emits Event’s as well as signals to enter/exit/drop a Span to the appropriate handlers.',
]
print(f"len of text_list: {len(text_list)}")
documents = [Document(text=t) for t in text_list]
from llama_index.llms.openai import OpenAI
llm = OpenAI(model="gpt-5-mini", temperature=0.0)
from llama_index.core import VectorStoreIndex
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine(llm=llm)
query = '介紹一下 LlamaIndex 的 instrumentation module?'
response = query_engine.query(query)
simple_span_handler.print_trace_trees()
The result:
SentenceSplitter.__call__-4b1835eb-4969-4b55-a794-862a1c895198 (0.004439)
└── SentenceSplitter._parse_nodes-0039126f-da8c-402a-a2c2-22fa4cd028fe (0.003439)
    ├── SentenceSplitter.split_text_metadata_aware-348258d2-4a2e-427a-8a07-10ff7e37241c (0.001045)
    ├── SentenceSplitter.split_text_metadata_aware-3b67633d-87ce-43d7-ad57-0e7b375390fd (0.000247)
    ├── SentenceSplitter.split_text_metadata_aware-e8225c5f-960a-4db8-89bd-2464afd5301e (0.000364)
    └── SentenceSplitter.split_text_metadata_aware-ab1fef1a-0dbd-452e-8176-f85ab307777d (0.000182)

OpenAIEmbedding.get_text_embedding_batch-b0dee3ec-12e0-4fd0-9f1b-dc43f3c2bd9c (1.307544)

RetrieverQueryEngine.query-8e73a465-148f-4d31-bd49-d26ff7238f9a (11.404676)
└── RetrieverQueryEngine._query-32ad03a1-c7a7-4981-9f3f-9b3d362daf65 (11.403766)
    ├── VectorIndexRetriever.retrieve-95f973b5-e10f-4d57-b0da-c3712f17b1ec (1.280278)
    │   └── VectorIndexRetriever._retrieve-7437f3f0-acf7-4bef-bea3-f8e29bc63ee8 (1.279463)
    │       └── OpenAIEmbedding.get_query_embedding-a7437a95-af35-4414-a3ca-7e2d06fb45bf (1.277606)
    │           └── OpenAIEmbedding._get_query_embedding-4dafed62-ca6e-4f35-b993-bd6b21670dc4 (1.276776)
    └── CompactAndRefine.synthesize-a15c5505-c257-4c52-a9c9-2117eef8dc91 (10.122762)
        └── CompactAndRefine.get_response-0c8d0285-0dbe-45c4-b95f-2090b9ebd100 (10.119841)
            ├── TokenTextSplitter.split_text-c8734cc0-6890-484f-a003-39d579e19596 (0.000337)
            └── CompactAndRefine.get_response-93a55dcd-8e32-4fd7-a644-154a1ebc527b (10.118508)
                ├── TokenTextSplitter.split_text-cae71b55-aa34-4fc8-8a02-cccc5eab9638 (0.000139)
                └── DefaultRefineProgram.__call__-ce0f9cfe-6e40-444d-9967-da475c309950 (10.117663)
                    └── OpenAI.predict-aab6dbbb-913c-4941-985b-bebc1d93c785 (10.117283)
                        └── OpenAI.chat-c442fc0e-b9b0-4720-90ad-2f16d215dc45 (10.116348)
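Where does this tree come from? Every instrumented method enters a span with its own id_ and a parent_span_id, and SimpleSpanHandler simply records those links and prints them as a tree. Below is a minimal sketch of a custom span handler built on the BaseSpanHandler interface; PrintSpanHandler is a made-up name, and the exact method signatures can vary slightly between llama-index versions:
from typing import Any, Optional
import inspect

from llama_index.core.instrumentation.span import SimpleSpan
from llama_index.core.instrumentation.span_handlers import BaseSpanHandler

class PrintSpanHandler(BaseSpanHandler[SimpleSpan]):
    """Hypothetical handler: print span enter/exit with parent links."""

    @classmethod
    def class_name(cls) -> str:
        return "PrintSpanHandler"

    def new_span(
        self,
        id_: str,
        bound_args: inspect.BoundArguments,
        instance: Optional[Any] = None,
        parent_span_id: Optional[str] = None,
        tags: Optional[dict] = None,
        **kwargs: Any,
    ) -> Optional[SimpleSpan]:
        # parent_span_id is what links a span into the tree above it
        print(f"enter {id_} (parent={parent_span_id})")
        return SimpleSpan(id_=id_, parent_id=parent_span_id)

    def prepare_to_exit_span(
        self,
        id_: str,
        bound_args: inspect.BoundArguments,
        instance: Optional[Any] = None,
        result: Optional[Any] = None,
        **kwargs: Any,
    ) -> Optional[SimpleSpan]:
        print(f"exit  {id_}")
        return self.open_spans.get(id_)

    def prepare_to_drop_span(
        self,
        id_: str,
        bound_args: inspect.BoundArguments,
        instance: Optional[Any] = None,
        err: Optional[BaseException] = None,
        **kwargs: Any,
    ) -> Optional[SimpleSpan]:
        # called on early exit due to an error
        print(f"drop  {id_}: {err}")
        return self.open_spans.get(id_)

root_dispatcher.add_span_handler(PrintSpanHandler())
Back on the Langfuse side, the same trace can also be pulled down through its API: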
traces = langfuse.api.trace.list(limit=1)
trace_data = traces.data[0]
print(f"trace_id: {trace_data.id}")
print(f"trace input: {trace_data.input}")
trace_id: 9d32f7bfd20563c0a66f030408ad0969
trace input: 介紹一下 LlamaIndex 的 instrumentation module?
The trace_data here is a TraceWithDetails, which only contains the input and output of the call as a whole. To get everything underneath, fetch the full trace by id:
trace = langfuse.api.trace.get(trace_data.id)  # returns a TraceWithFullDetails
observations = trace.observations
import pandas as pd
def pydantic_list_to_dataframe(pydantic_list):
    """
    Convert a list of pydantic objects to a pandas dataframe.
    """
    data = []
    for item in pydantic_list:
        data.append(item.dict())  # on pydantic v2 objects, use item.model_dump()
    return pd.DataFrame(data)
df = pydantic_list_to_dataframe(observations)
df.columns
Index(['id', 'traceId', 'type', 'name', 'startTime', 'endTime',
'modelParameters', 'input', 'metadata', 'output', 'usage', 'level',
'usageDetails', 'costDetails', 'environment', 'inputPrice',
'outputPrice', 'totalPrice', 'calculatedTotalCost', 'latency',
'createdAt', 'unit', 'updatedAt', 'projectId', 'totalTokens',
'promptTokens', 'completionTokens', 'completionStartTime', 'model',
'version', 'statusMessage', 'parentObservationId', 'promptId',
'promptName', 'promptVersion', 'modelId', 'calculatedInputCost',
'calculatedOutputCost', 'timeToFirstToken'],
dtype='object')
The important ones are roughly id, name, input, output, and metadata.
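For instance, to keep just those columns as a convenience view (df_view is only for illustration):
df_view = df[["id", "name", "input", "output", "metadata"]]
df_view.head()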
df['name'].unique()
array(['RetrieverQueryEngine.query', 'RetrieverQueryEngine._query',
'CompactAndRefine.synthesize', 'CompactAndRefine.get_response',
'DefaultRefineProgram.__call__', 'OpenAI.predict', 'OpenAI.chat',
'TokenTextSplitter.split_text',
'OpenAIEmbedding.get_query_embedding',
'VectorIndexRetriever._retrieve', 'VectorIndexRetriever.retrieve',
'OpenAIEmbedding._get_query_embedding'], dtype=object)
At this point we basically have the input and output of every single step.
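For example, to see what was actually sent to and returned from the final LLM call (OpenAI.chat is one of the names listed above):
llm_call = df[df["name"] == "OpenAI.chat"].iloc[0]
print(llm_call["input"])
print(llm_call["output"])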
Today we used LlamaIndex's instrumentation module to rebuild something like the TracingTree shown on Langfuse, so what the TracingTree displays is essentially just the spans.
We also filled in the gap on downloading data from Langfuse: the key point is that there are both TraceWithDetails and TraceWithFullDetails; the full-details one gets you all of the inner input/output/metadata, while the other only carries the input/output of the trace as a whole.
A trace or span name like OpenAI.predict simply means that llama-index has a class called OpenAI on which the predict method was called, because when the dispatcher fills in id_ it uses: id_ = f"{actual_class}.{method_name}-{uuid.uuid4()}"
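Given that format, you can split a span id back into its name and uuid. A hypothetical helper (it assumes the format quoted above, that class and method names contain no hyphens, and relies on a uuid4 string containing exactly 4 hyphens):
def split_span_id(id_: str) -> tuple:
    """Hypothetical helper: split '<class>.<method>-<uuid4>' back apart."""
    name = id_.rsplit("-", 5)[0]  # drop the 5 hyphen-separated uuid chunks
    return name, id_[len(name) + 1:]

name, span_uuid = split_span_id("OpenAI.predict-aab6dbbb-913c-4941-985b-bebc1d93c785")
print(name)       # OpenAI.predict
print(span_uuid)  # aab6dbbb-913c-4941-985b-bebc1d93c785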
With that, we've covered everything we needed: for a call wrapped by LlamaIndex, we can now get the input and output of every step inside it.