[Day 14] Chain Everything, All the Way!

2024 iThome Ironman Contest

Generative AI

T Ambassador's AI Journey (T 大使 AI 之旅), part 14 of the series

16th Ironman Contest

Sean

2024-08-18 23:57:53

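Previously

The last article covered LangChain's basic Chain. Now I want to chain more things together: my own custom pieces alongside the LLM Chain, plus whatever other LangChain features can be chained in. Let's implement them one by one!

How LCEL (LangChain Expression Language) Works

To really understand LCEL, we first need to see how it works. The key is a method named __or__ defined on a class. When the "|" operator is placed between two objects, as in chain = class_a | class_b, Python translates it into chain = class_a.__or__(class_b). That is a bit abstract, so let's build our own version out of some simple functions. (The "basic" and "advanced" labels reflect my own experience and may not fit everyone.)

Basic version: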

# 1. Define the Runnable class
class Runnable:
	def __init__(self, func):
		self.func = func
	def __or__(self, other):
		# Build a new function that feeds this Runnable's output into `other`
		def chained_func(*args, **kwargs):
			return other(self.func(*args, **kwargs))
		return Runnable(chained_func)
	def __call__(self, *args, **kwargs):
		return self.func(*args, **kwargs)
# 2. Define two simple functions
def add_five(x):
	return x + 5
def multiply_by_two(x):
	return x * 2
# 3. Wrap the functions in Runnable objects
runnable_add_five = Runnable(add_five)
runnable_multiply_by_two = Runnable(multiply_by_two)
# 4. Connect the Runnable objects by calling __or__ directly
chain_object = runnable_add_five.__or__(runnable_multiply_by_two)
print(chain_object(3)) # 16
# 5. Connect the Runnable objects with "|"
chain_pipe = runnable_add_five | runnable_multiply_by_two
print(chain_pipe(3)) # 16

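Discussion of the code 🧐:

Runnable object:
- __init__ wraps a function in a Runnable object (step 3)
- __or__ is what gets called when the "|" operator is used in the code
- __call__ lets a Runnable be called like a function; without it, the chained function built in __or__ could not call the next Runnable (step 4)
Two simple functions are defined: one adds 5, the other multiplies by 2
The functions are then wrapped in Runnable objects
Calling __or__ directly and using the "|" operator give exactly the same result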
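
To make the chaining order concrete, here is a minimal sketch that repeats the toy Runnable class from the basic version (it is not LangChain's own Runnable) and chains three steps. It relies on two facts: "|" is left-associative in Python, and the toy __or__ only needs its right-hand side to be callable, so a plain function such as str works there too.

```python
class Runnable:
    """Toy wrapper copied from the basic version; not LangChain's Runnable."""

    def __init__(self, func):
        self.func = func

    def __or__(self, other):
        # `other` only has to be callable: a Runnable or a plain function.
        def chained_func(*args, **kwargs):
            return other(self.func(*args, **kwargs))
        return Runnable(chained_func)

    def __call__(self, *args, **kwargs):
        return self.func(*args, **kwargs)


add_five = Runnable(lambda x: x + 5)
multiply_by_two = Runnable(lambda x: x * 2)

# Evaluated as (add_five | multiply_by_two) | str
chain = add_five | multiply_by_two | str
result = chain(3)
print(repr(result))  # '16'
```

Note that a plain function on the left-hand side (for example str | add_five) would fail in this toy version, because the built-in function type has no matching __or__.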

Advanced version (object-oriented):

from abc import ABC, abstractmethod

class CRunnable(ABC):
	def __init__(self):
		self.next = None
	@abstractmethod
	def process(self, data):
		"""
		This method must be implemented by subclasses to define
		data processing behavior.
		"""
		pass
	def invoke(self, data):
		processed_data = self.process(data)
		if self.next is not None:
			return self.next.invoke(processed_data)
		return processed_data
	def __or__(self, other):
		return CRunnableSequence(self, other)

class CRunnableSequence(CRunnable):
	def __init__(self, first, second):
		super().__init__()
		self.first = first
		self.second = second
	def process(self, data):
		return data
	def invoke(self, data):
		first_result = self.first.invoke(data)
		return self.second.invoke(first_result)

class AddTen(CRunnable):
	def process(self, data):
		result = data + 10
		print("AddTen: ", result)
		return result 
		
class MultiplyByTwo(CRunnable):
	def process(self, data):
		result = data * 2
		print("Multiply by 2: ", result)
		return result

class MinusSix(CRunnable):
	def process(self, data):
		result = data - 6
		print("MinusSix: ", result)
		return f"Result: {result}"

a = AddTen()
b = MultiplyByTwo()
c = MinusSix()

chain = a | b | c
result = chain.invoke(10)
print(result)

# AddTen: 20
# Multiply by 2: 40
# MinusSix: 34
# Result: 34

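Discussion of the results 🧐:

This is also how LCEL works, but as far as I know this style is the object-oriented way of writing it. The first version wraps functions in Runnable objects, whereas here each class inherits from the Runnable base class. Many of the small design details are probably object-oriented as well. I can read the code, but I have never studied object-oriented programming, so I can't explain it properly. (Please correct me if I got anything wrong.)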
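
As a rough aid for reading the advanced version, the sketch below is a trimmed copy of it that inspects what a | b | c actually builds. Two things are worth noticing: the abstract process method forces every subclass to define its own behavior, and each "|" wraps its two operands in a CRunnableSequence, so three steps become a nested sequence. The print calls and the self.next attribute are left out here as a simplification (nothing in the original code ever sets self.next).

```python
from abc import ABC, abstractmethod


class CRunnable(ABC):
    @abstractmethod
    def process(self, data):
        """Subclasses must define how they transform the data."""

    def invoke(self, data):
        return self.process(data)

    def __or__(self, other):
        # Each "|" produces a new sequence object holding both operands.
        return CRunnableSequence(self, other)


class CRunnableSequence(CRunnable):
    def __init__(self, first, second):
        self.first = first
        self.second = second

    def process(self, data):
        return data

    def invoke(self, data):
        # Run the left side first, then pass its result to the right side.
        return self.second.invoke(self.first.invoke(data))


class AddTen(CRunnable):
    def process(self, data):
        return data + 10


class MultiplyByTwo(CRunnable):
    def process(self, data):
        return data * 2


class MinusSix(CRunnable):
    def process(self, data):
        return data - 6


a, b, c = AddTen(), MultiplyByTwo(), MinusSix()
chain = a | b | c

# The chain is CRunnableSequence(CRunnableSequence(a, b), c).
print(type(chain.first).__name__)  # CRunnableSequence
print(chain.first.first is a, chain.first.second is b, chain.second is c)
answer = chain.invoke(10)
print(answer)  # 34
```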

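Ways to Invoke a Chain

The earlier examples only used invoke, which simply takes a single input. LCEL also offers some other common ways to call a chain.

Hands-on 🔥

Batch: send multiple queries at once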

import pprint
from langchain_core.prompts import PromptTemplate
from langchain_ffm import ChatFormosaFoundationModel
llm = ChatFormosaFoundationModel(model="ffm-llama3-70b-chat", temperature=0.01)

template = "Who won the {league} championship in the {year} season?"
prompt = PromptTemplate(input_variables=["league", "year"], template=template)

# Use "|" to chain the Prompt and the Model together
chain = prompt | llm

# batch: pass several inputs at once
pprint.pprint(chain.batch([{"league" : "NBA", "year":"2021"}, {"league" : "MLB", "year":"2020"}]))

Discussion of the code 🧐:

batch lets you pass in several inputs at once. (Please ignore the AI's wrong answers.)
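
To show the idea behind batch without calling a real model, here is a minimal standard-library sketch. ToyRunnable and format_question are hypothetical names invented for this illustration. As far as I know, LangChain's default batch also runs invoke for each input on a thread pool, but treat this as an illustration of the concept rather than the library's actual code.

```python
from concurrent.futures import ThreadPoolExecutor


class ToyRunnable:
    """Hypothetical stand-in for a chain that supports invoke and batch."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def batch(self, inputs):
        # Call invoke once per input on a thread pool.
        # pool.map returns results in the same order as the inputs.
        with ThreadPoolExecutor() as pool:
            return list(pool.map(self.invoke, inputs))


def format_question(variables):
    return f"Who won the {variables['league']} championship in {variables['year']}?"


chain = ToyRunnable(format_question)
results = chain.batch(
    [{"league": "NBA", "year": "2021"}, {"league": "MLB", "year": "2020"}]
)
for line in results:
    print(line)
```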

Stream: watch the AI's output as it is generated

from langchain_core.prompts import PromptTemplate
from langchain_openai import ChatOpenAI
llm = ChatOpenAI(model="gpt-4o", temperature=0)

template = "Who won the {league} championship in the {year} season?"
prompt = PromptTemplate(input_variables=["league", "year"], template=template)

# Use "|" to chain the Prompt and the Model together
chain = prompt | llm

# stream: print each chunk as soon as it arrives
for stream in chain.stream({"league" : "NBA", "year":"2020"}):
	print(stream.content, end="|", flush=True)

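Discussion of the code 🧐:

stream lets you watch the AI's answer as it is being generated. Some models may not support LangChain's stream mode, for example TWSC (台智雲, Taiwan AI Cloud). TWSC does provide its own streaming method, though: see TWSC's stream mode.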
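
The streaming pattern itself can be sketched with a plain Python generator. fake_llm_stream below is a hypothetical stand-in for a streaming model, not a LangChain API: it yields one chunk at a time, and the loop prints each chunk the moment it arrives, just like the for loop over chain.stream above.

```python
import time


def fake_llm_stream(prompt, delay=0.0):
    """Hypothetical streaming model: yields the answer one word at a time."""
    answer = "streamed answers arrive piece by piece"
    for word in answer.split():
        time.sleep(delay)  # pretend the model needs time per chunk
        yield word


chunks = []
for chunk in fake_llm_stream("any prompt"):
    chunks.append(chunk)
    print(chunk, end="|", flush=True)
print()
# prints: streamed|answers|arrive|piece|by|piece|
```

Passing a delay such as 0.3 makes the chunk-by-chunk arrival visible in a terminal.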

Runnables

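We now understand how LCEL works and know how to define our own Chain. Yesterday I said that LangChain uses the "|" operator as its Chain; the Prompt and the Model are the Runnable parts. To make building custom chains as simple as possible, LangChain puts both custom components and its own native ones behind the Runnable interface, which maximizes flexibility when using LCEL.

Commonly used native Runnable components in LangChain

Prompt
Chat Model
LLM Model
Output Parser
Retriever
Tool

Commonly used Runnable classes in LangChain

RunnablePassthrough:

This Runnable returns exactly what was passed in as input. In short, it does nothing at all.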

from langchain_core.runnables import RunnablePassthrough

chain = RunnablePassthrough() | RunnablePassthrough() | RunnablePassthrough()
chain.invoke("hello")
# output: hello

Discussion of the code 🧐:

The output is identical to the input.

RunnableLambda:

This Runnable works like the basic version of LCEL we implemented first.

from langchain_core.runnables import RunnableLambda, RunnablePassthrough

def add_five(x):
	return x + 5

def multiply_by_two(x):
	return x * 2

chain = RunnableLambda(add_five) | RunnablePassthrough() | RunnableLambda(multiply_by_two)
chain.invoke(3)
# output: 16

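Discussion of the code 🧐:

RunnableLambda lets you put a custom function into an LCEL pipeline, and it can be combined with other Runnables.

RunnableParallel:

Ordinary Runnables run one after another like a pipeline, which is the Chain order I mentioned yesterday; this sequential execution is also called RunnableSequence. RunnableParallel instead runs several Runnables in parallel. Let's build an example that makes parallel execution easier to understand, using yesterday's few-shot and zero-shot prompts.

First, set up the shared parts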

from langchain_core.prompts import PromptTemplate, FewShotPromptTemplate
from langchain_ffm import ChatFormosaFoundationModel
llm = ChatFormosaFoundationModel(model="ffm-llama3-70b-chat", temperature=0.01)
# from langchain_openai import ChatOpenAI
# llm = ChatOpenAI(model="gpt-4o", temperature=0)
# from langchain_google_genai import ChatGoogleGenerativeAI
# llm = ChatGoogleGenerativeAI(model="gemini-pro", temperature=0)

# The text to correct; the intended phrase is 現金餘額 ("cash balance")
text = "現金允額"

# zero-shot prompt ("Correct the typos in the following text: ... / Corrected result:")
zero_shot_template = """糾正以下文字的錯字 : {input}
修正結果 : """
zero_shot_prompt = PromptTemplate(input_variables=["input"], template=zero_shot_template)
# Use "|" to chain the Prompt and the Model together
zero_shot_chain = zero_shot_prompt | llm

# few-shot prompt (same wording, with the expected answer filled in for each example)
few_shot_template = """修正以下文字的錯字 : {input}
修正結果 : {answer}"""
few_shot_example_prompt = PromptTemplate(input_variables=["input", "answer"], template=few_shot_template)
# Few-shot examples for the AI, matching the variables of the prompt above
few_shot_examples = [
	{
		"input": "通貨紅脹",
		"answer": "通貨膨脹"
	},
	{
		"input": "政府有一個獎注的補助費用",
		"answer": "政府有一個獎助的補助費用"
	},
	{
		"input": "庫藏骨",
		"answer": "庫藏股"
	},
]
# Build the prompt with FewShotPromptTemplate
few_shot_prompt = FewShotPromptTemplate(
	examples=few_shot_examples,
	example_prompt=few_shot_example_prompt,
	suffix="修正以下文字的錯字 : {input}",
	input_variables=["input"],
)
# Use "|" to chain the Prompt and the Model together
few_shot_chain = few_shot_prompt | llm

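One approach runs in order: the second chain starts only after the first one finishes. The other runs both at the same time, i.e. in parallel.

from langchain_core.runnables import RunnableParallel

# Sequential: few-shot runs only after zero-shot has finished
zero_shot_response = zero_shot_chain.invoke({"input":text})
print(f"Zero Shot result\nText to correct : {text}\nCorrected result : {zero_shot_response.content}")
few_shot_response = few_shot_chain.invoke({"input":text})
print(f'\nFew Shot result\nText to correct : {text}\nCorrected result : {few_shot_response.content}')
# Parallel: both chains receive the same input and run at the same time
chain = RunnableParallel({'zero-shot': zero_shot_chain, 'few-shot': few_shot_chain})
chain.invoke({"input":text})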

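Discussion of the code 🧐:

As the screenshot shows, both approaches perform the same steps and tasks, but the RunnableParallel version was almost twice as fast as the sequential one.
For an explanation of the code, see [Day 13] How Does LangChain Chain?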
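
The speed-up can be reproduced without any model call. The sketch below uses only the standard library; slow_step and run_parallel are hypothetical helpers invented for this illustration, with time.sleep standing in for the time a model takes to answer. It mimics the idea of running every branch on the same input at once and is not LangChain's implementation.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_step(name, seconds=0.2):
    """Hypothetical stand-in for a chain: waits, then returns a labeled result."""
    def run(value):
        time.sleep(seconds)
        return f"{name}: {value}"
    return run


def run_parallel(steps, value):
    # Submit every branch with the same input, then collect results by key.
    with ThreadPoolExecutor(max_workers=len(steps)) as pool:
        futures = {key: pool.submit(func, value) for key, func in steps.items()}
        return {key: future.result() for key, future in futures.items()}


steps = {"zero-shot": slow_step("zero-shot"), "few-shot": slow_step("few-shot")}

start = time.perf_counter()
sequential = {key: func("typo") for key, func in steps.items()}
sequential_seconds = time.perf_counter() - start

start = time.perf_counter()
parallel = run_parallel(steps, "typo")
parallel_seconds = time.perf_counter() - start

print(parallel)
print(f"sequential: {sequential_seconds:.2f}s, parallel: {parallel_seconds:.2f}s")
```

With two branches that each wait about 0.2 seconds, the sequential run takes roughly the sum of the waits while the parallel run takes roughly the longest single wait, which matches the "almost twice as fast" observation.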

assign:

If you want to preprocess the input or modify a Runnable's result, you can use assign!

from langchain_core.runnables import RunnableLambda, RunnableParallel, RunnablePassthrough
from langchain_core.prompts import PromptTemplate
prompt = PromptTemplate.from_template("{input}")
# 1. Preprocess the input
chain = RunnablePassthrough.assign(input=lambda x: 'Your name is ' + x['input']) | prompt
print(chain.invoke({"input": "Sean"}))
# output: text='Your name is Sean'

# 2. Modify a Runnable's result
def assign_func(input):
	return "Lulu"
chain = RunnableParallel({"x": RunnablePassthrough()}).assign(y=RunnableLambda(assign_func))
print(chain.invoke({"x": "Sean"}))
# output: {'x': {'x': 'Sean'}, 'y': 'Lulu'}

def fun1(input: dict):
	# Pick the value stored under the key "y"
	return input.get("y", "Key not found")
def fun2(upper: str):
	# Convert the value to uppercase
	return str(upper).upper()

new_chain = RunnableLambda(fun1) | RunnableLambda(fun2)
print(new_chain.invoke({"y": "Sean"}))
# output: SEAN
final_chain = chain | new_chain
print(final_chain.invoke({"x": "Sean"}))
# output: LULU

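Discussion of the code 🧐:

Part 1 preprocesses the input: the prompt template contains nothing but {input}, and invoke only passes Sean, yet assign prepends extra text to the input.
Part 2 modifies the result. The first chain outputs a dict, and assign adds one more entry to it. The second, new_chain, chains two functions: it looks up the key y and converts its value to uppercase. The third, final_chain, connects chain and new_chain. The value of x is Sean, but because the first chain added y with the value Lulu, new_chain finds the key y in the dict, replaces the whole dict with the value of y, and finally converts it to uppercase.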
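
The behavior of assign described above boils down to "copy the input dict, then add or overwrite keys computed from it". Here is a minimal sketch of that idea in plain Python; the assign helper below is a hypothetical toy written for this illustration and is not LangChain's implementation.

```python
def assign(**new_fields):
    """Toy version of assign: return a step that adds computed keys to a dict."""
    def step(data):
        # Every function receives the original input dict.
        computed = {key: func(data) for key, func in new_fields.items()}
        # Computed keys are merged in and overwrite existing keys of the same name.
        return {**data, **computed}
    return step


# Adding a new key, like .assign(y=...) in part 2
add_y = assign(y=lambda data: "Lulu")
added = add_y({"x": "Sean"})
print(added)  # {'x': 'Sean', 'y': 'Lulu'}

# Overwriting an existing key, like RunnablePassthrough.assign(input=...) in part 1
rewrite_input = assign(input=lambda data: "Your name is " + data["input"])
rewritten = rewrite_input({"input": "Sean"})
print(rewritten)  # {'input': 'Your name is Sean'}
```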

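Conclusion

Today we saw how LCEL works and looked at several commonly used Runnables. LCEL makes implementation simple and makes it easy to extend functionality. Its advantages go beyond what we covered today; if you're interested, see Advantages of LCEL. Tomorrow we'll combine LangChain's native Runnables with custom Runnables through LCEL, and use a few more handy methods besides invoke.

Off topic 🤣

Today was a real rush. Luckily I finished just in time to upload before the deadline. Never again...

Next article: LCEL with Custom & Native Runnables in Practice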
