AI Agent: An Office AI Agent Application Built on Semantic Kernel

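2024 iThome Ironman Contest

DAY 18

Generative AI

The Magic of Semantic Kernel: Exploring Generative Applications with .NET (series, part 18)

Tags: 16th Ironman Contest, ai, semantic kernel, generative ai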

Ian

2024-10-01 23:09:51

In the previous article I sketched out my understanding of AI Agents. In this one I use Semantic Kernel to build a basic office AI Agent application that demonstrates what an AI Agent can do.

Before starting, let me recap the components of an agent, as shown in the figure: an LLM model, Plugin Functions (that is, Tools), and Memory.

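Basic capabilities of the AI Agent

This AI Agent is designed as an office assistant that helps with the following key tasks:

Customer information lookup: the AI Agent can look up a customer's basic information on request.
Sending email: when a customer needs to be contacted, the AI Agent can draft and send the email automatically, saving manual effort.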

Code walkthrough

The project first needs the following packages:

Microsoft.SemanticKernel
Microsoft.SemanticKernel.Agents.Core
Microsoft.SemanticKernel.Plugins.Core

Create the kernel object

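using System.ComponentModel;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Agents;
using Microsoft.SemanticKernel.ChatCompletion;
using Microsoft.SemanticKernel.Connectors.OpenAI;
// Config: a settings class (not shown here) that holds the Azure OpenAI values
var builder = Kernel.CreateBuilder()
            .AddAzureOpenAIChatCompletion(
                endpoint: Config.aoai_endpoint,
                deploymentName: Config.aoai_deployment,
                apiKey: Config.aoai_apiKey);
var kernel = builder.Build();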

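Create the Plugin Functions
This simulated office Plugin has two Functions: customer information lookup and sending email. The code here is of course a simplified mock-up of the concept; in practice each Function could call an external system through its API. As mentioned in earlier articles, a Function's Description matters: the description text is what lets SK and the LLM understand what the Function does, so that it can be called when needed.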

public class OfficePlugin
{
    [KernelFunction, Description("Get customer information by name")]
    public string GetCustomerInfo([Description("customer name")] string customerName)
    {
        // Mock data: every customer name returns the same hard-coded record
        var customerInfo = new Dictionary<string, string>
        {
            { "Name", customerName },
            { "Address", "123 Main St, Anytown, USA" },
            { "Phone", "555-1234" },
            { "Email", "Lee@mail.com" }
        };
        return System.Text.Json.JsonSerializer.Serialize(customerInfo);
    }

    [KernelFunction, Description("send e-mail to customer")]
    public string SendEmail([Description("email address")] string mailTo
    , [Description("email recipient")] string name
    , [Description("email subject")] string subject
    , [Description("email content")] string content)
    {
        // Simulated send: no mail is actually sent; the returned text is the tool result the model sees
        return $"email sent to {mailTo} with subject {subject} and content {content} successfully";
    }
}
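
The point above about calling an external system through its API can be sketched roughly as follows. This is only an illustration under assumptions: `CustomerApiClient`, the base address, and the `customers?name=` route are hypothetical names invented for this example, not part of the article's project or of any real service.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical client for an external customer system.
// The base address and the "customers?name=" route are illustrative assumptions.
public sealed class CustomerApiClient
{
    private readonly HttpClient _http;
    private readonly Uri _baseUri;

    public CustomerApiClient(HttpClient http, Uri baseUri)
    {
        _http = http;
        _baseUri = baseUri;
    }

    // Builds the lookup URL, escaping the name so spaces and symbols are safe in a query string.
    public Uri BuildCustomerUri(string customerName) =>
        new Uri(_baseUri, "customers?name=" + Uri.EscapeDataString(customerName));

    // Returns the raw JSON body so it can be handed back to the model unchanged.
    public async Task<string> GetCustomerInfoAsync(string customerName)
    {
        using HttpResponseMessage response = await _http.GetAsync(BuildCustomerUri(customerName));
        response.EnsureSuccessStatusCode();
        return await response.Content.ReadAsStringAsync();
    }
}
```

With something like this in place, GetCustomerInfo in OfficePlugin could become an async method returning Task<string> that delegates to the client, while keeping the same Description attributes so the model still knows when to call it.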

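Create the agent
When creating the agent object, explicitly define the agent's name and its instructions. This matters a lot in multi-agent scenarios; this example implements only a single agent, but I still recommend filling them in completely. The agent also needs to decide on its own, based on the task request, whether to call a Plugin Function, so the ToolCallBehavior.AutoInvokeKernelFunctions setting is applied. And because the agent needs an LLM model as its brain to do the thinking, the kernel object must be set. Of the agent components described in the previous article, the LLM model is now in place.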

const string officeAgentName = "OfficeAgent";
string officeAgentInstructions =
        """
        You are an office agent, specializing in handling office tasks.
        The goal is to manage calendar scheduling, client information inquiries, and email sending based on the user's requests.
        You will focus on these tasks and will not perform any unrelated tasks.
        """;

// Define the agent
ChatCompletionAgent agent = new()
{
    Name = officeAgentName,
    Instructions = officeAgentInstructions,
    Kernel = kernel,
    // Let the model decide when a plugin function is needed, and have SK invoke it automatically
    Arguments = new KernelArguments(new OpenAIPromptExecutionSettings() { ToolCallBehavior = ToolCallBehavior.AutoInvokeKernelFunctions }),
};
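
As a side note on the auto-invocation setting: more recent Semantic Kernel releases also offer FunctionChoiceBehavior as a connector-independent way to express the same intent. The fragment below is a minimal sketch that assumes the installed Semantic Kernel version includes FunctionChoiceBehavior; the article's own code uses ToolCallBehavior, which is what was run to produce the output shown later.

```csharp
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Connectors.OpenAI;

// Assumption: a Semantic Kernel version that ships FunctionChoiceBehavior.
// Auto() lets the model choose whether to call a function and has SK invoke it.
var settings = new OpenAIPromptExecutionSettings
{
    FunctionChoiceBehavior = FunctionChoiceBehavior.Auto()
};

// These arguments could be assigned to the agent's Arguments property
// in place of the ToolCallBehavior-based settings.
var arguments = new KernelArguments(settings);
```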

Attach the Plugin
Create the Plugin object and add it to the agent's kernel. At this point the agent has two of its components in place: the LLM model and the Tools.

KernelPlugin plugin = KernelPluginFactory.CreateFromType<OfficePlugin>();
agent.Kernel.Plugins.Add(plugin);
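
Because the Descriptions are what the model relies on, it can help to check what the plugin exposes before involving the LLM at all. The following is a minimal sketch meant to run as a separate scratch program; it assumes the OfficePlugin class from this article is in the same project, and the names officePlugin and testKernel are introduced here only for illustration.

```csharp
using System;
using Microsoft.SemanticKernel;

// Assumes the OfficePlugin class defined earlier is available in this project.
KernelPlugin officePlugin = KernelPluginFactory.CreateFromType<OfficePlugin>();

// List the function and parameter descriptions that are exposed to the model.
foreach (KernelFunction function in officePlugin)
{
    Console.WriteLine($"{function.Name}: {function.Description}");
    foreach (KernelParameterMetadata parameter in function.Metadata.Parameters)
    {
        Console.WriteLine($"  {parameter.Name}: {parameter.Description}");
    }
}

// Call one function directly. No chat completion service is involved,
// so an empty Kernel is enough for this check.
Kernel testKernel = new();
FunctionResult result = await officePlugin["GetCustomerInfo"].InvokeAsync(
    testKernel,
    new KernelArguments { ["customerName"] = "Lee" });
Console.WriteLine(result.GetValue<string>());
```

Running the functions this way separates plugin bugs from prompt or model issues: if the direct call returns the expected JSON, any remaining problem lies in how the model chooses and fills in the call.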

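Create the AgentGroupChat conversation history object
AgentGroupChat keeps the record of the conversation, which covers the Memory part of the agent's components (here, the history of the current conversation).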

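// Create the AgentGroupChat conversation history object
AgentGroupChat chat = new();

// Simulate a multi-turn conversation
await InvokeAgentAsync("Hello");
await InvokeAgentAsync("告訴我Lee的基本資料"); // "Tell me Lee's basic information"
await InvokeAgentAsync("發送郵件給Lee，提醒關於明天下午2點會議的通知"); // "Send Lee an email as a reminder about the meeting at 2 PM tomorrow"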

Running the agent and its results

async Task InvokeAgentAsync(string input)
{
    // Add the user prompt to the conversation history
    ChatMessageContent message = new(AuthorRole.User, input);
    chat.AddChatMessage(message);

    await foreach (ChatMessageContent response in chat.InvokeAsync(agent))
    {
        Console.WriteLine($"{response.AuthorName}: {response.Content}");
    }
}

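/*
Agent output (verbatim; the model answered the Chinese prompts in a mix of Traditional and Simplified Chinese):

OfficeAgent: Hi there! How can I assist you today?
OfficeAgent: 以下是Lee的基本資料：

- 名字: Lee
- 地址: 123 Main St, Anytown, USA
- 电话: 555-1234
- 邮箱: Lee@mail.com

还有什么我可以帮忙的吗？
OfficeAgent: 郵件已經成功發送給Lee，提醒他明天下午2點的會議。

還有其他需要幫忙的嗎？

English gloss of the two Chinese replies:
OfficeAgent: Here is Lee's basic information: Name: Lee; Address: 123 Main St, Anytown, USA; Phone: 555-1234; Email: Lee@mail.com. Is there anything else I can help with?
OfficeAgent: The email has been sent to Lee, reminding him of the meeting at 2 PM tomorrow. Is there anything else you need help with?
*/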
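
To see the Memory part concretely, the history held by the AgentGroupChat can be printed after the three turns. This is a minimal sketch that assumes the `chat` object from the walkthrough is in scope; the pragma is only needed if the installed version still marks the agent APIs as experimental, and my understanding is that messages are returned newest first, which is worth verifying against the version in use.

```csharp
using System;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Agents;

// Assumes `chat` is the AgentGroupChat from the walkthrough, after the three turns have run.
#pragma warning disable SKEXP0110 // Assumption: agent APIs are flagged experimental in the installed version
await foreach (ChatMessageContent message in chat.GetChatMessagesAsync())
{
    // AuthorName is set for agent replies; fall back to the role for user or tool messages.
    string speaker = message.AuthorName ?? message.Role.ToString();
    Console.WriteLine($"{speaker}: {message.Content}");
}
```

This stored history is what allows the third prompt to say only "Lee": the earlier lookup result is still part of the conversation the model receives.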

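Conclusion

This AI Agent, implemented with Semantic Kernel, demonstrates the following capabilities:

Natural language interaction: users talk to the AI Agent in natural language and do not need to learn a special command syntax, which makes the AI Agent more intuitive and easier to use.
Multi-tool integration: through the Semantic Kernel Plugins mechanism, customer information lookup and the email service are integrated so that multiple tools work together, and the AI Agent can call them automatically to complete concrete tasks.
Flexible decision logic: based on the user's instructions, the AI Agent decides on its own which tools to call to complete the task.

In upcoming articles I will demonstrate other AI Agent designs, including more advanced multi-agent setups.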