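iT 邦幫忙 (iThome)

2024 iThome Ironman Contest (鐵人賽)

Day 11 · Generative AI

Series: All About Gemini Multimodal Large Language Models, Part 11

All About Gemini Multimodal Large Language Models, Day 11: Generating Structured Output with the Gemini API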


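Introduction

    All code in this series is run in Colab. If you want to run it in another environment, please adapt it yourself.

Colab setup: configuring the project and API key
Loading Gemini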

#pip install -q -U google-generativeai
import google.generativeai as genai

API key

from google.colab import userdata
API_KEY=userdata.get('GOOGLE_API_KEY')

#genai.configure(api_key="YOUR_API_KEY")

# Configure the client library by providing your API key.
genai.configure(api_key=API_KEY)
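Outside Colab, google.colab.userdata is not available. A minimal sketch of one alternative, assuming the key has been stored in an environment variable named GOOGLE_API_KEY; the helper name load_api_key is hypothetical and not part of the SDK:

```python
import os


def load_api_key(env_var: str = "GOOGLE_API_KEY") -> str:
    """Return the API key stored in the given environment variable."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        # Fail early with a clear message instead of sending an empty key.
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    return key
```

The returned value could then be passed to genai.configure(api_key=load_api_key()) in place of the Colab lookup.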

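Generating structured output with the Gemini API

By default, Gemini produces unstructured text, but some applications need structured text. For these use cases you can constrain Gemini to respond in JSON, a structured data format well suited to automated processing. You can also constrain the model to respond with one of the options listed in an enum.

Here are some use cases that may call for structured output from the model:

Building a company database by extracting company information from newspaper articles.
Extracting standardized information from résumés.
Extracting ingredients from a recipe and showing a link to a grocery website for each ingredient.

You can ask Gemini for JSON-formatted output in the prompt, but note that the model is not guaranteed to produce JSON, or to produce nothing but JSON. For a more deterministic response, you can pass a specific JSON schema in the responseSchema field so that Gemini always responds with the expected structure.

The examples below show how to generate JSON with the generateContent method, either through the SDK of your choice or by calling the REST API directly. They use text-only input, although Gemini can also produce JSON responses to multimodal requests that include images, video, and audio.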
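For readers who call the REST API directly, the request body might look roughly like the sketch below. It only builds and prints the JSON payload; nothing is sent. The field names (generationConfig, responseMimeType, responseSchema) follow the public REST reference as I understand it, and build_request_body is a hypothetical helper, so check both against the current documentation before relying on them.

```python
import json


def build_request_body(prompt: str) -> dict:
    """Build a generateContent request body asking for a JSON list of recipes."""
    # Schema for a list of objects with a name and a list of ingredient strings.
    recipe_schema = {
        "type": "ARRAY",
        "items": {
            "type": "OBJECT",
            "properties": {
                "recipe_name": {"type": "STRING"},
                "ingredients": {"type": "ARRAY", "items": {"type": "STRING"}},
            },
        },
    }
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        "generationConfig": {
            "responseMimeType": "application/json",
            "responseSchema": recipe_schema,
        },
    }


body = build_request_body("List a few tasty Chinese recipes.")
print(json.dumps(body, indent=2))
```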

Supplying the schema as text in the prompt

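model = genai.GenerativeModel("gemini-1.5-pro-latest")
# The prompt is in Traditional Chinese. In English it reads:
# "List some tasty Chinese recipes in JSON format. Use this JSON schema: ..."
prompt = """以 JSON 格式列出一些好吃的中餐食譜。

使用此 JSON 架構:

Recipe = {'recipe_name': str, 'ingredients': list[str]}
Return: list[Recipe]"""
result = model.generate_content(prompt)
print(result)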

Response

response:
GenerateContentResponse(
    done=True,
    iterator=None,
    result=protos.GenerateContentResponse({
      "candidates": [
        {
          "content": {
            "parts": [
              {
                "text": "```json\n[\n  {\n    \"recipe_name\": \"\u5bae\u4fdd\u96de\u4e01\",\n    \"ingredients\": [\n      \"\u96de\u8089\u4e01\",\n      \"\u82b1\u751f\",\n      \"\u4e7e\u8fa3\u6912\",\n      \"\u82b1\u6912\",\n      \"\u8591\",\n      \"\u849c\",\n      \"\u8525\",\n      \"\u91ac\u6cb9\",\n      \"\u918b\",\n      \"\u7cd6\",\n      \"\u592a\u767d\u7c89\"\n    ]\n  },\n  {\n    \"recipe_name\": \"\u9ebb\u5a46\u8c46\u8150\",\n    \"ingredients\": [\n      \"\u5ae9\u8c46\u8150\",\n      \"\u8c6c\u7d5e\u8089\",\n      \"\u8c46\u74e3\u91ac\",\n      \"\u8fa3\u6912\u91ac\",\n      \"\u82b1\u6912\u7c89\",\n      \"\u8591\",\n      \"\u849c\",\n      \"\u8525\",\n      \"\u91ac\u6cb9\",\n      \"\u7cd6\",\n      \"\u592a\u767d\u7c89\"\n    ]\n  },\n  {\n    \"recipe_name\": \"\u7cd6\u918b\u6392\u9aa8\",\n    \"ingredients\": [\n      \"\u8c6c\u5c0f\u6392\",\n      \"\u91ac\u6cb9\",\n      \"\u918b\",\n      \"\u7cd6\",\n      \"\u8591\",\n      \"\u849c\",\n      \"\u8525\",\n      \"\u592a\u767d\u7c89\"\n    ]\n  },\n  {\n    \"recipe_name\": \"\u9b5a\u9999\u8304\u5b50\",\n    \"ingredients\": [\n      \"\u8304\u5b50\",\n      \"\u8c6c\u7d5e\u8089\",\n      \"\u8c46\u74e3\u91ac\",\n      \"\u8fa3\u6912\u91ac\",\n      \"\u8591\",\n      \"\u849c\",\n      \"\u8525\",\n      \"\u91ac\u6cb9\",\n      \"\u918b\",\n      \"\u7cd6\"\n    ]\n  },\n  {\n    \"recipe_name\": \"\u9752\u6912\u8089\u7d72\",\n    \"ingredients\": [\n      \"\u8c6c\u8089\u7d72\",\n      \"\u9752\u6912\",\n      \"\u8591\",\n      \"\u849c\",\n      \"\u91ac\u6cb9\",\n      \"\u592a\u767d\u7c89\"\n    ]\n  }\n]\n```"
              }
            ],
            "role": "model"
          },
          .............
)
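In the response above, the model wrapped its JSON in a Markdown json code fence, so the text cannot be handed to json.loads as-is. A minimal sketch of one way to unwrap it, assuming the reply contains at most one fenced block; extract_json is a hypothetical helper, and the sample string is a shortened copy of the reply shown above:

```python
import json
import re

# Matches a Markdown code fence, with or without a "json" language tag.
FENCE_RE = re.compile(r"`{3}(?:json)?\s*(.*?)\s*`{3}", re.DOTALL)


def extract_json(text: str):
    """Parse JSON from a reply that may be wrapped in a Markdown fence."""
    match = FENCE_RE.search(text)
    payload = match.group(1) if match else text
    return json.loads(payload)


fence = "`" * 3
sample = (
    fence + "json\n"
    + r'[{"recipe_name": "\u5bae\u4fdd\u96de\u4e01", '
    + r'"ingredients": ["\u96de\u8089\u4e01", "\u82b1\u751f"]}]'
    + "\n" + fence
)
recipes = extract_json(sample)
print(recipes[0]["recipe_name"], recipes[0]["ingredients"])
```

This kind of post-processing is exactly what the schema-based configuration in the next section makes unnecessary.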

Supplying the schema through the model configuration

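import typing_extensions as typing

# Describe the shape of one recipe with type hints.
class Recipe(typing.TypedDict):
    recipe_name: str
    ingredients: list[str]

model = genai.GenerativeModel("gemini-1.5-pro-latest")
result = model.generate_content(
    # Prompt (Traditional Chinese): "List some tasty Chinese recipes."
    "列出一些好吃的中餐食譜。",
    # Ask for JSON and pass the expected structure as the response schema.
    generation_config=genai.GenerationConfig(
        response_mime_type="application/json", response_schema=list[Recipe]
    ),
)
print(result)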

Response

......
response:
GenerateContentResponse(
.........

"parts": [
              {
                "text": "[{\"ingredients\": [\"\u732a\u8089\", \"\u767d\u83dc\", \"\u8c46\u8150\", \"\u7c89\u6761\", \"\u6728\u8033\", \"\u8471\", \"\u59dc\", \"\u849c\", \"\u9171\u6cb9\", \"\u6599\u9152\", \"\u76d0\", \"\u7cd6\"], \"recipe_name\": \"\u767d\u83dc\u7096\u7c89\u6761\"}, {\"ingredients\": [\"\u9e21\u8089\", \"\u9752\u6912\", \"\u7ea2\u6912\", \"\u8471\", \"\u59dc\", \"\u849c\", \"\u8fa3\u6912\", \"\u82b1\u6912\", \"\u9171\u6cb9\", \"\u6599\u9152\", \"\u76d0\", \"\u7cd6\"], \"recipe_name\": \"\u5bab\u4fdd\u9e21\u4e01\"}, {\"ingredients\": [\"\u571f\u8c46\", \"\u9752\u6912\", \"\u7ea2\u6912\", \"\u8471\", \"\u59dc\", \"\u849c\", \"\u9171\u6cb9\", \"\u918b\", \"\u76d0\", \"\u7cd6\"], \"recipe_name\": \"\u571f\u8c46\u4e1d\"}, {\"ingredients\": [\"\u756a\u8304\", \"\u9e21\u86cb\", \"\u8471\", \"\u59dc\", \"\u849c\", \"\u76d0\", \"\u7cd6\"], \"recipe_name\": \"\u756a\u8304\u7092\u86cb\"}, {\"ingredients\": [\"\u8304\u5b50\", \"\u8089\u672b\", \"\u8471\", \"\u59dc\", \"\u849c\", \"\u8c46\u74e3\u9171\", \"\u9171\u6cb9\", \"\u6599\u9152\", \"\u76d0\", \"\u7cd6\"], \"recipe_name\": \"\u9c7c\u9999\u8304\u5b50\"}] "
              }
            ],
            "role": "model"
.......
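With response_mime_type set to "application/json", the text part in the response above is plain JSON with no surrounding fence, so it can be parsed directly. A short sketch using one entry copied from that reply; is_recipe is a hypothetical shape check written for illustration, not an SDK feature:

```python
import json

# One entry copied from the reply above (JSON \u escapes left as returned).
reply_text = (
    r'[{"ingredients": ["\u756a\u8304", "\u9e21\u86cb", "\u8471", "\u59dc", '
    r'"\u849c", "\u76d0", "\u7cd6"], "recipe_name": "\u756a\u8304\u7092\u86cb"}] '
)


def is_recipe(item) -> bool:
    """Check that an item has the two fields declared in the Recipe TypedDict."""
    return (
        isinstance(item, dict)
        and isinstance(item.get("recipe_name"), str)
        and isinstance(item.get("ingredients"), list)
        and all(isinstance(i, str) for i in item["ingredients"])
    )


recipes = json.loads(reply_text)
assert all(is_recipe(r) for r in recipes)
for recipe in recipes:
    print(recipe["recipe_name"], len(recipe["ingredients"]))
```

In a live session the string would come from the response object (for example result.text) rather than from a literal.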

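The simplest way to define a schema is with type-hint annotations.

# Equivalent dict form of the configuration above. To use it, pass it as
# generation_config=generation_config when calling model.generate_content().
generation_config = {"response_mime_type": "application/json",
                     "response_schema": list[Recipe]}
print(result)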

Response

response:
GenerateContentResponse(
    done=True,
    iterator=None,
    result=protos.GenerateContentResponse({
      "candidates": [
        {
          "content": {
            "parts": [
              {
                "text": "```json\n[\n  {\n    \"recipe_name\": \"\u5bae\u4fdd\u96de\u4e01\",\n    \"ingredients\": [\n      \"\u96de\u8089\u4e01\",\n      \"\u82b1\u751f\",\n      \"\u4e7e\u8fa3\u6912\",\n      \"\u82b1\u6912\",\n      \"\u8591\",\n      \"\u849c\",\n      \"\u8525\",\n      \"\u91ac\u6cb9\",\n      \"\u918b\",\n      \"\u7cd6\",\n      \"\u592a\u767d\u7c89\"\n    ]\n  },\n  {\n    \"recipe_name\": \"\u9ebb\u5a46\u8c46\u8150\",\n    \"ingredients\": [\n      \"\u5ae9\u8c46\u8150\",\n      \"\u8c6c\u7d5e\u8089\",\n      \"\u8c46\u74e3\u91ac\",\n      \"\u8fa3\u6912\u91ac\",\n      \"\u82b1\u6912\u7c89\",\n      \"\u8591\",\n      \"\u849c\",\n      \"\u8525\",\n      \"\u91ac\u6cb9\",\n      \"\u7cd6\",\n      \"\u592a\u767d\u7c89\"\n    ]\n  },\n  {\n    \"recipe_name\": \"\u7cd6\u918b\u6392\u9aa8\",\n    \"ingredients\": [\n      \"\u8c6c\u5c0f\u6392\",\n      \"\u91ac\u6cb9\",\n      \"\u918b\",\n      \"\u7cd6\",\n      \"\u8591\",\n      \"\u849c\",\n      \"\u8525\",\n      \"\u592a\u767d\u7c89\"\n    ]\n  },\n  {\n    \"recipe_name\": \"\u9b5a\u9999\u8304\u5b50\",\n    \"ingredients\": [\n      \"\u8304\u5b50\",\n      \"\u8c6c\u7d5e\u8089\",\n      \"\u8c46\u74e3\u91ac\",\n      \"\u8fa3\u6912\u91ac\",\n      \"\u8591\",\n      \"\u849c\",\n      \"\u8525\",\n      \"\u91ac\u6cb9\",\n      \"\u918b\",\n      \"\u7cd6\"\n    ]\n  },\n  {\n    \"recipe_name\": \"\u9752\u6912\u8089\u7d72\",\n    \"ingredients\": [\n      \"\u8c6c\u8089\u7d72\",\n      \"\u9752\u6912\",\n      \"\u8591\",\n      \"\u849c\",\n      \"\u91ac\u6cb9\",\n      \"\u592a\u767d\u7c89\"\n    ]\n  }\n]\n```"
              }
            ],
            "role": "model"
          },
          
          ..............

Using an enum to constrain the output

import enum

import google.generativeai as genai
from typing_extensions import TypedDict

# Schema actually passed to the model below: a single free-form string field.
# Nothing in it refers to Choice, so it does not limit the reply to the enum values.
class ChoiceSchema(TypedDict):
    value: str

# Instrument families (values in Traditional Chinese).
class Choice(enum.Enum):
    PERCUSSION = "打擊樂"   # percussion
    STRING = "弦樂器"       # strings
    WOODWIND = "木管樂器"   # woodwind
    BRASS = "銅管樂器"      # brass
    KEYBOARD = "鍵盤"       # keyboard

model = genai.GenerativeModel("gemini-1.5-pro-latest")

# organ.jpg must already be present in the working directory.
organ = genai.upload_file(path="organ.jpg",
                          display_name="organ")
result = model.generate_content(
    ["這是什麼樂器:", organ],  # "What kind of instrument is this:"
    generation_config=genai.GenerationConfig(
        response_mime_type="text/x.enum", response_schema=ChoiceSchema
    ),
)

print(result)

Response

response:
GenerateContentResponse(
    done=True,
    iterator=None,
    result=protos.GenerateContentResponse({
      "candidates": [
        {
          "content": {
            "parts": [
              {
                "text": "\u9019\u4ef6\u6a02\u5668\u662f\u4e00\u53f0**\u7ba1\u98a8\u7434**\u3002"
              }
            ],
            "role": "model"
          },
................

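model = genai.GenerativeModel("gemini-1.5-pro-latest")

organ = genai.upload_file("organ.jpg")
result = model.generate_content(
    # "Please answer in Traditional Chinese: what kind of instrument is this:"
    ["請用繁體中文回答這是什麼樂器:", organ],
    generation_config=genai.GenerationConfig(
        response_mime_type="text/x.enum",
        # Spell the allowed values out directly in the schema.
        response_schema={
            "type": "STRING",
            # percussion, strings, woodwind, brass, keyboard
            "enum": ["打擊樂", "弦樂器", "木管樂器", "銅管樂器", "鍵盤"],
        },
    ),
)
print(result)  # 鍵盤 (keyboard)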

Response

response:
GenerateContentResponse(
    done=True,
    iterator=None,
    result=protos.GenerateContentResponse({
      "candidates": [
        {
          "content": {
            "parts": [
              {
                "text": " \u9375\u76e4"
              }
            ],
            "role": "model"
          },
          
          ............
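Even with an enum schema, it can be worth converting the reply back into a Python enum before using it. A minimal sketch, assuming the reply text is taken from the responses shown above (note the leading space in the constrained reply); parse_choice is a hypothetical helper, and Choice is shortened here to two members for brevity:

```python
import enum


class Choice(enum.Enum):
    PERCUSSION = "打擊樂"
    KEYBOARD = "鍵盤"


def parse_choice(reply_text: str) -> Choice:
    """Map the model's reply onto a Choice member, ignoring surrounding whitespace."""
    return Choice(reply_text.strip())


print(parse_choice(" 鍵盤"))  # Choice.KEYBOARD

try:
    # The free-form sentence from the unconstrained reply is not a valid member.
    parse_choice("這件樂器是一台**管風琴**。")
except ValueError as err:
    print("not an enum value:", err)
```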

Ask the model for a list of recipe titles, and use the Grade enum to give each title a grade.

import enum
from typing_extensions import TypedDict

class Grade(enum.Enum):
    A_PLUS = "a+"
    A = "a"
    B = "b"
    C = "c"
    D = "d"
    F = "f"

# Shape of one list item. The request below spells out the same shape
# as an explicit schema dictionary.
class Recipe(TypedDict):
    grade: Grade
    recipe_name: str

model = genai.GenerativeModel("gemini-1.5-pro-latest")

result = model.generate_content(
    # "List about 10 Chinese recipes and grade them by popularity."
    "列出大約 10 種中餐食譜,根據受歡迎程度對它們進行評分",
    generation_config=genai.GenerationConfig(
        response_mime_type="application/json",
        response_schema={
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    # Only the values of the Grade enum are allowed here.
                    "grade": {"type": "string", "enum": [e.value for e in Grade]},
                    "recipe_name": {"type": "string"},
                },
            },
        }
    ),
    request_options={"timeout": 600},
)
print(result)  # the text part holds e.g. [{"grade": "a+", "recipe_name": "Mapo Tofu"}, ...]

Response

response:
GenerateContentResponse(
    done=True,
    iterator=None,
    result=protos.GenerateContentResponse({
      "candidates": [
        {
          "content": {
            "parts": [
              {
                "text": "[{\"grade\": \"a+\", \"recipe_name\": \"Kung Pao Chicken\"}, {\"grade\": \"a+\", \"recipe_name\": \"Mapo Tofu\"}, {\"grade\": \"a\", \"recipe_name\": \"Chow Mein\"}, {\"grade\": \"a\", \"recipe_name\": \"Sweet and Sour Pork\"}, {\"grade\": \"a\", \"recipe_name\": \"Dumplings\"}, {\"grade\": \"a\", \"recipe_name\": \"Spring Rolls\"}, {\"grade\": \"b\", \"recipe_name\": \"Peking Duck\"}, {\"grade\": \"b\", \"recipe_name\": \"Wontons\"}, {\"grade\": \"b\", \"recipe_name\": \"Fried Rice\"}, {\"grade\": \"b\", \"recipe_name\": \"Lo Mein\"}] "
              }
            ],
            "role": "model"
          },
          ...........
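The graded list can be handled in the same spirit: parse the JSON, then convert each grade string into the Grade enum so that an unexpected value fails loudly. A short sketch using the first three entries of the reply above; group_by_grade is a hypothetical helper:

```python
import enum
import json
from collections import defaultdict


class Grade(enum.Enum):
    A_PLUS = "a+"
    A = "a"
    B = "b"
    C = "c"
    D = "d"
    F = "f"


def group_by_grade(reply_text: str) -> dict:
    """Parse the reply and group recipe names under their Grade member."""
    groups = defaultdict(list)
    for item in json.loads(reply_text):
        grade = Grade(item["grade"])  # raises ValueError for an unknown grade
        groups[grade].append(item["recipe_name"])
    return dict(groups)


reply_text = (
    '[{"grade": "a+", "recipe_name": "Kung Pao Chicken"}, '
    '{"grade": "a+", "recipe_name": "Mapo Tofu"}, '
    '{"grade": "a", "recipe_name": "Chow Mein"}]'
)
groups = group_by_grade(reply_text)
for grade, names in groups.items():
    print(grade.name, names)
```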
