MLOps/LLMOps - From Scratch series, part 24

Day24 - Ray Serve

In ML/MLOps work, we often need to deploy trained models to a production environment so that others can use them through an API. Ray Serve is a framework that lets us deploy ML models quickly.

As the previous post showed, running code with ray remote lets us execute it across multiple nodes and gain a significant performance boost. The goal of Ray Serve is to apply that same distributed computing capability to ML model deployment.
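As a quick recap of the previous post, here is a minimal sketch of ray remote (the square function is just for illustration):

import ray

ray.init()


# A plain function becomes a distributed task once decorated with @ray.remote.
@ray.remote
def square(x: int) -> int:
    return x * x


# .remote() schedules each task on the cluster and returns a future;
# ray.get() blocks until the results are ready.
futures = [square.remote(i) for i in range(4)]
print(ray.get(futures))  # [0, 1, 4, 9]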

Quick Start

pip install "ray[serve]"

A simple HTTP server:

import requests
from starlette.requests import Request
from typing import Dict

from ray import serve


# 1: Define a Ray Serve application.
@serve.deployment(route_prefix="/")
class MyModelDeployment:
    def __init__(self, msg: str):
        # Initialize model state: could be very large neural net weights.
        self._msg = msg

    def __call__(self, request: Request) -> Dict:
        return {"result": self._msg}


app = MyModelDeployment.bind(msg="Hello world!")

# 2: Deploy the application locally.
serve.run(app)

# 3: Query the application and print the result.
print(requests.get("http://localhost:8000/").json())
# {'result': 'Hello world!'}
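A deployment runs a single replica by default. Below is a minimal sketch of scaling it out, assuming the same Ray Serve version as the example above (the name ScaledModelDeployment and the /scaled prefix are just for illustration): num_replicas starts several copies of the class, and Ray Serve load-balances incoming requests across them.

import requests
from starlette.requests import Request
from typing import Dict

from ray import serve


# num_replicas starts two copies of the deployment;
# ray_actor_options reserves resources for each replica.
@serve.deployment(
    route_prefix="/scaled",
    num_replicas=2,
    ray_actor_options={"num_cpus": 0.5},
)
class ScaledModelDeployment:
    def __init__(self, msg: str):
        self._msg = msg

    def __call__(self, request: Request) -> Dict:
        return {"result": self._msg}


serve.run(ScaledModelDeployment.bind(msg="Hello from a replica!"))

print(requests.get("http://localhost:8000/scaled").json())
# {'result': 'Hello from a replica!'}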

Advanced usage: FastAPI

FastAPI is a framework for building API servers quickly, and Ray Serve integrates with it, so we can stand up a full-featured API server in just a few lines.

import requests
from fastapi import FastAPI
from ray import serve

# 1: Define a FastAPI app and wrap it in a deployment with a route handler.
app = FastAPI()


@serve.deployment(route_prefix="/")
@serve.ingress(app)
class FastAPIDeployment:
    # FastAPI will automatically parse the HTTP request for us.
    @app.get("/hello")
    def say_hello(self, name: str) -> str:
        return f"Hello {name}!"


# 2: Deploy the deployment.
serve.run(FastAPIDeployment.bind())

# 3: Query the deployment and print the result.
print(requests.get("http://localhost:8000/hello", params={"name": "Theodore"}).json())
# "Hello Theodore!"

Advanced usage: serving multiple model deployments at once

import requests
from starlette.requests import Request
from typing import Dict
from ray import serve
from ray.serve.handle import DeploymentHandle


# 1. Define the models in our composition graph and an ingress that calls them.
@serve.deployment
class Adder:
    def __init__(self, increment: int):
        self.increment = increment

    def add(self, inp: float) -> float:
        return self.increment + inp


@serve.deployment
class Combiner:
    def average(self, *inputs) -> float:
        return sum(inputs) / len(inputs)


@serve.deployment
class Ingress:
    def __init__(self, adder1, adder2, combiner):
        self._adder1: DeploymentHandle = adder1.options(use_new_handle_api=True)
        self._adder2: DeploymentHandle = adder2.options(use_new_handle_api=True)
        self._combiner: DeploymentHandle = combiner.options(use_new_handle_api=True)

    async def __call__(self, request: Request) -> Dict[str, float]:
        input_json = await request.json()
        final_result = await self._combiner.average.remote(
            self._adder1.add.remote(input_json["val"]),
            self._adder2.add.remote(input_json["val"]),
        )
        return {"result": final_result}


# 2. Build the application consisting of the models and ingress.
app = Ingress.bind(Adder.bind(increment=1), Adder.bind(increment=2), Combiner.bind())
serve.run(app)

# 3: Query the application and print the result.
print(requests.post("http://localhost:8000/", json={"val": 100.0}).json())
# {"result": 101.5}

Conclusion

The examples above can be combined. For instance, you can use the FastAPI framework as the API server while Ray Serve deploys the ML models behind it, and you can serve several different models at once, as sketched below. Pretty convenient, isn't it?
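Here is a minimal sketch of that combination, assuming the same Ray Serve version as the examples above (the name ApiIngress and the /add route are just for illustration): a FastAPI ingress forwards each request to another deployment through a DeploymentHandle.

import requests
from fastapi import FastAPI
from ray import serve
from ray.serve.handle import DeploymentHandle

app = FastAPI()


@serve.deployment
class Adder:
    def __init__(self, increment: int):
        self.increment = increment

    def add(self, inp: float) -> float:
        return self.increment + inp


@serve.deployment(route_prefix="/")
@serve.ingress(app)
class ApiIngress:
    def __init__(self, adder):
        self._adder: DeploymentHandle = adder.options(use_new_handle_api=True)

    # FastAPI parses the query parameter; the handle call returns a
    # future that we await for the final result.
    @app.get("/add")
    async def add(self, val: float) -> dict:
        result = await self._adder.add.remote(val)
        return {"result": result}


serve.run(ApiIngress.bind(Adder.bind(increment=1)))

print(requests.get("http://localhost:8000/add", params={"val": 100.0}).json())
# {"result": 101.0}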
