Day 25 - Ray Tune

2023 iThome Ironman Contest

DAY 25

AI & Data

MLOps/LLMOps - From Scratch series, Part 25


15th Ironman Contest

jimmyliao

2023-10-10 00:17:14


Why Ray Tune

The official documentation lists several reasons:

Cutting-Edge Optimization Algorithms
First-class Developer Productivity
Multi-GPU & Distributed Training

Multi-GPU and distributed training support alone makes it well worth trying.

Installation

pip install "ray[tune]"

A simple example

from ray import train, tune


def objective(config):  # ①
    score = config["a"] ** 2 + config["b"]
    return {"score": score}


search_space = {  # ②
    "a": tune.grid_search([0.001, 0.01, 0.1, 1.0]),
    "b": tune.choice([1, 2, 3]),
}

tuner = tune.Tuner(objective, param_space=search_space)  # ③

results = tuner.fit()
print(results.get_best_result(metric="score", mode="min").config)

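A few notes:
① Define an objective function, i.e. the quantity we want to optimize.
② Define a search space, i.e. the range of parameter values to search over.
③ Start a Tune run and print the best configuration found.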
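
To make the search space concrete, here is a minimal plain-Python sketch of what the run above does, assuming Tune's default of one sample per grid point: `tune.grid_search` tries every listed value of `a`, while `tune.choice` draws `b` at random for each trial. The helper `run_trials` and the fixed seed are illustrative only and are not part of Ray's API.

```python
import random


def objective(config):
    # Same objective as above: smaller is better.
    return {"score": config["a"] ** 2 + config["b"]}


def run_trials(grid_a, choices_b, seed=0):
    """Hypothetical stand-in for tuner.fit(): one trial per grid value of a."""
    rng = random.Random(seed)
    trials = []
    for a in grid_a:  # grid_search: every value is tried exactly once
        config = {"a": a, "b": rng.choice(choices_b)}  # choice: random draw
        trials.append((config, objective(config)["score"]))
    return trials


trials = run_trials([0.001, 0.01, 0.1, 1.0], [1, 2, 3])
best_config, best_score = min(trials, key=lambda t: t[1])  # mode="min"
print(len(trials), best_config, best_score)
```

Because `b` is sampled rather than enumerated, four trials do not cover all twelve combinations; in Ray Tune itself, raising `num_samples` in `tune.TuneConfig` repeats the grid with fresh random draws.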

Next, a PyTorch example:

import torch
from ray import train, tune
from ray.tune.search.optuna import OptunaSearch


def objective(config):  # ①
    # load_data, ConvNet, train_model and test are helpers you define yourself.
    train_loader, test_loader = load_data()  # Load some data
    model = ConvNet().to("cpu")  # Create a PyTorch conv net
    optimizer = torch.optim.SGD(  # Tune the optimizer
        model.parameters(), lr=config["lr"], momentum=config["momentum"]
    )

    while True:
        # Named train_model so it does not clash with the imported ray.train module.
        train_model(model, optimizer, train_loader)  # Train the model
        acc = test(model, test_loader)  # Compute test accuracy
        train.report({"mean_accuracy": acc})  # Report to Tune


search_space = {"lr": tune.loguniform(1e-4, 1e-2), "momentum": tune.uniform(0.1, 0.9)}
algo = OptunaSearch()  # ②

tuner = tune.Tuner(  # ③
    objective,
    tune_config=tune.TuneConfig(
        metric="mean_accuracy",
        mode="max",
        search_alg=algo,
    ),
    run_config=train.RunConfig(
        stop={"training_iteration": 5},
    ),
    param_space=search_space,
)
results = tuner.fit()
print("Best config is:", results.get_best_result().config)

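① Wrap the PyTorch model and define an objective function, i.e. the quantity we want to optimize.
② Define a search space, i.e. the range of parameter values to search over, and initialize a search algorithm (OptunaSearch requires the optuna package to be installed).
③ Start a Tune run, stop each trial after 5 iterations, and print the best configuration found.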
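
Two details of this example are easy to miss. `tune.loguniform(1e-4, 1e-2)` samples uniformly in log space, so each decade of learning rates is about equally likely, and the `while True` loop ends only because Tune stops the trial once the `training_iteration` limit is reached. The plain-Python sketch below imitates both behaviours without Ray; `sample_loguniform` and `run_trial` are hypothetical helpers written for illustration, not Ray functions.

```python
import math
import random


def sample_loguniform(low, high, rng):
    """Draw a value whose logarithm is uniform on [log(low), log(high)]."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


def run_trial(step_fn, max_iterations):
    """Call step_fn until the budget is used up, like stop={"training_iteration": N}."""
    reports = []
    iteration = 0
    while True:  # the trainable itself never decides to stop
        iteration += 1
        reports.append(step_fn(iteration))  # stands in for train.report(...)
        if iteration >= max_iterations:  # the stopping rule lives outside the objective
            break
    return reports


rng = random.Random(0)
lrs = [sample_loguniform(1e-4, 1e-2, rng) for _ in range(1000)]
below_1e3 = sum(lr < 1e-3 for lr in lrs) / len(lrs)  # share of draws in the lower decade
reports = run_trial(lambda i: {"training_iteration": i}, max_iterations=5)
print(round(below_1e3, 2), len(reports))
```

Keeping the stopping rule in `RunConfig` rather than in the objective means the same training function can be reused with a different budget or with a scheduler that stops trials early.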

Ray Tune will come up again later in the series, when we integrate with Databricks or MLflow.
