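Logistic Regression can be used to solve classification problems, such as predicting whether a person has diabetes from features like age, blood sugar, and weight.
The Sigmoid Function (an S-shaped function) maps the output of the linear model to a value between 0 and 1, so the prediction fits binary labels better than a straight regression line does.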
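For reference, the standard definition of the sigmoid and the way logistic regression applies it can be written as below. Here w is the weight vector, b the bias, and x one row of features; these play the same roles as `w`, `b`, and the rows of `x_train` in the code of this notebook.

```latex
z = \mathbf{w} \cdot \mathbf{x} + b, \qquad
\sigma(z) = \frac{1}{1 + e^{-z}}, \qquad 0 < \sigma(z) < 1
```

Because the output always lies strictly between 0 and 1, it can be read as the predicted probability of the positive class.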
import pandas as pd
url = "https://raw.githubusercontent.com/GrandmaCan/ML/main/Classification/Diabetes_Data.csv"
data = pd.read_csv(url)
data
data["Gender"] = data["Gender"].map({"男生": 1, "女生": 0})
data
from sklearn.model_selection import train_test_split
x = data[["Age", "Weight", "BloodSugar", "Gender"]]
y = data["Diabetes"]
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=87)
x_train = x_train.to_numpy()
x_test = x_test.to_numpy()
Data standardization
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
scaler.fit(x_train)
x_train = scaler.transform(x_train)
x_test = scaler.transform(x_test)
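To make the standardization step more concrete, here is a minimal NumPy sketch of the same idea: subtract the training mean and divide by the training standard deviation, column by column. The arrays `train_demo` and `test_demo` are made-up values for illustration, not rows from Diabetes_Data.csv, and the sketch assumes the population standard deviation (`ddof=0`), which is what `StandardScaler` uses.

```python
import numpy as np

# Hypothetical toy data: 3 training rows and 1 test row, 2 features each
train_demo = np.array([[30.0, 60.0],
                       [50.0, 80.0],
                       [70.0, 100.0]])
test_demo = np.array([[50.0, 70.0]])

# Statistics come from the training set only
mean = train_demo.mean(axis=0)
std = train_demo.std(axis=0)  # population standard deviation (ddof=0)

# The same mean and std are applied to both sets
train_scaled = (train_demo - mean) / std
test_scaled = (test_demo - mean) / std

print(train_scaled.mean(axis=0))  # close to 0 for every column
print(train_scaled.std(axis=0))   # close to 1 for every column
print(test_scaled)
```

Fitting on the training set only, then reusing those statistics on the test set, keeps information about the test rows out of the preprocessing step.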
import numpy as np
def sigmoid(z):
    # Squash any real-valued score z into the range (0, 1)
    return 1/(1+np.exp(-z))
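# Example starting parameters: one weight per feature (Age, Weight, BloodSugar, Gender) plus a bias
w = np.array([1, 2, 3, 4])
b = 1
# Linear score for every training row: z = w . x + b
z = (w*x_train).sum(axis=1) + b
# Sigmoid output for each training row, a value between 0 and 1
sigmoid(z)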
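The sigmoid outputs above are probabilities, not class labels yet. One common way to turn them into a 0/1 prediction is to apply a cutoff. The sketch below assumes a threshold of 0.5 and uses a made-up array `x_demo` in place of the real standardized `x_train`. It also shows that the element-wise multiply-and-sum form of `z` gives the same result as a matrix-vector product.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Hypothetical standardized features: 3 samples x 4 features
x_demo = np.array([[ 0.5, -1.0,  0.2,  1.0],
                   [-1.2,  0.3, -0.7, -1.0],
                   [ 0.0,  0.0,  0.0,  0.0]])
w = np.array([1, 2, 3, 4])
b = 1

# Two equivalent ways to compute the linear score for every row
z_sum = (w * x_demo).sum(axis=1) + b
z_dot = x_demo @ w + b

probs = sigmoid(z_dot)

# Assumed decision rule: predict 1 when the probability is at least 0.5
labels = (probs >= 0.5).astype(int)

print(probs)
print(labels)
```

The 0.5 cutoff is only a default choice; a different threshold trades false positives against false negatives.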