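My machine learning report for school happens to require training a model myself, so I'm putting it together over the next day or two.
Using the concepts I learned last time,
along with my earlier practice with pandas, I started by getting the data into shape.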
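import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder

# Load the data
data = pd.read_csv('sleep and psychological effects.csv')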
# Encode the label column as integers
label_encoder = LabelEncoder()
data['Mood_Impact'] = label_encoder.fit_transform(data['Mood_Impact'])
# One-hot encode the categorical features (drop_first=True drops one dummy column per feature)
categorical_columns = ['Gender', 'Favorite_Book_Genre']
data = pd.get_dummies(data, columns=categorical_columns, drop_first=True)
# Define the features and the target variable
X = data.drop(columns=['User_ID', 'Mood_Impact'])
y = data['Mood_Impact']
# Split the data into training and test sets (80% / 20%)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=1)
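To see what these steps actually do to the columns, here is a minimal sketch that runs the same pipeline on a tiny made-up table. The column names Gender, Favorite_Book_Genre, User_ID and Mood_Impact come from the code above. Everything else is an assumption for illustration only: the five rows, the Sleep_Hours column, and the label values Positive / Neutral / Negative are not taken from the real CSV.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder

# Hypothetical toy rows: the values (and the Sleep_Hours column) are made up.
toy = pd.DataFrame({
    "User_ID": [1, 2, 3, 4, 5],
    "Gender": ["Male", "Female", "Female", "Male", "Female"],
    "Favorite_Book_Genre": ["Fiction", "Sci-Fi", "Fiction", "Mystery", "Sci-Fi"],
    "Sleep_Hours": [7.5, 6.0, 8.0, 5.5, 7.0],
    "Mood_Impact": ["Positive", "Negative", "Neutral", "Negative", "Positive"],
})

# LabelEncoder assigns integer codes in sorted order of the class names.
encoder = LabelEncoder()
toy["Mood_Impact"] = encoder.fit_transform(toy["Mood_Impact"])
label_map = {str(name): code for code, name in enumerate(encoder.classes_)}

# drop_first=True removes the first (sorted) category of each column,
# so "Female" and "Fiction" become the implicit baseline.
toy = pd.get_dummies(toy, columns=["Gender", "Favorite_Book_Genre"], drop_first=True)
feature_columns = [c for c in toy.columns if c not in ("User_ID", "Mood_Impact")]

X = toy[feature_columns]
y = toy["Mood_Impact"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=1)

print(label_map)
print(feature_columns)
print(len(X_train), len(X_test))
```

Keeping the fitted encoder around is handy later on: encoder.inverse_transform turns predicted integers back into the original label names.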
That's the data preparation part done for today :)