Scikit-learn (formerly scikits.learn) is a free software machine learning library for the Python programming language. It features various classification, regression and clustering algorithms, including support vector machines, random forests, gradient boosting, k-means and DBSCAN, and is designed to interoperate with the Python numerical and scientific libraries NumPy and SciPy.
Sample code (taken and adapted from the link above):
>>> from sklearn import linear_model  # import the library
>>> reg = linear_model.LinearRegression()  # create the linear model
>>> reg.fit([[0, 0], [1, 1], [2, 2]], [0, 1, 2])  # train on X, y
...
LinearRegression(copy_X=True, fit_intercept=True, n_jobs=None,
normalize=False)
>>> reg.coef_  # inspect the coefficients (weights)
array([0.5, 0.5])
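The sample above stops at the coefficients. As a minimal sketch of the next step, the same toy data can be fitted and then used for a prediction. The query point [3, 3] is an assumption chosen for illustration and is not part of the original sample.

```python
import numpy as np
from sklearn import linear_model

# Same toy data as the sample above
X = np.array([[0, 0], [1, 1], [2, 2]])
y = np.array([0, 1, 2])

reg = linear_model.LinearRegression()
reg.fit(X, y)

# Predict for a new point (hypothetical query, not from the original notes)
pred = reg.predict(np.array([[3, 3]]))

print('weights:', reg.coef_)
print('bias:', reg.intercept_)
print('prediction for [3, 3]:', pred)
```

Because the two feature columns are identical, the fit splits the weight evenly between them, which is why both coefficients come out as 0.5.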
MyCode - dynamic model / final stage
import pandas as pd
from sklearn import linear_model
from sklearn import preprocessing # scaling option 1 (standardize)
from sklearn.preprocessing import MinMaxScaler # scaling option 2 (min-max)
scaler = MinMaxScaler() # scaling option 2 (min-max)
import time
today = time.strftime("%Y-%m-%d",time.localtime())
filename = '/home/turningpoint1125/'+today+'.csv'
df = pd.read_csv(filename)
print('Number of rows:',len(df))
#df_normalize = preprocessing.scale(df.drop(['Time','GPSTime'],axis='columns')) # scaling option 1
#df_normalize = scaler.fit_transform(df.drop(['Time','GPSTime'],axis='columns')) # scaling option 2
#print(df_normalize)
reg = linear_model.LinearRegression()
reg.fit(df.drop(['Time','GPSTime'],axis='columns'),df.Time)
#reg.fit(df_normalize,df.Time) # scaling options 1 and 2
#print('R^2:',reg.score(df_normalize,df.Time))
print('R^2:',reg.score(df.drop(['Time','GPSTime'],axis='columns'),df.Time))
print('weight:',reg.coef_ )
print('bias',reg.intercept_ )
import csv
# Append today's four weights, bias, and R^2 to the running log
with open('/home/turningpoint1125/daily_log.csv','a',encoding='utf8',newline='') as fd :
    writer = csv.writer(fd)
    r2 = reg.score(df.drop(['Time','GPSTime'],axis='columns'),df.Time)
    writer.writerow([*map(float, reg.coef_[:4]), float(reg.intercept_), float(r2)])
import smtplib
from email.mime.text import MIMEText
gmail_user = 'turningpoint1125@gmail.com'
gmail_password = 'XXX' # your gmail password
LatW = str(reg.coef_[0])
LonW = str(reg.coef_[1])
DisW = str(reg.coef_[2])
SpeW = str(reg.coef_[3])
r2 = reg.score(df.drop(['Time','GPSTime'],axis='columns'),df.Time)
context = (f'Lat: {LatW}\nLon: {LonW}\nDis: {DisW}\nSpeed: {SpeW}\n'
           f'Bias: {reg.intercept_}\nR^2: {r2}\n')
msg = MIMEText(context)
msg['Subject'] = 'Good Night!'
msg['From'] = 'turningpoint1125@gmail.com'
msg['To'] = 'turningpoint1125@gmail.com'
server = smtplib.SMTP_SSL('smtp.gmail.com', 465)
server.ehlo()
server.login(gmail_user, gmail_password)
server.send_message(msg)
server.quit()
print('Email sent!')
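The two scaling options that are commented out in the script can be tried without touching the daily CSV. The sketch below is not the script's actual data: it builds a synthetic DataFrame whose feature column names (Lat, Lon, Dis, Speed) are assumptions inferred from the email labels, and it chains MinMaxScaler with LinearRegression through make_pipeline so the scaler is always applied before the fit.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for the daily CSV (column names and values are assumptions)
rng = np.random.default_rng(0)
df = pd.DataFrame({
    'Lat': rng.uniform(24.0, 25.0, 50),
    'Lon': rng.uniform(121.0, 122.0, 50),
    'Dis': rng.uniform(0.0, 10.0, 50),
    'Speed': rng.uniform(1.0, 60.0, 50),
})
df['Time'] = 3.0 * df['Dis'] + 0.1 * df['Speed'] + rng.normal(0.0, 0.1, 50)

X = df.drop(columns=['Time'])
y = df['Time']

# Fit once on raw features and once with min-max scaling in front
raw = LinearRegression().fit(X, y)
scaled = make_pipeline(MinMaxScaler(), LinearRegression()).fit(X, y)

r2_raw = raw.score(X, y)
r2_scaled = scaled.score(X, y)
print('R^2 raw:', r2_raw)
print('R^2 scaled:', r2_scaled)
print('raw weights:', raw.coef_)
print('scaled weights:', scaled[-1].coef_)
```

Min-max scaling is an affine transformation of each feature, so ordinary least squares with an intercept gives the same predictions and the same R^2 either way. What changes is the scale of the weights: after scaling they are easier to compare across features, since every feature spans the same range.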
Training: model.fit
Prediction: model.predict
Parameters: model.coef_ (weights), model.intercept_ (bias)
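Evaluation: model.score(data_X, data_y) evaluates the model using R^2 (the coefficient of determination). For a regressor this measures goodness of fit rather than classification accuracy.
The closer the score is to 1, the better the fit.
R^2: https://en.wikipedia.org/wiki/Coefficient_of_determination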
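To make the score concrete, R^2 can be computed by hand as 1 - SS_res / SS_tot, where SS_res is the sum of squared residuals and SS_tot is the sum of squared deviations of y from its mean. The sketch below uses a small made-up dataset (an assumption for illustration) and compares the manual value with model.score.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Small made-up dataset, for illustration only
X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([1.1, 1.9, 3.2, 3.8])

model = LinearRegression().fit(X, y)
y_pred = model.predict(X)

# R^2 = 1 - SS_res / SS_tot
ss_res = np.sum((y - y_pred) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r2_manual = 1.0 - ss_res / ss_tot

r2_from_score = model.score(X, y)
print('manual R^2:', r2_manual)
print('model.score:', r2_from_score)
```

Both values should agree up to floating-point error. On data the model has not seen, R^2 can be negative, which means the model fits worse than always predicting the mean of y.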
Ref:
sklearn 常用属性与功能 (Common sklearn attributes and functions)