11th iThome Ironman Contest

AI & Data

Predicting Inter Bus Arrival Times series, part 27

Day 27 sklearn

Scikit-learn (formerly scikits.learn) is a free software machine learning library for the Python programming language. It features various classification, regression and clustering algorithms, including support vector machines, random forests, gradient boosting, k-means and DBSCAN, and is designed to interoperate with the Python numerical and scientific libraries NumPy and SciPy.

Generalized Linear Models

Sample code (taken and adapted from the link above):

>>> from sklearn import linear_model     # import the library
>>> reg = linear_model.LinearRegression() # create a linear regression model
>>> reg.fit([[0, 0], [1, 1], [2, 2]], [0, 1, 2]) # fit the model on X, y
...                                       
LinearRegression(copy_X=True, fit_intercept=True, n_jobs=None,
                 normalize=False)
>>> reg.coef_ # inspect the coefficients (weights)
array([0.5, 0.5])
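The two feature columns in the sample are identical, which is why the fitted weight is split evenly between them. The sketch below repeats the same fit and then calls `predict` on a new point; the input `[[3, 3]]` is an illustrative assumption and is not part of the original sample.

```python
import numpy as np
from sklearn import linear_model

# Same toy data as the sample above: both feature columns are identical.
X = [[0, 0], [1, 1], [2, 2]]
y = [0, 1, 2]

reg = linear_model.LinearRegression()
reg.fit(X, y)

# Predict for a new point (hypothetical input, not from the original post).
pred = reg.predict([[3, 3]])

print("weights:", reg.coef_)
print("bias:", reg.intercept_)
print("prediction for [3, 3]:", pred)
```

Because the model is y = 0.5*x1 + 0.5*x2 with a bias of essentially zero, the prediction for [3, 3] should come out very close to 3.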

MyCode - dynamic model / final stage

import pandas as pd
from sklearn import linear_model

from sklearn import preprocessing                # scaling option 1 (standardization)
from sklearn.preprocessing import MinMaxScaler   # scaling option 2 (min-max)
scaler = MinMaxScaler()                          # scaling option 2 (min-max)

import time
today = time.strftime("%Y-%m-%d",time.localtime())
filename = '/home/turningpoint1125/'+today+'.csv'
df = pd.read_csv(filename)
print('Number of rows:',len(df))


#df_normalize = preprocessing.scale(df.drop(['Time','GPSTime'],axis='columns'))  # scaling option 1
#df_normalize = scaler.fit_transform(df.drop(['Time','GPSTime'],axis='columns')) # scaling option 2
#print(df_normalize)

reg = linear_model.LinearRegression()
reg.fit(df.drop(['Time','GPSTime'],axis='columns'),df.Time)
#reg.fit(df_normalize,df.Time)  # use with scaling option 1 or 2
#print('R^2:',reg.score(df_normalize,df.Time))

print('R^2:',reg.score(df.drop(['Time','GPSTime'],axis='columns'),df.Time))
print('weight:',reg.coef_ )
print('bias',reg.intercept_ )

import csv
with open('/home/turningpoint1125/daily_log.csv','a',encoding='utf8',newline='') as fd :
    writer = csv.writer(fd)
    # Index each coefficient directly: float() on a one-element array slice is deprecated in recent NumPy
    writer.writerow([float(reg.coef_[0]),float(reg.coef_[1]),float(reg.coef_[2]),float(reg.coef_[3]),float(reg.intercept_ ),float(reg.score(df.drop(['Time','GPSTime'],axis='columns'),df.Time))])
    
import smtplib
from email.mime.text import MIMEText
gmail_user = 'turningpoint1125@gmail.com'
gmail_password = 'XXX' # your gmail password

# Use scalar indexing so each weight is written as a plain number rather than a one-element array
LatW =  str(reg.coef_[0])
LonW =  str(reg.coef_[1])
DisW =  str(reg.coef_[2])
SpeW =  str(reg.coef_[3])
context = 'Lat: '+ LatW + '\n' + 'Lon: '+LonW+'\n'+'Dis: '+DisW+'\n'+'Speed: '+SpeW+'\n'+'Bias: '+str(reg.intercept_)+'\n'+'R^2: '+str(reg.score(df.drop(['Time','GPSTime'],axis='columns'),df.Time))+'\n'
 
msg = MIMEText(context)
msg['Subject'] = 'Good Night!'
msg['From'] = 'turningpoint1125@gmail.com'
msg['To'] = 'turningpoint1125@gmail.com'

server = smtplib.SMTP_SSL('smtp.gmail.com', 465)
server.ehlo()
server.login(gmail_user, gmail_password)
server.send_message(msg)
server.quit()

print('Email sent!')  
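The script above leaves two feature-scaling options commented out (`preprocessing.scale` and `MinMaxScaler`). The following sketch compares them on synthetic data. The column names `Lat`, `Lon`, `Dis`, `Speed` are assumptions inferred from the weight labels in the email body, and the generated values are made up for illustration; they are not the author's CSV.

```python
import numpy as np
import pandas as pd
from sklearn import linear_model, preprocessing
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for the daily CSV; column names and values are assumptions.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "Lat": rng.uniform(24.9, 25.1, 50),
    "Lon": rng.uniform(121.4, 121.6, 50),
    "Dis": rng.uniform(0, 5000, 50),
    "Speed": rng.uniform(0, 60, 50),
})
df["Time"] = 0.1 * df["Dis"] - 2.0 * df["Speed"] + rng.normal(0, 5, 50)

X = df.drop(["Time"], axis="columns")
y = df["Time"]

def r2_of(features):
    """Fit a fresh LinearRegression on the given features and return its training R^2."""
    model = linear_model.LinearRegression()
    model.fit(features, y)
    return model.score(features, y)

r2_raw = r2_of(X)                                    # no scaling
r2_standard = r2_of(preprocessing.scale(X))          # option 1: zero mean, unit variance
r2_minmax = r2_of(MinMaxScaler().fit_transform(X))   # option 2: rescale each column to [0, 1]

print("R^2 raw:", r2_raw)
print("R^2 standardized:", r2_standard)
print("R^2 min-max:", r2_minmax)
```

For ordinary least squares with an intercept, rescaling each feature changes the weights but not the fitted values, so the three R^2 scores should agree up to floating-point error. Scaling mainly helps here by making the weights comparable across features with very different units.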

Training: model.fit
Prediction: model.predict
Parameters:

  • model.coef_ :: slope (weights)
  • model.intercept_ :: intercept (bias)
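The calls and attributes listed above can be sketched on a tiny example. The data below is a hypothetical noise-free line y = 2x + 1, chosen only so the recovered slope and intercept are easy to check.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical noise-free data on the line y = 2x + 1.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = 2.0 * X[:, 0] + 1.0

model = LinearRegression()
model.fit(X, y)                  # training
pred = model.predict([[4.0]])    # prediction for a new input

slope = model.coef_[0]           # one weight per feature column
intercept = model.intercept_     # note the trailing underscore

print("slope:", slope)
print("intercept:", intercept)
print("prediction at x=4:", pred[0])
```

Attributes that scikit-learn learns during `fit` end with an underscore, which is why the intercept is read from `intercept_` rather than `intercept`.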

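Evaluation: model.score(data_X, data_y) evaluates the model with R^2 (the coefficient of determination) and returns that score. For a regressor this is a goodness-of-fit measure, not a classification accuracy.
The closer the score is to 1, the better the model fits the data.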
R^2 : https://en.wikipedia.org/wiki/Coefficient_of_determination
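To make the definition concrete, the sketch below computes R^2 by hand as 1 - SS_res / SS_tot and compares it with `model.score`. The data is randomly generated for illustration and is an assumption, not the bus data from this series.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical noisy data around the line y = 3x.
rng = np.random.default_rng(42)
X = rng.uniform(0, 10, size=(30, 1))
y = 3.0 * X[:, 0] + rng.normal(0, 2.0, size=30)

model = LinearRegression().fit(X, y)
y_hat = model.predict(X)

ss_res = np.sum((y - y_hat) ** 2)      # residual sum of squares
ss_tot = np.sum((y - y.mean()) ** 2)   # total sum of squares around the mean
r2_manual = 1.0 - ss_res / ss_tot

r2_sklearn = model.score(X, y)

print("manual R^2:", r2_manual)
print("model.score:", r2_sklearn)
```

Both values are computed on the training data here, as in the script above; scoring on held-out data would give a more honest picture of how well the model predicts.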

Ref :
sklearn 常用属性与功能 (Commonly used sklearn attributes and functions)

