Time to start another new application: from day11 through day18 we completed the NLP course, and next up is Sequences, Time Series and Prediction.
Trend, seasonality, autocorrelation, and noise (the terms in the title) are common characteristics of time series data: for example, the US dollar has recently been trending upward, futures show seasonal cycles, and stock prices fluctuate every day, even every minute and second. These notions are also familiar from image and audio processing. So beyond the traditional theory and tools, what can machine learning add? Let's look at an example. As shown in the figure below, this time series has a slowly rising trend, seasonal variation, and noise; we take points 0 to 999 as training data and everything from 1000 onward as validation data:
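The figure itself isn't reproduced here, but a minimal sketch of how such a synthetic series could be generated follows (the slope, period, amplitude, and noise level are illustrative assumptions, not the course's exact constants):
import numpy as np

def make_series(n_steps=1461, slope=0.05, period=365, amplitude=40,
                noise_level=5, seed=42):
    """Synthetic series: slow upward trend + seasonal cycle + noise."""
    time = np.arange(n_steps, dtype="float32")
    trend = slope * time
    # A repeating seasonal pattern with the given period
    season = amplitude * np.sin(2 * np.pi * (time % period) / period)
    noise = noise_level * np.random.RandomState(seed).randn(n_steps)
    return time, trend + season + noise

time, series = make_series()
split_time = 1000
time_train, x_train = time[:split_time], series[:split_time]
time_valid, x_valid = time[split_time:], series[split_time:]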
First, let's use a simple moving average as a baseline, averaging over windows of 20 points:
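A minimal sketch of that baseline (reusing series and split_time from above; the course notebook's helper may differ):
def moving_average_forecast(series, window_size):
    """Mean of the previous window_size points, as the forecast for each step."""
    return np.array([series[t:t + window_size].mean()
                     for t in range(len(series) - window_size)])

# Keep only the forecasts that line up with the validation period
moving_avg = moving_average_forecast(series, 20)[split_time - 20:]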
We can compute the usual error metrics, MSE and MAE: MSE is the mean of the squared errors, while MAE is the mean of the absolute errors.
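In code, a sketch using tf.keras.metrics, here comparing the moving-average forecast above against the validation set:
import tensorflow as tf

# MSE: mean of squared errors; MAE: mean of absolute errors
print("MSE:", tf.keras.metrics.mean_squared_error(x_valid, moving_avg).numpy())
print("MAE:", tf.keras.metrics.mean_absolute_error(x_valid, moving_avg).numpy())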
MSE: 53.815857
MAE: 5.424043
So how do we bring in machine learning? As always, we need data and labels. The story is just like yesterday's "我是一隻小小小小鳥" example: the value at each time step serves as the label, and a fixed window of the preceding values serves as the features. TensorFlow provides functions that handle this directly:
import tensorflow as tf

def windowed_dataset(series, window_size, batch_size, shuffle_buffer):
    """Generates dataset windows

    Args:
      series (array of float) - contains the values of the time series
      window_size (int) - the number of time steps to include in the feature
      batch_size (int) - the batch size
      shuffle_buffer (int) - buffer size to use for the shuffle method

    Returns:
      dataset (TF Dataset) - TF Dataset containing time windows
    """
    # Generate a TF Dataset from the series values
    dataset = tf.data.Dataset.from_tensor_slices(series)

    # Window the data but only take those with the specified size
    dataset = dataset.window(window_size + 1, shift=1, drop_remainder=True)

    # Flatten the windows by putting its elements in a single batch
    dataset = dataset.flat_map(lambda window: window.batch(window_size + 1))

    # Create tuples with features and labels
    dataset = dataset.map(lambda window: (window[:-1], window[-1]))

    # Shuffle the windows
    dataset = dataset.shuffle(shuffle_buffer)

    # Create batches of windows
    dataset = dataset.batch(batch_size).prefetch(1)

    return dataset
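To see what the function produces, here's a quick sanity check on a toy series of 0 through 9 with window_size=3 (the output order varies because of the shuffle):
toy = windowed_dataset(np.arange(10, dtype="float32"),
                       window_size=3, batch_size=2, shuffle_buffer=5)
for x, y in toy.take(1):
    print("features:", x.numpy())  # e.g. [[3. 4. 5.] [0. 1. 2.]]
    print("labels:  ", y.numpy())  # e.g. [6. 3.]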
We window the data in groups of 20 points and start with the simplest possible model, a single Dense layer with one unit. Since the evaluation metrics are MSE and MAE, we set the loss to MSE:
# Parameters
window_size = 20
batch_size = 32
shuffle_buffer_size = 1000

# Generate the dataset windows
dataset = windowed_dataset(x_train, window_size, batch_size, shuffle_buffer_size)

# Build the single layer neural network
model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(1, input_shape=[window_size])
])

# Set the training parameters (track MAE; accuracy is meaningless for regression)
model.compile(loss="mse", optimizer="adam", metrics=["mae"])

# Train the model
model.fit(dataset, epochs=200)
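Since a one-unit Dense layer over a 20-point window is just linear regression, we can peek at what was learned (a quick check, not part of the course text):
# One weight per input time step, plus a single bias
weights, bias = model.layers[0].get_weights()
print("weights shape:", weights.shape)  # (20, 1)
print("bias:", bias)                    # (1,)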
After 200 epochs the loss comes out at 42.7771, and we can then use the trained model to forecast:
import numpy as np

# Initialize a list
forecast = []

# Use the model to predict data points per window size
for time in range(len(series) - window_size):
    forecast.append(model.predict(series[time:time + window_size][np.newaxis]))

# Slice the points that are aligned with the validation set
forecast = forecast[split_time - window_size:]

# Convert to a numpy array and drop single dimensional axes
results = np.array(forecast).squeeze()
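Calling model.predict once per window is slow. An equivalent but much faster variant (a sketch, assuming the same series and model as above) stacks all the windows and predicts them in a single call:
# Build a (num_windows, window_size) matrix of every input window at once
windows = np.stack([series[t:t + window_size]
                    for t in range(len(series) - window_size)])
results = model.predict(windows).squeeze()[split_time - window_size:]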
Measured against the validation set, this simplest network actually does worse than the plain moving average:
MSE: 75.38902
MAE: 6.348177
Let's try a different optimizer, SGD with a small learning rate and momentum:
# Set the training parameters
model.compile(loss="mse", optimizer=tf.keras.optimizers.SGD(learning_rate=1e-6, momentum=0.9))
MSE: 47.908802
MAE: 5.064415
This time both MSE and MAE come out lower, close to the results of the simple moving-average approach.