[Day 16] Model Regularization Methods (2): Dropout

2024 iThome Ironman Contest

DAY 16

AI/ ML & Data

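Getting Started with AI Projects: From Image Classification to Model Deployment, series post 16

16th Ironman Contest, python, image classification, deep learning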

Eunice

2024-09-29 02:32:20


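Introduction

Yesterday's post covered weight regularization, one way to avoid overfitting. Today's post covers another: Dropout, a widely used and effective model regularization method that is well suited to large models.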

Dropout

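As the name suggests, Dropout discards something: during training it randomly drops some of a layer's output features by setting them to zero. For example, if a layer outputs [0.5, 0.4, 1.9, 0.6, 2.1], Dropout might turn it into [0, 0.4, 0, 0.6, 2.1], randomly "dropping" two of the values by zeroing them. The Dropout Rate here is 0.4 (2 of the 5 values are dropped). Strictly speaking, the rate is the probability that each value is dropped, so the number actually zeroed varies from step to step. Keras also scales the surviving values by 1 / (1 - rate) during training so that the expected sum of the outputs stays the same, and it drops nothing at inference time. The Dropout Rate is usually set between 0.2 and 0.5.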
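The behaviour described above can be sketched in plain Python. This is only an illustration of the idea, not how Keras implements it: the `dropout` helper and its `rng` parameter are made-up names, and the rescaling by 1 / (1 - rate) follows the "inverted dropout" convention that the Keras Dropout layer documents.

```python
import random

def dropout(values, rate, training=True, rng=None):
    """Inverted dropout on a list of floats (illustrative helper, not a Keras API)."""
    if not training or rate == 0.0:
        # At inference time nothing is dropped and nothing is rescaled.
        return list(values)
    rng = rng or random.Random()
    keep_scale = 1.0 / (1.0 - rate)  # assumes rate < 1
    # Each value is dropped independently with probability `rate`;
    # survivors are scaled up so the expected output matches the input.
    return [0.0 if rng.random() < rate else v * keep_scale for v in values]

layer_output = [0.5, 0.4, 1.9, 0.6, 2.1]
rng = random.Random(0)  # fixed seed only so the run is repeatable

train_out = dropout(layer_output, rate=0.4, training=True, rng=rng)
infer_out = dropout(layer_output, rate=0.4, training=False)

print("training :", train_out)   # some entries zeroed, the rest scaled by 1 / 0.6
print("inference:", infer_out)   # identical to layer_output
```

Because every value is dropped independently, a single run on five values will not always zero exactly two of them; the 0.4 rate only holds on average over many values.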

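Usage

Use layers.Dropout() with the Dropout Rate as its argument, and place it directly after the layer whose output you want to apply Dropout to.

For example, add Dropout after each fully connected layer of VGG16, with the Dropout Rate set to 0.3.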

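Import the Dropout layer, along with the other names the snippets below use:

from tensorflow.keras import Sequential, layers
from tensorflow.keras.layers import Dense, Dropout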

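Functional API version:

# ... (omitted)
x = layers.Dense(4096, activation="relu")(x)
x = layers.Dropout(0.3)(x)  # each output is zeroed with probability 0.3 during training
x = layers.Dense(4096, activation="relu")(x)
x = layers.Dropout(0.3)(x)
# Output units = number of classes (5 here; the Sequential examples below use 2).
outputs = layers.Dense(5, activation="softmax")(x)
# ... (omitted)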

Sequential model version:

# ... (omitted)
model.add(Dense(units=4096, activation="relu"))
model.add(Dropout(0.3))  # applied to the output of the Dense layer above
model.add(Dense(units=4096, activation="relu"))
model.add(Dropout(0.3))
model.add(Dense(units=2, activation="softmax"))
# ... (omitted)

Another way to write the Sequential model:

model = Sequential([
    # ... (omitted)
    Dense(units=4096, activation="relu"),
    Dropout(0.3),
    Dense(units=4096, activation="relu"),
    Dropout(0.3),
    Dense(units=2, activation="softmax"),
])

Calling model.summary() shows the added Dropout layers:

... (omitted)
flatten_2 (Flatten)          (None, 32768)             0         
_________________________________________________________________
dense_4 (Dense)              (None, 4096)              134221824 
_________________________________________________________________
dropout_2 (Dropout)          (None, 4096)              0         
_________________________________________________________________
dense_5 (Dense)              (None, 4096)              16781312  
_________________________________________________________________
dropout_3 (Dropout)          (None, 4096)              0         
_________________________________________________________________
dense_6 (Dense)              (None, 2)                 8194      
=================================================================
Total params: 165,726,018
Trainable params: 165,726,018
Non-trainable params: 0
_________________________________________________________________

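Dropout has no weights of its own, so adding it does not change the number of trainable parameters. It only affects the neurons' activations while the model is training. The rest of the architecture is unchanged too: the Dense layers keep the same shapes and parameter counts.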
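As a quick check of the parameter counts in the summary, the arithmetic can be reproduced by hand. A fully connected layer has inputs × units weights plus one bias per unit; `dense_params` below is a helper name made up for this sketch, and the layer names and shapes are copied from the summary output.

```python
def dense_params(inputs, units):
    """Weights plus biases of a fully connected layer."""
    return inputs * units + units

# (layer name, parameter count) in the order the summary lists them.
layers_in_summary = [
    ("dense_4", dense_params(32768, 4096)),   # fed by the 32768-wide Flatten output
    ("dropout_2", 0),                         # Dropout has no weights
    ("dense_5", dense_params(4096, 4096)),
    ("dropout_3", 0),
    ("dense_6", dense_params(4096, 2)),
]

for name, count in layers_in_summary:
    print(f"{name}: {count:,}")

head_total = sum(count for _, count in layers_in_summary)
print(f"listed layers total: {head_total:,}")  # 151,011,330
```

The three Dense rows come out to 134,221,824, 16,781,312 and 8,194, matching the summary, and the two Dropout rows contribute nothing. The remaining 14,714,688 of the 165,726,018 total belong to the layers omitted from the summary excerpt.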

Tomorrow's post starts on the Callbacks module, which is very handy for monitoring the training process.