[Day 09] Life with VGG Net, Starting from tensorflow.keras (Season 2)

13th Ironman Contest vgg tensorflow keras python

佑佑來了

2021-09-24 00:19:02

2. VGG implementation (tensorflow)

2.1 What do we "import"?

import itertools
from sklearn.metrics import confusion_matrix, classification_report
from tensorflow.keras.applications import VGG16
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.utils import to_categorical
import tensorflow as tf
import pandas as pd
import numpy as np
import os
import matplotlib.pyplot as plt

2.2 自定義函數

This one reads fer2013.csv:

def prepare_data(data):
    # One (48, 48, 1) slot per row, and one integer emotion label per row.
    image_array = np.zeros(shape=(len(data), 48, 48, 1))
    image_label = np.array(list(map(int, data['emotion'])))

    for i, row in enumerate(data.index):
        # 'pixels' holds 48*48 space-separated gray levels.
        image = np.fromstring(data.loc[row, 'pixels'], dtype=int, sep=' ')
        image = np.reshape(image, (48, 48, 1))  # grayscale images have 1 channel
        image_array[i] = image

    return image_array, image_label
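
To make the parsing step concrete, here is a minimal sketch on one synthetic row. The pixel string is made up for illustration (real rows come from fer2013.csv), and parse_pixels is a hypothetical helper, not part of the code above. It uses str.split() instead of np.fromstring; for well-formed rows both approaches should give the same values.

```python
import numpy as np

def parse_pixels(pixel_string, size=48):
    """Turn a space-separated pixel string into a (size, size, 1) array.

    Hypothetical helper for illustration only.
    """
    values = np.array(pixel_string.split(), dtype=np.float64)
    if values.size != size * size:
        raise ValueError(f"expected {size * size} pixels, got {values.size}")
    return values.reshape(size, size, 1)

# Synthetic row: 48*48 = 2304 gray levels cycling through 0..255.
fake_row = " ".join(str(i % 256) for i in range(48 * 48))
image = parse_pixels(fake_row)
print(image.shape)  # (48, 48, 1)
```

The explicit size check is a design choice: a truncated row fails loudly here instead of surfacing later as a confusing reshape error.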

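The input layer of the pretrained VGG model expects 3-channel images, so I simply copy the array in the first channel into the second and third channels. The function also divides by 255 so the pixel values end up in the range 0 to 1.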

def convert_to_3_channels(img_arrays):
    sample_size, nrows, ncols, c = img_arrays.shape
    img_stack_arrays = np.zeros((sample_size, nrows, ncols, 3))
    for i in range(sample_size):
        gray = img_arrays[i][:, :, 0]
        # Copy the single channel three times, then scale 0..255 to 0..1.
        img_stack = np.stack([gray, gray, gray], axis=-1)
        img_stack_arrays[i] = img_stack / 255
    return img_stack_arrays
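
The same channel copy can probably be done without a Python loop. The sketch below is a hypothetical vectorized alternative (convert_to_3_channels_vectorized is my name for it, not something from the original code); it assumes the input has shape (n, rows, cols, 1) with pixel values in 0..255, and the random batch is made-up data.

```python
import numpy as np

def convert_to_3_channels_vectorized(img_arrays):
    """Repeat the single gray channel 3 times and scale to [0, 1].

    Hypothetical alternative for illustration; assumes shape (n, rows, cols, 1).
    """
    return np.repeat(img_arrays, 3, axis=-1) / 255

# Made-up batch of 4 grayscale "images".
rng = np.random.default_rng(0)
batch = rng.integers(0, 256, size=(4, 48, 48, 1)).astype(np.float64)
rgb = convert_to_3_channels_vectorized(batch)
print(rgb.shape)  # (4, 48, 48, 3)
```

On the full training set this avoids one np.stack call per image, which should be noticeably faster, though I have not benchmarked it.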

Build the pretrained model. The first run takes a while because the ImageNet weights have to be downloaded.

def build_model(preModel=VGG16, num_classes=7):
    # include_top=False drops VGG16's own classifier; pooling='max' applies
    # global max pooling so the base outputs one feature vector per image.
    pred_model = preModel(include_top=False, weights='imagenet',
                          input_shape=(48, 48, 3),
                          pooling='max')
    # New classification head: one softmax unit per emotion class.
    output_layer = Dense(
        num_classes, activation="softmax", name="output_layer")

    model = tf.keras.Model(
        pred_model.inputs, output_layer(pred_model.output))

    model.compile(optimizer=tf.keras.optimizers.Adam(),
                  loss=tf.keras.losses.CategoricalCrossentropy(), metrics=['accuracy'])

    return model
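
A quick back-of-the-envelope check of what this model looks like at 48x48 input. The sketch assumes the standard VGG16 layout (five 2x2, stride-2 max-pooling layers and 512 channels in the last convolutional block); it only does arithmetic and does not load Keras.

```python
# Spatial size after each of VGG16's five 2x2, stride-2 max-pooling layers.
size = 48
sizes = [size]
for _ in range(5):
    size //= 2  # pooling without padding floors odd sizes, e.g. 3 -> 1
    sizes.append(size)
print(sizes)  # [48, 24, 12, 6, 3, 1]

features = 512   # channels in VGG16's last conv block (assumed standard layout)
num_classes = 7
# Dense head: one weight per (feature, class) pair plus one bias per class.
head_params = features * num_classes + num_classes
print(head_params)  # 3591
```

So the feature map has already shrunk to 1x1 before the global max pooling, and the new output layer adds only a few thousand parameters on top of the pretrained base.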

2.3 Loading the data and splitting it into training and validation sets

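Note that we need to_categorical() to convert the labels (y) into one-hot encoding. The CategoricalCrossentropy loss used in build_model() expects one-hot targets, so without this step the classification training will not work.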

df_raw = pd.read_csv("D:/mycodes/AIFER/data/fer2013.csv")
# Split the data by the 'Usage' column (training and validation)
df_train = df_raw[df_raw['Usage'] == 'Training']
df_val = df_raw[df_raw['Usage'] == 'PublicTest']

X_train, y_train = prepare_data(df_train)
X_val, y_val = prepare_data(df_val)

X_train = convert_to_3_channels(X_train)
X_val = convert_to_3_channels(X_val)

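# Fix num_classes so both sets get 7 columns even if a class is missing from one of them.
y_train_oh = to_categorical(y_train, num_classes=7)
y_val_oh = to_categorical(y_val, num_classes=7)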
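
If one-hot encoding is new to you, the NumPy sketch below shows roughly what the conversion produces for integer labels. one_hot is a hypothetical stand-in written for illustration, not the Keras implementation, and the labels are made up.

```python
import numpy as np

def one_hot(labels, num_classes):
    """NumPy sketch of one-hot encoding for integer class labels."""
    labels = np.asarray(labels, dtype=int)
    # Row k of the identity matrix is the one-hot vector for class k.
    return np.eye(num_classes, dtype=np.float32)[labels]

y = np.array([0, 3, 6])  # made-up labels
y_oh = one_hot(y, num_classes=7)
print(y_oh.shape)  # (3, 7)
print(y_oh[1])     # [0. 0. 0. 1. 0. 0. 0.]
```

Each row has a single 1 at the index of its class, which is the shape of target the softmax output layer is compared against.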

[Figure: example facial expression (angry class)]

2.4 Checking that the model's input and output match the data format

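If this step raises an error, go back and check prepare_data() for bugs.
Here we feed in the first sample.
The output shape (1, 7) means there is 1 sample and its output vector has length 7.
Example: np.array([0.1, 0.2, 0.5, 0.1, 0.1, 0.0, 0.0])
The number at each position is the probability that the sample belongs to the corresponding expression class, and because the output layer is a softmax the values sum to 1.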

model_vgg16 = build_model()
prob_vgg16 = model_vgg16(X_train[:1]).numpy()
print(prob_vgg16.shape)

#Output: (1, 7)
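
Here is a small sketch of how such a probability vector can be turned into a predicted label. The scores are made up, softmax is reimplemented in NumPy so the snippet runs without a model, and the EMOTIONS list is the label order commonly documented for FER2013, which is an assumption on my part and not something defined earlier in this post.

```python
import numpy as np

# Label order commonly documented for FER2013 (assumption, verify for your copy).
EMOTIONS = ["angry", "disgust", "fear", "happy", "sad", "surprise", "neutral"]

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

logits = np.array([[0.2, -1.0, 0.5, 2.0, 0.1, -0.3, 0.0]])  # made-up scores
probs = softmax(logits)
predicted = EMOTIONS[int(probs.argmax(axis=-1)[0])]
print(probs.shape)  # (1, 7)
print(predicted)    # happy
```

With a real model you would skip the softmax helper and take the argmax of prob_vgg16 directly, since the output layer already applies softmax.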

2.5 Start training

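Next I will be running a lot of experiments on different model architectures (resnet, mobilenet, and so on),
so I fix epochs = 30 and batch_size = 32 for all of them to keep the comparison fair.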

epochs = 30
batch_size = 32

hist1 = model_vgg16.fit(X_train, y_train_oh, validation_data=(X_val, y_val_oh),
                        epochs=epochs, batch_size=batch_size)
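
To get a feel for how much work these settings imply, the sketch below counts the weight updates. The training-set size of 28,709 is the figure usually quoted for the FER2013 'Training' split and is an assumption here; replace it with len(X_train) for your own copy of the data.

```python
import math

n_train = 28_709  # assumed size of the FER2013 'Training' split
batch_size = 32
epochs = 30

# The last batch of each epoch is smaller, hence the ceiling.
steps_per_epoch = math.ceil(n_train / batch_size)
total_updates = steps_per_epoch * epochs
print(steps_per_epoch)  # 898
print(total_updates)    # 26940
```

Keeping both numbers fixed across architectures means every model sees the same number of gradient steps, which is what makes the later comparison fair.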