Today's example is an interesting one: we train on computer-generated images and then classify real photos. The figure below shows four computer-generated images each of horses and humans from the dataset; let's practice telling whether an image shows a human or a horse:
The code is as follows. Since this is binary classification, the activation of the final layer is changed to 'sigmoid'. In addition, the loss in compile is set to 'binary_crossentropy', and the recommended optimizer is RMSprop(learning_rate=0.001):
import tensorflow as tf
model = tf.keras.models.Sequential([
    # Note the input shape is the desired size of the image 300x300 with 3 bytes color
    # This is the first convolution
    tf.keras.layers.Conv2D(16, (3, 3), activation='relu', input_shape=(300, 300, 3)),
    tf.keras.layers.MaxPooling2D(2, 2),
    # The second convolution
    tf.keras.layers.Conv2D(32, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    # The third convolution
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    # The fourth convolution
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    # The fifth convolution
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    # Flatten the results to feed into a DNN
    tf.keras.layers.Flatten(),
    # 512 neuron hidden layer
    tf.keras.layers.Dense(512, activation='relu'),
    # Only 1 output neuron. It will contain a value from 0-1 where 0 is for one class ('horses') and 1 for the other ('humans')
    tf.keras.layers.Dense(1, activation='sigmoid')
])
#sigmoid makes the output value a single scalar between 0 and 1, encoding the probability that the current image is class 1 (as opposed to class 0).
from tensorflow.keras.optimizers import RMSprop
model.compile(loss='binary_crossentropy',
              optimizer=RMSprop(learning_rate=0.001),
              metrics=['accuracy'])
#Train the model with the binary_crossentropy loss because it's a binary classification problem, and the final activation is a sigmoid
#In this case, using the RMSprop optimization algorithm is preferable to stochastic gradient descent (SGD), because RMSprop automates learning-rate tuning for us.
#(Other optimizers, such as Adam and Adagrad, also automatically adapt the learning rate during training, and would work equally well here.)
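Before loading any data, it can help to check how the feature maps shrink through the five convolution and pooling stages. A model.summary() call prints the output shape and parameter count of every layer; with a 300x300 input and the default 'valid' padding, the last pooling layer should come out to 7x7x64, so the Flatten layer hands 7*7*64 = 3136 values to the 512-neuron dense layer:
# Print each layer's output shape and parameter count.
# The fifth pooling layer should report (None, 7, 7, 64), which flattens to 3136 values.
model.summary()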
We use ImageDataGenerator from tensorflow.keras.preprocessing.image to load the images from the dataset folder, resize them to a uniform size, divide the pixel values by 255 to normalize them, and label each image by class, turning everything into the format the training step expects:
from tensorflow.keras.preprocessing.image import ImageDataGenerator
# All images will be rescaled by 1./255
train_datagen = ImageDataGenerator(rescale=1/255)
# Flow training images in batches of 128 using train_datagen generator
train_generator = train_datagen.flow_from_directory(
    './horse-or-human/',     # This is the source directory for training images
    target_size=(300, 300),  # All images will be resized to 300x300
    batch_size=128,
    # Since we use binary_crossentropy loss, we need binary labels
    class_mode='binary')
#A DirectoryIterator yielding tuples of (x, y) where x is a numpy array containing a batch of images with shape (batch_size, *target_size, channels) and y is a numpy array of corresponding labels.
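As a quick sanity check, we can pull one batch out of the generator and look at its shapes and at the class-to-index mapping. A minimal sketch, assuming the train_generator defined above (the folder names 'horses'/'humans' are only an illustration; flow_from_directory assigns indices from whatever subfolders it finds):
# Peek at one batch produced by the generator
x_batch, y_batch = next(train_generator)
print(x_batch.shape)                  # (128, 300, 300, 3) with the batch_size above
print(y_batch.shape)                  # (128,) binary labels, 0. or 1.
print(train_generator.class_indices)  # e.g. {'horses': 0, 'humans': 1}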
Next comes the training itself. Once training is done, we can grab a few real photos to try out; they need the same resizing and normalization:
history = model.fit(
    train_generator,
    steps_per_epoch=8,
    epochs=15,
    verbose=1)
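To see whether the loss really falls below 0.1, we can plot the per-epoch values stored in history.history. A minimal sketch, assuming matplotlib is installed:
import matplotlib.pyplot as plt
# history.history keeps one value per epoch for the loss and for every compiled metric
plt.plot(history.history['loss'], label='loss')
plt.plot(history.history['accuracy'], label='accuracy')
plt.xlabel('epoch')
plt.legend()
plt.show()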
import os
import numpy as np
from tensorflow.keras.preprocessing import image
# Predicting images: load a real photo, then resize and normalize it exactly like the training data
test_image_file = "horse-gac58efbe9_640.jpg"
path = os.path.join('./test_image/', test_image_file)
img = image.load_img(path, target_size=(300, 300))  # resize to 300x300
x = image.img_to_array(img)
x /= 255                        # normalize pixel values to [0, 1]
x = np.expand_dims(x, axis=0)   # add the batch dimension
images = np.vstack([x])
classes = model.predict(images, batch_size=10)
print(classes[0])
if classes[0] > 0.5:
    print("This is a human")
else:
    print("This is a horse")
Although the training loss is already below 0.1, consider the following examples. The images shown are exactly what gets fed into the model, after resizing and normalization; the two columns on the left are predictions the model got right, and the rightmost column shows the ones it got wrong:
I should admit, though, that I picked the test images deliberately, looking for the kind I think my kindergarten-age self could not have recognized either. Even so, every image I chose is simple and clean: a single, complete human or horse, no multiples, no props, and certainly never a human and a horse in the same picture. So the model still lacks some of the judgment a grown-up would be expected to have.