Day 14: Pretrained Models (Keras Applications)

12th iThome Ironman Contest

AI & Data

Easily Mastering Keras and Related Applications, part 14 of the series

I code so I am

2020-09-14 20:20:13

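Preface

So far we have built every model ourselves. Keras also ships a collection of pretrained models (Keras Applications). They are all image recognition models, most of them winners or runners-up of past ImageNet Large Scale Visual Recognition Challenge (ILSVRC) competitions. Each one has many layers and parameters and was trained on servers far more powerful than anything I have access to, so ordinary users can apply them to recognition tasks directly.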

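There are three ways to apply them:

Use the full model: recognizes the 1000 categories defined by ImageNet.
Use part of the model: extract features only, without classification.
Use part of the model and attach your own input and classification layers: recognizes things outside the 1000 categories.

Let's see how each of these is put into practice.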

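Models provided by Keras

I had not checked the official site for a few months. Compared with three years ago, Keras now provides many more models. The following is the current list from the official site: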

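The table columns are:

Size: size of the weights file.
Top-1 Accuracy: how often the model's single highest-scoring prediction is the correct class.
Top-5 Accuracy: how often the correct class is among the model's five highest-scoring predictions.
Parameters: number of model parameters (weights and biases).
Depth: number of layers in the model.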
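
To make the difference between the two accuracy columns concrete, here is a minimal NumPy sketch of how a top-k accuracy could be computed. The function name top_k_accuracy and the score matrix are made up for illustration; the official figures are measured on the ImageNet validation set, not on toy data like this.

```python
import numpy as np

def top_k_accuracy(probs, labels, k):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    # Indices of the k largest scores in each row (their order does not matter here).
    top_k = np.argsort(probs, axis=1)[:, -k:]
    hits = (top_k == labels[:, None]).any(axis=1)
    return hits.mean()

# Made-up scores for 4 samples over 6 classes (illustrative only).
probs = np.array([
    [0.50, 0.20, 0.10, 0.10, 0.05, 0.05],  # true class 0: ranked 1st
    [0.10, 0.40, 0.30, 0.10, 0.05, 0.05],  # true class 2: ranked 2nd
    [0.30, 0.25, 0.20, 0.15, 0.06, 0.04],  # true class 5: ranked last
    [0.05, 0.05, 0.10, 0.60, 0.15, 0.05],  # true class 3: ranked 1st
])
labels = np.array([0, 2, 5, 3])

top1 = top_k_accuracy(probs, labels, k=1)
top5 = top_k_accuracy(probs, labels, k=5)
print(top1, top5)  # 0.5 0.75
```

The second sample counts as a miss for Top-1 but as a hit for Top-5, which is why Top-5 accuracy is never lower than Top-1 accuracy.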

Using the full model

First, let's test a snippet from the official site. The file is 14_01_Keras_applications1.ipynb.

from tensorflow.keras.applications.resnet50 import ResNet50
from tensorflow.keras.preprocessing import image
from tensorflow.keras.applications.resnet50 import preprocess_input, decode_predictions
import numpy as np

# Pretrained model: ResNet50
model = ResNet50(weights='imagenet')

# Any image will do, for example an elephant
img_path = './images/elephant.jpg'
# Load the image and resize it to (224, 224)
img = image.load_img(img_path, target_size=(224, 224))
# Convert to an array of shape (224, 224, 3); the last axis is the color channel
x = image.img_to_array(img)
# Add a batch axis, giving shape (1, 224, 224, 3)
x = np.expand_dims(x, axis=0)
# Preprocess: convert RGB to BGR and subtract the ImageNet mean of each color channel
x = preprocess_input(x)

# Predict
preds = model.predict(x)
# decode the results into a list of tuples (class, description, probability)
# Show the top 3 predictions
print('Predicted:', decode_predictions(preds, top=3)[0])

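Running the program downloads the ResNet50 model, which by default is stored in the .keras/models/ subdirectory of the user's home directory (C:\Users\<login_user>). The top 3 predictions are: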

Indian_elephant, 0.8198577
African_elephant, 0.117787644
tusker, 0.058297537

The previous image was a side view. Switching to a front view, the top 3 predictions are:

tusker, 0.79292905
African_elephant, 0.17253475
Indian_elephant, 0.03453612

Impressive: it can even tell the kinds of elephant apart.
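
The preprocess_input step in the ResNet50 example above is easy to treat as a black box. As far as I understand it, for ResNet50 (and VGG16) it converts the image from RGB to BGR and subtracts the ImageNet mean of each color channel, without rescaling. The NumPy sketch below imitates that behavior; caffe_style_preprocess is a hypothetical helper written for illustration, not the library function, and the dummy input stands in for a real image.

```python
import numpy as np

# ImageNet per-channel means in BGR order (assumed to match the library's values).
IMAGENET_BGR_MEAN = np.array([103.939, 116.779, 123.68], dtype=np.float32)

def caffe_style_preprocess(batch_rgb):
    """Sketch: RGB -> BGR, then subtract the per-channel mean (no rescaling)."""
    batch = np.asarray(batch_rgb, dtype=np.float32)
    bgr = batch[..., ::-1]          # reverse the channel axis
    return bgr - IMAGENET_BGR_MEAN  # broadcasts over (batch, height, width)

# A dummy all-white batch standing in for the output of np.expand_dims(...).
x = np.full((1, 224, 224, 3), 255.0, dtype=np.float32)
y = caffe_style_preprocess(x)
print(y.shape, y[0, 0, 0])
```

The shape is unchanged; only the channel order and the value range differ, so the step should be applied exactly once before calling model.predict.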

Partial use: extracting features

You can also use only part of a model to extract features. The code is as follows; the file is 14_02_Keras_applications2.ipynb:

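from tensorflow.keras.applications.vgg16 import VGG16
from tensorflow.keras.preprocessing import image
from tensorflow.keras.applications.vgg16 import preprocess_input
import numpy as np

# Pretrained model: VGG16, without the top (the classification layers)
model = VGG16(weights='imagenet', include_top=False)

# Any image will do, for example an elephant
img_path = './images/elephant.jpg'
# Load the image and resize it to (224, 224)
img = image.load_img(img_path, target_size=(224, 224))
# Convert to an array of shape (224, 224, 3); the last axis is the color channel
x = image.img_to_array(img)
# Add a batch axis, giving shape (1, 224, 224, 3)
x = np.expand_dims(x, axis=0)
# Preprocess: convert RGB to BGR and subtract the ImageNet mean of each color channel
x = preprocess_input(x)

# Run the convolutional layers only and print the extracted features
features = model.predict(x)
print(features)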

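Additional notes:

In model = VGG16(weights='imagenet', include_top=False), include_top=False means the fully connected classification layers at the top of the network are left out.
Use model.summary() to compare against include_top=True. Four layers are missing: one Flatten layer and three Dense layers.
The resulting features have shape (1, 7, 7, 512), because the last layer's feature map has width and height (7, 7) and 512 output channels.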
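
The (1, 7, 7, 512) shape can be checked by hand. The sketch below assumes that VGG16's convolutional base contains five pooling layers that each halve the width and height, and that the last convolutional block has 512 filters; vgg16_feature_shape is a hypothetical helper, not part of Keras.

```python
def vgg16_feature_shape(height, width, n_pools=5, channels=512):
    """Feature shape for one image, assuming each of the n_pools pooling
    layers halves height and width (floor division)."""
    for _ in range(n_pools):
        height //= 2
        width //= 2
    return (1, height, width, channels)

shape = vgg16_feature_shape(224, 224)
# Length of the vector obtained by flattening one image's feature map.
flat_len = shape[1] * shape[2] * shape[3]
print(shape, flat_len)  # (1, 7, 7, 512) 25088
```

So 224 shrinks to 7 after five halvings, and flattening the feature map of one image yields a vector of 25088 values.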

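Comparing similarity with feature vectors

Going a step further, we can use the feature vectors obtained above to compare how similar the image files are. The procedure is as follows: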

from os import listdir
from os.path import isfile, join

# Read all jpg image files in the directory
img_path = './images/'
image_files = np.array([f for f in listdir(img_path) if isfile(join(img_path, f)) and f[-3:] == 'jpg'])
image_files

Run each image through the model to obtain its feature vector.

import numpy as np

# Collect the pixel arrays of all image files
batches = []
for f in image_files:
    image_file = join(img_path, f)
    # Load the image and resize it to (224, 224)
    img = image.load_img(image_file, target_size=(224, 224))
    # Convert to an array and add a batch axis, giving shape (1, 224, 224, 3)
    img2 = image.img_to_array(img)
    img2 = np.expand_dims(img2, axis=0)
    batches.append(img2)

# Merge into one array of shape (number of images, 224, 224, 3)
X = np.concatenate(batches, axis=0)
X = preprocess_input(X)

# Predict
features = model.predict(X)

features.shape, X.shape

Compare the feature vectors with cosine_similarity.

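# Compare feature vectors with cosine_similarity
from sklearn.metrics.pairwise import cosine_similarity

# Flatten each image's features into one row vector
features2 = features.reshape((features.shape[0], -1))

# Query image: Tiger3.jpg
no=-2
print(image_files[no])
# Similarity between the query and every image before it in the list
similar_list = cosine_similarity(features2[no:no+1], features2[:no], dense_output=False)
# File names sorted from most to least similar
image_files[:no][np.argsort(similar_list[0])[::-1]]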

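Comparing the feature vector of Tiger3.jpg with those of the other images and sorting from most to least similar gives the following. The result is correct: the other tiger images rank first.
'Tiger.jpg', 'Tiger2.jpg', 'style.jpg', 'elephant.jpg', 'elephant2.jpg', 'input.jpg', 'bird01.jpg'
Repeating the comparison for another image, elephant.jpg, and sorting the same way gives the following. This result is also correct: elephant2.jpg ranks first.
'elephant2.jpg', 'Tiger2.jpg', 'Tiger.jpg', 'Tiger3.jpg', 'style.jpg', '太陽花.jpg', 'input.jpg'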
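
For readers who want to see what the ranking does under the hood, here is a minimal NumPy sketch of the same idea: normalize the vectors, take dot products, and sort in descending order. The helper rank_by_cosine, the file names, and the 3-dimensional toy vectors are all made up for illustration; real VGG16 features would be much longer vectors.

```python
import numpy as np

def rank_by_cosine(query, candidates, names):
    """Return names sorted from most to least similar to the query vector."""
    q = query / np.linalg.norm(query)
    c = candidates / np.linalg.norm(candidates, axis=1, keepdims=True)
    scores = c @ q                    # cosine similarity of each row with the query
    order = np.argsort(scores)[::-1]  # indices from highest to lowest score
    return [names[i] for i in order], scores[order]

# Toy "feature vectors" with hypothetical file names.
names = ['a.jpg', 'b.jpg', 'c.jpg']
candidates = np.array([
    [1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
])
query = np.array([2.0, 2.0, 0.0])

ranked, scores = rank_by_cosine(query, candidates, names)
print(ranked)  # ['b.jpg', 'a.jpg', 'c.jpg']
```

The query is twice as long as the vector for b.jpg yet scores a similarity of 1.0 against it, because cosine similarity compares direction only and ignores magnitude.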