[Night 7] Face Detection: Implementing MTCNN, Part 1: A Quick Try with OpenCV

15th Ironman Contest

zivzhong

2023-09-22 23:54:29


Introduction

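Welcome back to the series! Now that MTCNN has had its basic introduction, it's time to cover how to implement and use it. Normally this is where a PyTorch implementation would follow, and if you have met deep learning elsewhere, the word "implementation" probably makes you think:

1. Oh no, is it time to set up an environment? QQ My computer isn't ready for that yet QQQ
2. Can I try out what this MTCNN AI actually does first? If I jump straight into setting up an environment, I won't really know what I'm doing.
3. Are PyTorch/TensorFlow really the only way to try AI? Can't I try it with a more approachable library?

No worries! I understand the feeling. So before getting into the PyTorch implementation, we have prepared a short hands-on segment, and there is no need to worry about PyTorch or anything else that sounds heavyweight: let's use the tools in OpenCV to see what MTCNN can do!
(Yes! Some of you have already noticed that last night we set the stage for running AI with OpenCV!)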

Implementation

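This segment is mainly about letting everyone experience what MTCNN can do, so tonight we will skip training and start purely from usage!
For the model weights, we will simply use ones that someone else has already trained and shared online. Please download the whole folder from this link and confirm that it contains the three model weight files pnet.onnx, rnet.onnx, and onet.onnx!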
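Before going further, it can help to confirm that the download is complete. The sketch below only checks a folder for the three file names listed above; the helper name `missing_models` and the folder name `mtcnn_models` are placeholders of mine, not something that comes with the download.

```python
from pathlib import Path

# The three weight files this article expects
REQUIRED_MODELS = ("pnet.onnx", "rnet.onnx", "onet.onnx")


def missing_models(folder):
    """Return the required model files that are not present in `folder`."""
    folder = Path(folder)
    return [name for name in REQUIRED_MODELS if not (folder / name).is_file()]


if __name__ == "__main__":
    # 'mtcnn_models' is a placeholder; point it at the folder you downloaded
    missing = missing_models("mtcnn_models")
    print("All models found!" if not missing else f"Missing: {missing}")
```

If the list comes back empty, all three weight files are where the program will look for them.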

Installing the Libraries

pip install opencv-python
pip install numpy
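To double-check that both installs worked, something like the following can be run first. It is only a small sketch using the standard library; `missing_packages` is a name I made up for illustration. Note that the opencv-python package is imported under the module name `cv2`.

```python
import importlib.util


def missing_packages(module_names):
    """Return the modules from `module_names` that cannot be found."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # opencv-python is imported as `cv2`, numpy as `numpy`
    missing = missing_packages(["cv2", "numpy"])
    print("Ready to go!" if not missing else f"Still missing: {missing}")
```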

Code

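The complete program is shown below. It first loads the pretrained MTCNN models (P-Net, R-Net, and O-Net) and then loads the test image. Next, it rescales the image to suit P-Net's 12x12 input and runs P-Net to generate candidate boxes. It then uses non-maximum suppression (NMS) to remove overlapping candidates and draws the detections. Once P-Net is done, the candidates are passed on to R-Net and then O-Net in turn.

Also, remember to replace the model weight paths in the program ('path_to_pnet.onnx' and so on) with the paths where you actually saved the models (e.g. /home/user/pnet.onnx), and replace 'test_image.jpg' with the photo you want to test. The program below lets you see how the MTCNN models detect faces in the test photo and draw the results.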

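import cv2
import numpy as np

# Load the pretrained MTCNN models (P-Net, R-Net and O-Net)
pnet = cv2.dnn.readNetFromONNX('path_to_pnet.onnx')
rnet = cv2.dnn.readNetFromONNX('path_to_rnet.onnx')
onet = cv2.dnn.readNetFromONNX('path_to_onet.onnx')

# Load the photo to test
image = cv2.imread('test_image.jpg')
if image is None:
    raise FileNotFoundError('Could not read test_image.jpg')
height, width = image.shape[:2]
result = image.copy()  # draw on a copy so that later crops stay clean


def face_scores(net, img):
    """Run one network on `img` and return its face / non-face output."""
    # Assumption: the usual MTCNN normalization, (pixel - 127.5) / 128
    blob = cv2.dnn.blobFromImage(img, scalefactor=1 / 128.0,
                                 mean=(127.5, 127.5, 127.5), swapRB=True)
    net.setInput(blob)
    outputs = net.forward(net.getUnconnectedOutLayersNames())
    # Assumption: the classification output is the one with 2 channels and
    # channel 1 holds the face probability; adjust this for your ONNX export
    return next(o for o in outputs if o.ndim >= 2 and o.shape[1] == 2)


def refine(net, candidates, size):
    """Keep only the candidates that `net` also classifies as a face."""
    kept = []
    for x1, y1, x2, y2, _ in candidates:
        # Crop the candidate box, clipped to the image borders
        left, top = max(int(x1), 0), max(int(y1), 0)
        right, bottom = min(int(x2), width), min(int(y2), height)
        if right <= left or bottom <= top:
            continue
        face_roi = image[top:bottom, left:right]
        # Resize the crop to the network's input size
        face_resized = cv2.resize(face_roi, (size, size))
        prob = float(face_scores(net, face_resized).reshape(-1)[1])
        # If the network detects a face, keep the candidate box
        if prob > 0.5:
            kept.append([x1, y1, x2, y2, prob])
    return kept


# Define the P-Net input size and the stride of its output map
pnet_input_size = 12
pnet_stride = 2
min_face_size = 40    # assumption: smallest face (in pixels) to look for
pyramid_factor = 0.709  # assumption: a commonly used pyramid step
detections = []

# Rescale the image step by step so that the 12x12 window of P-Net covers
# faces of different sizes.
# Assumption: the P-Net export accepts inputs larger than 12x12.
scale = pnet_input_size / min_face_size
while min(height, width) * scale >= pnet_input_size:
    new_size = (int(np.ceil(width * scale)), int(np.ceil(height * scale)))
    image_resized = cv2.resize(image, new_size)

    # Run face detection with P-Net
    probs = face_scores(pnet, image_resized)[0, 1]

    # Turn every confident output cell into a candidate box (bounding box)
    # with its score; box regression is skipped to keep this demo short
    for row, col in zip(*np.where(probs > 0.5)):  # threshold filters weak candidates
        x1 = col * pnet_stride / scale
        y1 = row * pnet_stride / scale
        side = pnet_input_size / scale
        detections.append([x1, y1, x1 + side, y1 + side, float(probs[row, col])])
    scale *= pyramid_factor

# NMS (non-maximum suppression) to remove overlapping candidates.
# cv2.dnn.NMSBoxes expects boxes as [x, y, width, height] and one score per box.
boxes = [[int(x1), int(y1), int(x2 - x1), int(y2 - y1)] for x1, y1, x2, y2, _ in detections]
scores = [d[4] for d in detections]
keep = cv2.dnn.NMSBoxes(boxes, scores, 0.5, 0.4) if boxes else []
nms_detections = np.array(keep, dtype=int).reshape(-1)

# Draw the P-Net detections (blue)
for i in nms_detections:
    x1, y1, x2, y2, confidence = detections[i]
    cv2.rectangle(result, (int(x1), int(y1)), (int(x2), int(y2)), (255, 0, 0), 1)
    text = f'Face: {confidence:.2f}'
    cv2.putText(result, text, (int(x1), int(y1) - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (255, 0, 0), 1)

# Crop the candidates produced by P-Net and use them as input for R-Net
rnet_input_size = 24
rnet_detections = refine(rnet, [detections[i] for i in nms_detections], rnet_input_size)

# Draw the R-Net detections (yellow)
for x1, y1, x2, y2, confidence in rnet_detections:
    cv2.rectangle(result, (int(x1), int(y1)), (int(x2), int(y2)), (0, 255, 255), 2)

# Crop the candidates produced by R-Net and use them as input for O-Net
onet_input_size = 48
onet_detections = refine(onet, rnet_detections, onet_input_size)

# Draw the O-Net detections (green)
for x1, y1, x2, y2, confidence in onet_detections:
    cv2.rectangle(result, (int(x1), int(y1)), (int(x2), int(y2)), (0, 255, 0), 2)

# Show the resulting image
cv2.imshow('Face Detection', result)
cv2.waitKey(0)
cv2.destroyAllWindows()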
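
If the NMS step in the program above feels like a black box, the idea behind it can be sketched in a few lines of plain Python. This is only an illustration of greedy non-maximum suppression, not OpenCV's actual implementation; the function names `iou` and `nms` and the sample boxes are made up for this example, and boxes are written as (x1, y1, x2, y2).

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.4):
    """Greedy NMS: return the indices of the boxes to keep, best score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with a kept one
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Two heavily overlapping boxes and one box far away (made-up numbers)
boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # prints [0, 2]
```

The second box overlaps the first with an IoU of about 0.82, which is above the 0.4 threshold, so only the higher-scoring one survives; the third box overlaps nothing and is kept.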