【🚀人臉辨識打卡機實作 | 臉部關鍵點、CNN、ViT 應用】 - iT 邦幫忙::一起幫忙解決難題，拯救 IT 人的一天

2025 iThome 鐵人賽

DAY 3

生成式 AI

AI 情感偵測：從聲音到表情的多模態智能應用系列第 3 篇

【🚀人臉辨識打卡機實作 | 臉部關鍵點、CNN、ViT 應用】

17th鐵人賽

abc11032203

2025-09-17 12:12:36

408 瀏覽

分享至

「臉」是人類原生且自然的介面，現代科技能從面部辨識身分、分析情緒，舉例來說:臉辨識打卡機可應用在公司辦公室、圖書館的人臉辨識借書機、手機電腦生物辨識解鎖、醫療照護，這項技術目前已廣泛運用在生活中。

從小到大，我們透過臉來認人、讀情緒：
😏 嘴角上揚，你知道對方開心
😡 眉頭一皺，就懂得別再惹了

這次的目標就是結合 OpenCV 的 LBPH 人臉辨識與 FER 表情辨識模組，做出一個「能分辨你是誰，也能記錄當下情緒」的打卡系統!
換句話說——它不只知道你來上班，還知道你是笑著打卡，還是皺著眉頭進來的。

二、程式碼架構
系統主要分成四大模組：

人臉樣本蒐集

def collect_samples(frame, faces, target_name, collected):
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    for (x, y, w, h) in faces:
        face = cv2.resize(gray[y:y+h, x:x+w], (200, 200))
        cv2.imwrite(f"faces/{target_name}/{int(time.time()*1000)}.png", face)
        collected += 1
        break
    return collected

📌 功能：將新人的臉拍攝 20 張存檔，為訓練模型做準備。

LBPH 訓練與辨識

recognizer = cv2.face.LBPHFaceRecognizer_create()
recognizer.train(X, np.array(y))
recognizer.save("face_model.xml")

LBPH 會將臉部分割成小區塊，計算每區塊的局部二值模式，並轉換成直方圖，最後比對不同人臉間的差異。
✅ 優點是輕量快速，非常適合即時辨識與小型系統。

表情辨識（FER 模組）

from fer import FER
emo_detector = FER(mtcnn=True)
res = emo_detector.detect_emotions(frame)

📌 這裡的 FER 內部用 CNN 模型訓練，可以輸出七大類情緒
（happy, sad, angry, fear, disgust, surprise, neutral）。
程式會取最高分情緒，並與人名一起顯示：

[王小明 | Happy]

GUI 打卡機介面（PySimpleGUI）

layout = [
    [sg.Text("人臉辨識打卡機")],
    [sg.Image(key="-IMAGE-")],
    [sg.Button("開啟攝影機"), sg.Button("停止"), sg.Button("開始打卡")],
    [sg.Text("狀態：等待操作", key="-STATUS-")]
]

📌 GUI 介面提供直覺操作：
開啟攝影機
新增人員 → 訓練模型
開始打卡 → 自動紀錄「姓名 + 表情 + 時間」到 CSV 檔案。

完整程式碼如下:

import os, cv2, json, time, csv
from datetime import datetime
import numpy as np
import PySimpleGUI as sg

# ---------------- 基本設定 ----------------
DATA_DIR = "faces"                       # 每個人的臉部樣本資料夾
MODEL_PATH = "face_model.xml"            # LBPH 模型
LABELS_PATH = "labels.json"              # id <-> name 對應
LOG_CSV = "attendance_log.csv"           # 打卡紀錄
SAMPLES_PER_PERSON = 20                  # 新人建檔要收集的張數
RECOG_THRESHOLD = 65                     # LBPH 信心分數（越小越好）；< 65 視為辨識成功
COOLDOWN_SECONDS = 60                    # 同一人多久內不重複記錄（秒）

os.makedirs(DATA_DIR, exist_ok=True)

# ---- 人臉偵測（Haar）----
FACE_CASCADE = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
)

# ---- 可選：表情辨識（FER）延遲載入，避免環境不符時整個程式掛掉 ----
emo_detector = None
def ensure_fer():
    """嘗試載入 FER（需要 moviepy==1.0.3）"""
    global emo_detector
    if emo_detector is not None:
        return True
    try:
        from fer import FER  # 延遲 import，避免 moviepy 版本不符時一開始就爆
        emo_detector = FER(mtcnn=True)
        return True
    except Exception as e:
        # 這裡不要 raise，讓主程式繼續跑；只是在 UI 告知未啟用即可
        print("[FER] 未啟用：", e)
        emo_detector = None
        return False

def get_top_emotion(frame):
    """回傳最高分情緒字串；FER 未啟用時回傳 'Neutral'"""
    if emo_detector is None:
        return "Neutral"
    try:
        res = emo_detector.detect_emotions(frame)
        if res:
            emos = res[0]["emotions"]
            return max(emos, key=emos.get)
    except Exception:
        pass
    return "Neutral"

# ---- LBPH 模型（需要 opencv-contrib-python）----
def create_recognizer():
    try:
        recognizer = cv2.face.LBPHFaceRecognizer_create()
    except Exception as e:
        raise RuntimeError("需要安裝 opencv-contrib-python 才能使用 LBPH：pip install opencv-contrib-python") from e
    return recognizer

recognizer = None

# --------------- labels 讀寫 ---------------
def load_labels():
    if os.path.exists(LABELS_PATH):
        with open(LABELS_PATH, "r", encoding="utf-8") as f:
            return json.load(f)
    return {"name2id": {}, "id2name": {}}

def save_labels(lbl):
    with open(LABELS_PATH, "w", encoding="utf-8") as f:
        json.dump(lbl, f, ensure_ascii=False, indent=2)

def ensure_person_folder(name):
    path = os.path.join(DATA_DIR, name)
    os.makedirs(path, exist_ok=True)
    return path

# --------------- 蒐集新人樣本 ---------------
def collect_samples(frame, faces, target_name, collected):
    """把每張臉裁成灰階 200x200 存檔，回傳已蒐集數量"""
    if not faces:
        return collected
    folder = ensure_person_folder(target_name)
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    for (x, y, w, h) in faces:
        face = gray[y:y+h, x:x+w]
        face = cv2.resize(face, (200, 200))
        filename = os.path.join(folder, f"{int(time.time()*1000)}.png")
        cv2.imwrite(filename, face)
        collected += 1
        # 只存一張就 break，避免一次抓到多臉影響樣本品質
        break
    return collected

# --------------- 訓練 / 載入模型 ---------------
def train_model():
    lbl = load_labels()
    name2id = lbl["name2id"]
    id2name = {int(k): v for k, v in lbl["id2name"].items()} if lbl["id2name"] else {}

    X, y = [], []
    next_id = max(id2name.keys(), default=-1) + 1

    # 掃描 faces 資料夾，為每個人分配 id
    for person in sorted(os.listdir(DATA_DIR)):
        person_dir = os.path.join(DATA_DIR, person)
        if not os.path.isdir(person_dir):
            continue
        if person not in name2id:
            name2id[person] = next_id
            id2name[next_id] = person
            next_id += 1
        pid = name2id[person]
        for img_name in os.listdir(person_dir):
            if not img_name.lower().endswith((".png", ".jpg", ".jpeg")):
                continue
            img_path = os.path.join(person_dir, img_name)
            img = cv2.imread(img_path, cv2.IMREAD_GRAYSCALE)
            if img is None:
                continue
            X.append(img)
            y.append(pid)

    if len(X) < 2:
        raise RuntimeError("樣本不足，至少需 2 張臉圖才能訓練。請先用『新增人員樣本』收集影像。")

    global recognizer
    recognizer = create_recognizer()
    recognizer.train(X, np.array(y))
    recognizer.save(MODEL_PATH)

    # 存 labels
    lbl = {"name2id": name2id, "id2name": {str(k): v for k, v in id2name.items()}}
    save_labels(lbl)
    return len(set(y)), len(X)  # 人數, 影像數

def load_model():
    if not os.path.exists(MODEL_PATH) or not os.path.exists(LABELS_PATH):
        raise RuntimeError("找不到模型或標籤檔，請先『訓練模型』。")
    lbl = load_labels()
    global recognizer
    recognizer = create_recognizer()
    recognizer.read(MODEL_PATH)
    return lbl

# --------------- 打卡紀錄 ---------------
def append_log(name, emotion_top):
    new_file = not os.path.exists(LOG_CSV)
    with open(LOG_CSV, "a", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        if new_file:
            writer.writerow(["time", "name", "emotion"])
        writer.writerow([datetime.now().strftime("%Y-%m-%d %H:%M:%S"), name, emotion_top])

# --------------- GUI 佈局 ---------------
sg.theme("SystemDefault")
layout = [
    [sg.Text("人臉辨識打卡機（OpenCV + LBPH）＋（可選）表情辨識（FER）", font=("Microsoft JhengHei", 14), expand_x=True, justification="center")],
    [sg.Image(filename="", key="-IMAGE-", size=(800, 600))],
    [
        sg.Button("開啟攝影機", key="-START-", button_color=("white", "#1f8b4c")),
        sg.Button("停止", key="-STOP-"),
        sg.Text("來源："), sg.Input("0", key="-SRC-", size=(4,1)),
        sg.Button("切換來源", key="-SWITCH-"),
        sg.HorizontalSeparator()
    ],
    [
        sg.Text("新增人員："), sg.Input(key="-NEWNAME-", size=(15,1)),
        sg.Button("開始蒐集樣本", key="-ENROLL-"),
        sg.Text(f"（每人自動收集 {SAMPLES_PER_PERSON} 張臉）"),
    ],
    [
        sg.Button("訓練模型", key="-TRAIN-"),
        sg.Button("載入模型", key="-LOAD-"),
        sg.Button("開始打卡", key="-ATTEND-"),
        sg.Button("開啟紀錄檔", key="-OPENLOG-"),
    ],
    [
        sg.Radio("顯示表情", "EMO", key="-SHOW_EMO-", default=True),
        sg.Checkbox("鏡像", key="-FLIP-", default=True),
        sg.Slider(range=(30,100), default_value=70, resolution=1, orientation="h", key="-SCALE-", size=(25,15)),
        sg.Text("影像縮放%")
    ],
    [sg.Text("狀態：等待操作", key="-STATUS-")]
]

window = sg.Window("Attendance + Emotion Demo", layout, resizable=False)

# --------------- 程式主迴圈 ---------------
cap = None
running = False
attend_mode = False
enroll_mode = False
enroll_target = ""
enroll_count = 0
labels = load_labels()
id2name = {int(k): v for k, v in labels.get("id2name", {}).items()}
last_seen = {}  # name -> timestamp 最近一次打卡時間

def open_capture(index):
    c = cv2.VideoCapture(index, cv2.CAP_DSHOW)  # Windows 建議 CAP_DSHOW
    if not c.isOpened():
        return None
    c.set(cv2.CAP_PROP_FRAME_WIDTH,  800)
    c.set(cv2.CAP_PROP_FRAME_HEIGHT, 600)
    return c

while True:
    event, values = window.read(timeout=10)
    if event in (sg.WINDOW_CLOSED, "Exit"):
        break

    if event == "-START-":
        try:
            idx = int(values["-SRC-"])
        except:
            idx = 0
        cap = open_capture(idx)
        if cap is None:
            window["-STATUS-"].update("狀態：來源開啟失敗，請確認攝影機索引")
            running = False
        else:
            running = True
            window["-STATUS-"].update("狀態：攝影機已開啟")

    elif event == "-STOP-":
        running = False
        attend_mode = False
        enroll_mode = False
        if cap:
            cap.release()
            cap = None
        window["-STATUS-"].update("狀態：已停止")

    elif event == "-SWITCH-":
        if cap:
            cap.release()
            cap = None
        try:
            idx = int(values["-SRC-"])
        except:
            idx = 0
        cap = open_capture(idx)
        if cap is None:
            window["-STATUS-"].update("狀態：來源切換失敗")
            running = False
        else:
            running = True
            window["-STATUS-"].update(f"狀態：來源 {idx} 執行中")

    elif event == "-ENROLL-":
        name = values["-NEWNAME-"].strip()
        if not running:
            window["-STATUS-"].update("狀態：請先開啟攝影機再蒐集樣本")
        elif not name:
            window["-STATUS-"].update("狀態：請輸入『新增人員』名字")
        else:
            enroll_mode = True
            attend_mode = False
            enroll_target = name
            enroll_count = 0
            ensure_person_folder(name)
            window["-STATUS-"].update(f"狀態：開始收集 {name} 樣本（目標 {SAMPLES_PER_PERSON} 張）")

    elif event == "-TRAIN-":
        try:
            people, imgs = train_model()
            labels = load_labels()
            id2name = {int(k): v for k, v in labels.get("id2name", {}).items()}
            window["-STATUS-"].update(f"狀態：訓練完成（人員 {people}、影像 {imgs}）")
        except Exception as e:
            window["-STATUS-"].update(f"狀態：訓練失敗：{e}")

    elif event == "-LOAD-":
        try:
            labels = load_labels()
            id2name = {int(k): v for k, v in labels.get("id2name", {}).items()}
            load_model()
            window["-STATUS-"].update("狀態：模型已載入")
        except Exception as e:
            window["-STATUS-"].update(f"狀態：載入失敗：{e}")

    elif event == "-ATTEND-":
        if not running:
            window["-STATUS-"].update("狀態：請先開啟攝影機")
        elif recognizer is None:
            window["-STATUS-"].update("狀態：尚未載入/訓練模型")
        else:
            attend_mode = True
            enroll_mode = False
            window["-STATUS-"].update("狀態：打卡模式（辨識到名字會記錄）")

    elif event == "-OPENLOG-":
        if os.path.exists(LOG_CSV):
            os.startfile(LOG_CSV)
        else:
            window["-STATUS-"].update("狀態：尚無打卡紀錄")

    # ----------------- 影像循環 -----------------
    if running and cap:
        ok, frame = cap.read()
        if not ok:
            window["-STATUS-"].update("狀態：讀取影像失敗")
            continue

        # 鏡像 & 縮放
        if values["-FLIP-"]:
            frame = cv2.flip(frame, 1)
        scale = max(30, int(values["-SCALE-"])) / 100.0
        if scale != 1.0:
            frame = cv2.resize(frame, None, fx=scale, fy=scale, interpolation=cv2.INTER_AREA)

        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        faces = FACE_CASCADE.detectMultiScale(gray, scaleFactor=1.2, minNeighbors=5, minSize=(80, 80))

        # 新人蒐集樣本
        if enroll_mode and enroll_target:
            if enroll_count < SAMPLES_PER_PERSON:
                enroll_count = collect_samples(frame, faces, enroll_target, enroll_count)
                cv2.putText(frame, f"Collecting {enroll_target}: {enroll_count}/{SAMPLES_PER_PERSON}",
                            (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 180, 255), 2)
            else:
                enroll_mode = False
                window["-STATUS-"].update(f"狀態：{enroll_target} 樣本收集完成（{SAMPLES_PER_PERSON} 張），請按『訓練模型』")
        
        # 打卡模式：辨識 + 記錄 + 表情（若可用）
        if attend_mode and recognizer is not None:
            emo_top = "Neutral"
            if values["-SHOW_EMO-"]:
                if emo_detector is None and not ensure_fer():
                    window["-STATUS-"].update("狀態：表情偵測未啟用（缺模組 moviepy==1.0.3 / fer）")
                else:
                    emo_top = get_top_emotion(frame)

            for (x, y, w, h) in faces:
                face_roi = gray[y:y+h, x:x+w]
                face_roi = cv2.resize(face_roi, (200, 200))
                try:
                    pid, conf = recognizer.predict(face_roi)
                except Exception:
                    pid, conf = -1, 999

                if conf < RECOG_THRESHOLD and pid in id2name:
                    name = id2name[pid]
                    color = (0, 200, 0)
                    now = time.time()
                    if name not in last_seen or (now - last_seen[name]) > COOLDOWN_SECONDS:
                        append_log(name, emo_top)
                        last_seen[name] = now
                        window["-STATUS-"].update(f"狀態：已為 {name} 記錄打卡（情緒：{emo_top}）")
                else:
                    name = "Unknown"
                    color = (0, 0, 255)

                cv2.rectangle(frame, (x, y), (x+w, y+h), color, 2)
                label = f"{name}" + (f" | {emo_top}" if values["-SHOW_EMO-"] else "")
                cv2.putText(frame, label, (x, y-8), cv2.FONT_HERSHEY_SIMPLEX, 0.7, color, 2, cv2.LINE_AA)

        # 非打卡模式：只畫臉框，表情（若可用）
        if not attend_mode:
            emo_top = ""
            if values["-SHOW_EMO-"]:
                if emo_detector is None and not ensure_fer():
                    window["-STATUS-"].update("狀態：表情偵測未啟用（缺模組 moviepy==1.0.3 / fer）")
                else:
                    emo_top = get_top_emotion(frame)

            for (x, y, w, h) in faces:
                cv2.rectangle(frame, (x, y), (x+w, y+h), (200, 200, 0), 2)
                if emo_top:
                    cv2.putText(frame, emo_top, (x, y-8), cv2.FONT_HERSHEY_SIMPLEX, 0.7, (200, 200, 0), 2, cv2.LINE_AA)

        # 顯示到 GUI
        frame_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        imgbytes = cv2.imencode(".png", frame_rgb)[1].tobytes()
        window["-IMAGE-"].update(data=imgbytes)

# 收尾
window.close()
if cap:
    cap.release()
cv2.destroyAllWindows()

環境與套件安裝說明

Python 版本
建議使用 Python 3.8 ~ 3.11，版本太新 (例如 3.12) 可能會遇到某些套件相容性問題。
必要套件
請在專案資料夾下執行以下指令安裝：

pip install opencv-contrib-python
pip install PySimpleGUI
pip install numpy
pip install fer==22.5.0
pip install moviepy==1.0.3

📌 備註：
opencv-contrib-python：一定要裝這個（不是只有 opencv-python），因為 LBPH 人臉辨識器在 contrib 模組裡。
PySimpleGUI：用來做 GUI 介面。
numpy：矩陣運算必備。
fer：表情辨識套件（內部會用到 CNN）。
moviepy==1.0.3：fer 需要的依賴，記得鎖定版本，避免新版缺少 editor.py 出錯

額外需求
OpenCV 的 Haar 模型（隨 opencv 內建）：haarcascade_frontalface_default.xml。
作業系統：Windows / Linux / macOS 皆可。
Windows 建議用 cv2.VideoCapture(index, cv2.CAP_DSHOW) 開相機，避免延遲。
攝影機：電腦內建或 USB 鏡頭。
專案結構
專案運行後，會自動建立以下資料夾與檔案：

my-project/
│ faceidentity.py         # 主程式
│ face_model.xml          # 訓練後的人臉模型
│ labels.json             # id ↔ name 對應表
│ attendance_log.csv      # 打卡紀錄
│
└── faces/                # 存放人臉樣本
    ├── Alice/
    │   ├── 16632488.png
    │   └── ...
    └── Bob/
        ├── 16632501.png
        └── ...