Mutex

第 11 屆 iThome 鐵人賽

DAY 8

自我挑戰組

Linux Inside - Synchronization & Interrupt系列第 8 篇

11th鐵人賽

ian910297

2019-09-09 12:21:47

6282 瀏覽

分享至

參考 Synchronization primitives in the Linux kernel. Part 4.

Mutex(MUTual EXclusion) 資料結構

顧名思義就是一次只能有一個程式進入 CS(Critical Section)，類似前一天所提及的 Binary Semaphore，但兩者又不盡相同，因為還沒介紹完，也不好作整理比較，所以就直接從資料結構開始看吧

struct semaphore {
    raw_spinlock_t        lock;
    unsigned int        count;
    struct list_head    wait_list;
};

struct mutex {
        atomic_t                count;
        spinlock_t              wait_lock;
        struct list_head        wait_list;
#if defined(CONFIG_DEBUG_MUTEXES) || defined(CONFIG_MUTEX_SPIN_ON_OWNER)
        struct task_struct      *owner;
#endif
#ifdef CONFIG_MUTEX_SPIN_ON_OWNER
        struct optimistic_spin_queue osq;
#endif
#ifdef CONFIG_DEBUG_MUTEXES
        void                    *magic;
#endif
#ifdef CONFIG_DEBUG_LOCK_ALLOC
        struct lockdep_map      dep_map;
#endif
};

count 是用來表示現在 Lock 的狀態，1 代表 unlocked，0 代表 locked
count 有可能是負數，代表現在是 locked 且有程式正在排隊等待
wait_lock 用來保護 wait_list 的 Lock
wait_list 等待的序列

只看上面三個元素的話，會發現與 Semaphore 的結構非常相似

BTW 昨天沒有提到 Semaphore 結構中的 raw_spinlock_t 是用來保護 Semaphore 中的資料，不過從程式碼也可以發現啦，因為他最終是圍繞 waiter.up 去作 Spin，又或者是 schedule_timeout 去作睡眠

那再來把目光轉移到 Mutex 獨有的資料結構

owner 用來辨認是哪一個程式獲得資料，struct task_struct 是 Linux 中非常常見的資料結構，但他參數有點複雜，有多餘的天數再獨立一天來研究
osq 是指 optimistic spinning，指說我覺得很樂觀，很快就可以拿到 Lock ，所以我們 Spin 一下也不會怎樣，不需要作 Context Switch
magic dep_map Debug 使用

struct mutex_waiter {
        struct list_head        list;
        struct task_struct      *task;
#ifdef CONFIG_DEBUG_MUTEXES
        void                    *magic;
#endif
};

這是 slowpath 專用的資料結構，與 semaphore_waiter 類似，都是用作排隊

那麼 slowpath 是什麼呢？？Mutex 在 Linux 中的實作可以再細分為三種

fastpath
假設目前沒人獲得 Lock，count 的數值可以直接作修改，但要保證他是原子操作
midpath
假設目前有人獲得 Lock，midpath 與 optimistic spinning 都是嘗試要使用 MCS Lock 的方式作 Spin，這是為了避免無謂的 Context Switch
slowpath
假設已經有人獲得 Lock，且已經有一個人在進行 optimistic spinning，那麼就會進入 slowpath 進行 Context Switch，好好去排隊

程式碼閱讀

Initial

昨天的 Semaphore 沒有閱讀 Initial 的部份，因為我覺得那內容實在太水了，但 Mutex 的 Initial 這次有使用到一個 Macro Debug 的技巧「Parsing Token」，並不包含在我們提到的 Linux Kernel Coding 的範疇，所以就稍微看一下吧

include/linux/mutex.h

#define __MUTEX_INITIALIZER(lockname) \
		{ .owner = ATOMIC_LONG_INIT(0) \
		, .wait_lock = __SPIN_LOCK_UNLOCKED(lockname.wait_lock) \
		, .wait_list = LIST_HEAD_INIT(lockname.wait_list) \
		__DEBUG_MUTEX_INITIALIZER(lockname) \
		__DEP_MAP_MUTEX_INITIALIZER(lockname) }

注意 owner 被初始化為 0

include/linux/mutex.h

#define mutex_init(mutex)						\
do {									\
	static struct lock_class_key __key;				\
									\
	__mutex_init((mutex), #mutex, &__key);				\
} while (0)

這是這次我想重點介紹的項目，如果你在 Macro 中在一個變數前面加上 # 號，那麼就會印出那個變數的名字，寫 C 語言時，相信很多人都使用過 printf 來 Debug，其實可以把 printf 再次打包，使用 Macro 來 Debug 還可以有觀看變數名稱的功能，這樣更容易抓到在哪邊出錯

參考用macro的技巧

#define print_var(var)  printf("%s: %s\n", #var, var)
#define PRINT_TOKEN(token) printf(#token " is %d\n", token)
#define print_three_var(var) do { \
         print_var(var); \
         print_var(var##2); \
         print_var(var##3); \
     } while (0)

int main() {
    char s[] = "aaa";
    char s1[] = "aaa1";
    char s2[] = "aaa2";
    //s: aaa
    print_var(s);
    int a = 5;
    //a is 5
    PRINT_TOKEN(a);

    return 0;
}

BTW 如果要 Debug 我都是使用 GDB ，GDB 更好用@@

kernel/locking/mutex.c

void
__mutex_init(struct mutex *lock, const char *name, struct lock_class_key *key)
{
	atomic_long_set(&lock->owner, 0);
	spin_lock_init(&lock->wait_lock);
	INIT_LIST_HEAD(&lock->wait_list);
#ifdef CONFIG_MUTEX_SPIN_ON_OWNER
	osq_lock_init(&lock->osq);
#endif

	debug_mutex_init(lock, name, key);
}

只是各種初始化 ...

Lock

接下來看核心邏輯是什麼吧，重點在於了解 fastpath midpath slowpath 的實作

void __sched mutex_lock(struct mutex *lock)
{
	might_sleep();

	if (!__mutex_trylock_fast(lock))
		__mutex_lock_slowpath(lock);
}

void down(struct semaphore *sem)
{
	unsigned long flags;

	raw_spin_lock_irqsave(&sem->lock, flags);
	if (likely(sem->count > 0))
		sem->count--;
	else
		__down(sem);
	raw_spin_unlock_irqrestore(&sem->lock, flags);
}

might_sleep 是 Debug 用，無關程式邏輯，可以回顧一下 Semaphore 的 down() ，就會發現他們邏輯架構上很類似，但因為 Mutex 還需要設定不少東西，所以多包了幾層涵式

把目光轉移到 fastpath 上

static __always_inline bool __mutex_trylock_fast(struct mutex *lock)
{
	unsigned long curr = (unsigned long)current;

	if (!atomic_long_cmpxchg_acquire(&lock->owner, 0UL, curr))
		return true;

	return false;
}

cmpxchg(&lock->owner, 0UL, curr) => cmpxchg(ptr, old, new) ，反正就是檢查 ptr 跟 old 的值是否相同，相同的話就可以塞 new 進去，這邊就是在檢查 Owner 是誰

static noinline void __sched
__mutex_lock_slowpath(struct mutex *lock)
{
	__mutex_lock(lock, TASK_UNINTERRUPTIBLE, 0, NULL, _RET_IP_);
}

static int __sched
__mutex_lock(struct mutex *lock, long state, unsigned int subclass,
	     struct lockdep_map *nest_lock, unsigned long ip)
{
	return __mutex_lock_common(lock, state, subclass, nest_lock, ip, NULL, false);
}

可以發現 Mutex 跟 Semaphore 真的是該死的類似，Semaphore 在實作的時候一直呼叫 down down_common ＝＝，而 Mutex 則是 __mutex_lock __mutex_lock_common，那麼重點一樣是 xxxx_common 這個涵式，但由於 __mutex_lock_common 過於冗長，所以我們就用文字敘述一下他在幹麻就好了

大量使用到一個涵式 container_of，這是用來將一個結構中的元素，強制轉型為該結構

#ifndef container_of
/**
 * container_of - cast a member of a structure out to the containing structure
 * @ptr:	the pointer to the member.
 * @type:	the type of the container struct this is embedded in.
 * @member:	the name of the member within the struct.
 *
 */
#define container_of(ptr, type, member) ({			\
    const typeof(((type *)0)->member) * __mptr = (ptr);	\
    (type *)((char *)__mptr - offsetof(type, member)); })
#endif

EX: struct ww_mutex *ww = container_of(lock, struct ww_mutex, base);，這表示要把 lock (struct mutex) 強轉為 struct ww_mutex 的型態

midpath 部份的實作是指讓 optmistic_spin 去作旋轉，詳細實作可以參閱 mutex_optimistic_spin
slowpath
這我真的是不知道有什麼重點，就是好好排隊
ww_mutex
這指的是 Would/Wait Mutex ，是一個為了避免 Deadlock 來影響 Mutex 設計的一系列方法，但我不是很想了解他實作的方法，我們只看他設計的初衷是什麼

Would/Wait Mutex

參考

他假設的場景是 Livelock 的案例，有兩個 Process 要使用兩個資源，我們稱資源為 Buffer

但是很這兩個 Process 要求資源的順序剛好相反，所以就會產生 Deadlock，而 Would/Wait Mutex 就是為了處理這種問題而設計的

從上圖我們可以看到有 T1, T2，各自持有一個 Buffer，那麼想要去持有對方所持有的 Buffer 時，就會發生 Deadlock，處理這類的 Deadlock 有以下兩種演算法(這類 Deadlock Handling 演算法跟 RDBMS 有關)

wait-die: 允許舊的 Process 等待，殺死年輕的 Process
T2 執行時間較早(運行較久，較老)，T1 把 Lock 給 T2，T2 直接死亡
T2 執行時間較晚(運行短，較年輕)，T1 把 Lock 給 T2，T1 進行等待
would-wait:
T2 執行時間較早(運行較久，較老)，駁回 T1，T2 取得 Lock
T2 執行時間較晚(運行短，較年輕)，T2 進行等待

反正就是要有一個規則，根據程式運行時間來決定讓誰去死阿，誰去等待阿，重點是可以避免 Deadlock 產生，這部份就不探討他怎麼實作