並行程式的潛在問題 (三)

2021 iThome 鐵人賽

DAY 30

Software Development

微自幹的作業系統輕旅行系列第 37 篇

13th鐵人賽 semaphore synchronize concurrency

團隊飛天小女警

2021-10-15 11:20:00

10498 瀏覽

分享至

在介紹 Mutex lock 與 Spinlock 後，本篇文章同樣針對並行程式的 Synchronization 作探討，以保證並行程式的執行順序。

Semaphore

Semaphore 是一個同步物件，用於保持在 0 至指定最大值之間的一個計數值。比起之前介紹的 spinlock 和 mutex lock，semaphore 更像是一個指示號誌。
當執行緒完成一次對該 semaphore 物件的等待 (wait) 時，該計數值減一；當執行緒完成一次對 semaphore 物件的釋放 (release) 時，計數值加一。
semaphore 物件的計數值大於 0，為 signaled 狀態；計數值等於 0，為 nonsignaled 狀態.
也就是說，當計數值為 0 時，處於 nonsignald 狀態的 semaphore 是無法供執行緒使用的，除非該 semaphore 再次轉為 signaled 狀態。

Semaphore APIs

使用 Semaphore 前須詳閱公開說明書:

sem_t semaphore;
int sem_init(sem_t *sem, int pshared, unsigned int value);
int sem_wait(sem_t *sem);
int sem_post(sem_t *sem);

在使用 semaphore 前，我們需要使用 sem_init() 將其初始化:

static sem_t semaphore;
sem_init(&semaphore, 0, 1);

透過 Linux manual page 查看 sem_init() 的定義，可以知道它的三個參數從左至右分別為:

sem_t 型別變數的位址。
決定 semaphore 共享於同一個 Process 的 Thread，或是多個 Process。
semaphore 的初始值。

初始化後，通常我們會在 Critical section 間放置 sem_wait() 與 sem_post() 去避免 Race condition。

sem_wait()

sem_wait() 會檢查傳入的 semaphore 變數值，如果大於零，則會將它減一。相反的，如果變數值為零，該執行緒就會被 Block。

sem_post()

當執行緒處理完 Critical section 便可以使用 sem_post() 釋放 Semaphore 物件。這時，傳入的 Semaphore 物件值會被加一。

透過範例程式學習 Semaphore!

比起 Mutex，雖然 Semaphore 同樣可以用來保護 Critical section，不過它更常被用來確保並行程式的執行順序。不像是 Mutex lock 解鈴還需繫鈴人的特性，在 Semaphore 中，每個任務都會消耗與製造不同的號誌，考慮以下程式碼:

static sem_t A_B;
static sem_t B_C;
static sem_t C_A;

void *printA(void *arg)
{
	int i = 0;
	for(i = 1;i<11;i++){
		sem_wait(&C_A);
		printf("第%02d次:A",i);
		sem_post(&A_B);
	}
	return NULL;
}
void *printB(void *arg)
{
	int i = 0;
	for(i = 1;i<11;i++){
		sem_wait(&A_B);
		printf("B");
		sem_post(&B_C);
	}
	return NULL;
}
void *printC(void *arg)
{
	int i = 0;
	for(i = 1;i<11;i++){
		sem_wait(&B_C);
		printf("C");
		sem_post(&C_A);
	}
	return NULL;
}

在 main() 賦予初始值:

pthread_t thread_A;
pthread_t thread_B;
pthread_t thread_C;
sem_init(&A_B,0,0);
sem_init(&B_C,0,0);
sem_init(&C_A,0,1);
pthread_create(&thread_A,NULL,printA,NULL);
pthread_create(&thread_B,NULL,printB,NULL);
pthread_create(&thread_C,NULL,printC,NULL);
//...

透過 Semaphore，我們不止可以避免 Race condition，更可以保證三個 Task 的執行順序。

原始碼連結: Github。

與 Mutex lock 的異曲同工之妙

文章前面有提到，Semaphore 用於保持在 0 至指定最大值之間的一個計數值。
如果開發者將 Semaphore 設計成只會出現 0 和 1 兩種狀態，便稱為 Binary semaphore。
仔細想想，Binary semaphore 同樣確保一次只能有一個執行緒獲得共用資源的存取權，比較不同的是 Binary semaphore 不像 Mutex lock 有 Owner ship 的概念。因此在應用上，Binary semaphore 會更具彈性。

這裡的彈性是指，當一個 Thread 更改 Semaphore 的狀態，程式能夠允許另一個 Thread 對同一個 Semaphore 修改狀態。換句話說，Binary semaphore 可以讓我們造出一個能有它人解鎖的 Mutex lock。
關於這個問題，網路上也有人設計相關實驗作探討，有興趣的讀者請參考該連結。

Deadlock 與 Livelock

使用鎖或是 Semaphore 確實可以避免 Race condition 的發生，不過，當開發者在設計機制時沒有考慮周全，就有可能讓程式邏輯打結。

Deadlock

簡單來說，當有兩個 Process 或是 Thread 需要彼此目前佔有的資源，而兩者卻又不願意釋放資源時，便會讓彼此的進度卡住，形成死結。又好比兩個人做交易，買家堅持賣家先出貨，賣家則堅持買家先付款，導致交易僵持，如果在電腦中出現類似的情況，就稱為 Deadlock。

Deadlock 範例

考慮以下程式碼:

// Source: https://stackoverflow.com/questions/27480125/simple-deadlock-example-using-pthread/50111909
pthread_mutex_t lock1, lock2;

void *resource1(){

    pthread_mutex_lock(&lock1);

    printf("Job started in resource1..\n");
    sleep(2);

    printf("Trying to get resourc2\n");
    pthread_mutex_lock(&lock2); 
    printf("Aquired resourc2\n");
    pthread_mutex_unlock(&lock2);

    printf("Job finished in resource1..\n");

    pthread_mutex_unlock(&lock1);

    pthread_exit(NULL);

}

void *resource2(){

    pthread_mutex_lock(&lock2);

    printf("Job started in resource2..\n");
    sleep(2);

    printf("Trying to get resourc1\n");
    pthread_mutex_lock(&lock1); 
    printf("Aquired resourc1\n");
    pthread_mutex_unlock(&lock1);

    printf("Job finished in resource2..\n");

    pthread_mutex_unlock(&lock2);

    pthread_exit(NULL);

}



int main() {

    pthread_mutex_init(&lock1,NULL);
    pthread_mutex_init(&lock2,NULL);

    pthread_t t1,t2;

    pthread_create(&t1,NULL,resource1,NULL);
    pthread_create(&t2,NULL,resource2,NULL);

    pthread_join(t1,NULL);
    pthread_join(t2,NULL);

    return 0;

}

在上面的範例中，兩個 Task 都會嘗試 Lock 兩個互斥鎖，一個 Task 先上鎖 lock1 再上鎖 lock2，另一個則相反。當這兩個 Task 嘗試上獲得第二道鎖時，就會因為對方遲遲不交出資源而陷入 Deadlock 的狀況。

Deadlock 觸發條件

要讓程式進入死結，必須天時地利人和:

No preemption：占用系統資源的一個任務無法被強制停止。
Hold and wait：一個行程可以在等待時持有系統資源。
Mutual exclusion：系統資源只能同時分配給一個行程，無法在多個行程共享。
Circular waiting：多個行程互相持有其他行程所需要的資源。

要達成以上條件，程式才有可能進入 Deadlock，作業系統在做排程設計時也會將其考慮進去，做出應對機制，像是: 讓任務是可以被作業系統強制結束的。

Starvation

當一個 Process 遲遲無法獲得所需資源，進入長期等待，這種情況就可以稱為 Starvation (飢餓)。作業系統為了避免飢餓/死結發生，除了上面提到的可搶斷式排程，也添加了老化的機制，也就是: 當一個 Process 久久沒有執行完成，作業系統便會逐步調高它的優先權，增加該 Process 完成的速度。