pthread

astrotycoon 2015-01-28

http:///2012/07/27/pthread_cond_wait_mutex_while/

With POSIX condition variables, pthread_cond_wait() is typically used like this:

// In thread 1
pthread_mutex_lock(&mutex);
while (SOME_CONDITION_1)
    pthread_cond_wait(&condition_var, &mutex);
do_work();
pthread_mutex_unlock(&mutex);

// In thread 2
pthread_mutex_lock(&mutex);
if (SOME_CONDITION_2)
    pthread_cond_signal(&condition_var);
do_some_work();
pthread_mutex_unlock(&mutex);
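
The pattern above can be turned into a small compilable sketch. The names data_ready, shared_value, waiter, signaler and run_handoff are hypothetical and stand in for SOME_CONDITION_1, do_work() and so on; this is only one way to fill in the placeholders, not the original author's code.

```c
#include <pthread.h>

/* Hypothetical shared state for this sketch (not from the original post). */
static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t condition_var = PTHREAD_COND_INITIALIZER;
static int data_ready = 0;   /* the predicate, protected by mutex */
static int shared_value = 0; /* the data handed from thread 2 to thread 1 */

/* Thread 1: wait until data_ready is set, then read the value. */
static void *waiter(void *arg)
{
    int *out = arg;
    pthread_mutex_lock(&mutex);
    while (!data_ready)                            /* SOME_CONDITION_1 */
        pthread_cond_wait(&condition_var, &mutex);
    *out = shared_value;                           /* do_work() */
    pthread_mutex_unlock(&mutex);
    return NULL;
}

/* Thread 2: publish the value, set the predicate, wake the waiter. */
static void *signaler(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&mutex);
    shared_value = 42;
    data_ready = 1;
    pthread_cond_signal(&condition_var);
    pthread_mutex_unlock(&mutex);
    return NULL;
}

/* Runs both threads and returns the value the waiter observed. */
int run_handoff(void)
{
    pthread_t t1, t2;
    int seen = -1;

    data_ready = 0;      /* no other thread exists yet, so no lock needed */
    shared_value = 0;
    pthread_create(&t1, NULL, waiter, &seen);
    pthread_create(&t2, NULL, signaler, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return seen;
}
```

Calling run_handoff() from a main() function returns 42 whichever thread happens to be scheduled first: if the signaler runs first, the waiter finds data_ready already set and never calls pthread_cond_wait() at all.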

Two questions need explaining here:

Why is a mutex needed here?
Why does thread 1 need a while loop?

Why a mutex is needed here

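In short, the mutex protects the critical section around the condition variable (condition_var). Without it, pthread_cond_wait() could miss a pthread_cond_signal(). The detailed reason takes a little more explanation. First, we need to look at how pthread_cond_wait() works internally. It can be expressed in pseudocode as shown below. Note that releasing the mutex and starting to wait happen atomically: no other thread can acquire the mutex and signal the condition variable in between. How this is achieved depends on the implementation; in RTEMS's POSIX implementation, for example, the function disables the scheduler while it runs.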

mutex_unlock(&mutex);                     // release the mutex passed to pthread_cond_wait()
wait_condition_variable(&condition_var);  // then wait for another thread to signal the condition variable
mutex_lock(&mutex);                       // reacquire the mutex passed to pthread_cond_wait()
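
One way to observe the unlock step from the outside is the sketch below. A second thread can lock the same mutex while the waiter is blocked, which would deadlock if pthread_cond_wait() kept the mutex held. The names waiter_parked, go and mutex_is_free_while_waiting are hypothetical and only serve this illustration.

```c
#include <pthread.h>
#include <sched.h>

static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t condition_var = PTHREAD_COND_INITIALIZER;
static int waiter_parked = 0; /* set by the waiter just before it waits */
static int go = 0;            /* predicate the waiter sleeps on */

static void *waiter(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&mutex);
    waiter_parked = 1;
    while (!go)
        pthread_cond_wait(&condition_var, &mutex); /* unlock, sleep, relock */
    pthread_mutex_unlock(&mutex);
    return NULL;
}

/* Returns 1 once the mutex has been locked while the waiter was inside
   pthread_cond_wait(). If the wait did not release the mutex, the
   pthread_mutex_lock() call below would block forever instead. */
int mutex_is_free_while_waiting(void)
{
    pthread_t t;
    int locked_during_wait = 0;

    waiter_parked = 0;
    go = 0;
    pthread_create(&t, NULL, waiter, NULL);

    while (!locked_during_wait) {
        pthread_mutex_lock(&mutex);
        if (waiter_parked) {
            /* The waiter set waiter_parked while holding the mutex and
               cannot leave its loop because go is still 0. Since we hold
               the mutex now, the waiter must have released it inside
               pthread_cond_wait(). */
            locked_during_wait = 1;
            go = 1;
            pthread_cond_signal(&condition_var);
        }
        pthread_mutex_unlock(&mutex);
        if (!locked_during_wait)
            sched_yield(); /* the waiter has not started yet; try again */
    }
    pthread_join(t, NULL);
    return locked_during_wait;
}
```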

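Example 1 below shows why the mutex is needed. It is a little long, but not hard to follow if you read it carefully. Suppose the mutex were not needed here. Then it would be harmless to enable the two commented-out lines in watch_count(): the pthread_mutex_unlock() just before pthread_cond_wait() and the pthread_mutex_lock() just after it.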

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>   /* sleep() */

#define NUM_THREADS  3
#define TCOUNT 10
#define COUNT_LIMIT 12

int count = 0;
int thread_ids[3] = {0,1,2};
pthread_mutex_t count_mutex;
pthread_cond_t count_threshold_cv;

void *inc_count(void *t)
{
  int i;
  long my_id = (long)t;
  for (i=0; i<TCOUNT; i++) {
    pthread_mutex_lock(&count_mutex);
    count++;
    /*
    Check the value of count and signal waiting thread when condition is
    reached.  Note that this occurs while mutex is locked.
    */
    if (count == COUNT_LIMIT) {
      pthread_cond_signal(&count_threshold_cv);
      printf("inc_count(): thread %ld, count = %d  Threshold reached.\n", my_id, count);
    }
    printf("inc_count(): thread %ld, count = %d, unlocking mutex\n", my_id, count);
    pthread_mutex_unlock(&count_mutex);

    /* Do some "work" so threads can alternate on mutex lock */
    sleep(1);
  }
  pthread_exit(NULL);
}

void *watch_count(void *t)
{
  long my_id = (long)t;
  printf("Starting watch_count(): thread %ld\n", my_id);
  /*
  Lock mutex and wait for signal.  Note that the pthread_cond_wait
  routine will automatically and atomically unlock mutex while it waits.
  Also, note that if COUNT_LIMIT is reached before this routine is run by
  the waiting thread, the loop will be skipped to prevent pthread_cond_wait
  from never returning.
  */
  pthread_mutex_lock(&count_mutex);
  while (count<COUNT_LIMIT) {
    // If the mutex were not needed, the following line could be inserted here:
    //pthread_mutex_unlock(&count_mutex);
    pthread_cond_wait(&count_threshold_cv, &count_mutex);
    // If the mutex were not needed, the following line could be inserted here:
    //pthread_mutex_lock(&count_mutex);
    printf("watch_count(): thread %ld Condition signal received.\n", my_id);
    count += 125;
    printf("watch_count(): thread %ld count now = %d.\n", my_id, count);
  }
  pthread_mutex_unlock(&count_mutex);
  pthread_exit(NULL);
}

int main (int argc, char *argv[])
{
  int i;
  long t1=1, t2=2, t3=3;
  pthread_t threads[3];
  pthread_attr_t attr;
  /* Initialize mutex and condition variable objects */
  pthread_mutex_init(&count_mutex, NULL);
  pthread_cond_init (&count_threshold_cv, NULL);
  /* For portability, explicitly create threads in a joinable state */
  pthread_attr_init(&attr);
  pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE);
  pthread_create(&threads[0], &attr, watch_count, (void *)t1);
  pthread_create(&threads[1], &attr, inc_count, (void *)t2);
  pthread_create(&threads[2], &attr, inc_count, (void *)t3);

  /* Wait for all threads to complete */
  for (i=0; i<NUM_THREADS; i++) {
    pthread_join(threads[i], NULL);
  }
  printf ("Main(): Waited on %d  threads. Done.\n", NUM_THREADS);

  /* Clean up and exit */
  pthread_attr_destroy(&attr);
  pthread_mutex_destroy(&count_mutex);
  pthread_cond_destroy(&count_threshold_cv);
  pthread_exit(NULL);
}

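With those two lines enabled, count_mutex has already been released once the inserted pthread_mutex_unlock() has run, before pthread_cond_wait() is called. If an inc_count() thread is scheduled in that window and calls pthread_cond_signal(), the condition variable is signaled while our pthread_cond_wait() has not yet started, so the signal is missed. That is why the mutex is needed here.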
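
The reason a signal sent too early is simply gone is that a condition variable keeps no memory of past signals. The single-threaded sketch below illustrates this under that assumption: it signals first, then waits with a 100 ms timeout using pthread_cond_timedwait(). The function name wait_after_signal and the timeout value are choices made for this illustration only.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <time.h>

static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cv = PTHREAD_COND_INITIALIZER;

/* Signals a condition variable that nobody is waiting on, then waits on it
   with a 100 ms timeout. Returns the result of pthread_cond_timedwait(). */
int wait_after_signal(void)
{
    struct timespec deadline;
    int rc;

    pthread_cond_signal(&cv);   /* no waiter yet: the signal has no effect */

    /* pthread_cond_timedwait() takes an absolute CLOCK_REALTIME deadline. */
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_nsec += 100L * 1000L * 1000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec += 1;
        deadline.tv_nsec -= 1000000000L;
    }

    pthread_mutex_lock(&m);
    rc = pthread_cond_timedwait(&cv, &m, &deadline); /* earlier signal is gone */
    pthread_mutex_unlock(&m);
    return rc;
}
```

The call normally returns ETIMEDOUT because the earlier signal was not stored anywhere. A spurious wakeup could in principle make it return 0 instead, so treat the result as an illustration and not a guarantee.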

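Think about it a little more, though, and you will notice that even with the mutex, the threads running inc_count() could well finish before the watch_count thread gets to run. Wouldn't the signal sent by inc_count() be missed all the same? How can that be right? (Director, this is not how the script was supposed to go!?) Indeed, that is not how the script goes.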

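In this example, suppose thread 1 finishes inc_count() before watch_count runs (assume here that threads 1 and 2 execute inc_count() and thread 3 executes watch_count). Since TCOUNT is 10, count is then 10. Two things can happen next:

Thread 2 again runs before thread 3 and also runs to completion. Thread 3 then "misses" the condition signal, which is sent once, when count reaches COUNT_LIMIT. But that is simply the program's logic: count is now 20, count<COUNT_LIMIT is false, the while loop is never entered, and pthread_cond_wait() is never called. This is a different matter from pthread_cond_wait() missing a signal because there is no mutex.
Thread 3 runs before thread 2. Thread 3 then enters pthread_cond_wait() and waits for thread 2 to signal the condition variable and wake it up.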

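So in this example no condition signal is actually lost. The case mentioned earlier, where a signal is missed even with the mutex, only arises when the program's own logic is flawed. If you use the condition variable correctly, it does not happen.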
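
The first case can be forced deterministically by joining the incrementing threads before the watcher runs. The sketch below is a trimmed-down variant of the example (no printf or sleep). The watcher returns how many times it called pthread_cond_wait(), and the helper name late_watcher_waits is hypothetical.

```c
#include <pthread.h>

#define TCOUNT 10
#define COUNT_LIMIT 12

static pthread_mutex_t count_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t count_threshold_cv = PTHREAD_COND_INITIALIZER;
static int count = 0;

/* Incrementing thread: bumps count TCOUNT times, signalling at the threshold. */
static void *inc_count(void *arg)
{
    (void)arg;
    for (int i = 0; i < TCOUNT; i++) {
        pthread_mutex_lock(&count_mutex);
        count++;
        if (count == COUNT_LIMIT)
            pthread_cond_signal(&count_threshold_cv); /* nobody is waiting */
        pthread_mutex_unlock(&count_mutex);
    }
    return NULL;
}

/* Watcher: returns how many times it had to call pthread_cond_wait(). */
static int watch_count(void)
{
    int waits = 0;
    pthread_mutex_lock(&count_mutex);
    while (count < COUNT_LIMIT) {
        pthread_cond_wait(&count_threshold_cv, &count_mutex);
        waits++;
    }
    pthread_mutex_unlock(&count_mutex);
    return waits;
}

/* Forces the "incrementers finish first" ordering by joining both
   incrementing threads before the watcher runs. */
int late_watcher_waits(void)
{
    pthread_t a, b;

    count = 0;
    pthread_create(&a, NULL, inc_count, NULL);
    pthread_create(&b, NULL, inc_count, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return watch_count(); /* count is 20 here, so the loop body never runs */
}
```

late_watcher_waits() returns 0: the signal was sent while nobody was waiting, yet the watcher does not hang, because it checks the predicate under the mutex before deciding to wait.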

Why a while loop is needed

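Another common question about condition variables is why there is a while here: can I use if instead? The answer is no. A thread blocked in pthread_cond_wait() can be woken by something other than pthread_cond_signal(). A wakeup that was not caused by pthread_cond_signal() is called a spurious wakeup, and another blog post explains where spurious wakeups come from in more detail. Because spurious wakeups exist, the condition the thread was waiting for (in the example above, count<COUNT_LIMIT becoming false) may still not hold when pthread_cond_wait() returns. To guard against this, the condition has to be checked again after every return, which is why the while here cannot be replaced by if.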
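
Spurious wakeups are hard to trigger on demand, so the sketch below uses a related situation to show why the re-check matters. The setup is an assumption of this illustration and not part of the original example: two consumers are woken with pthread_cond_broadcast() for a single new item. Only one of them can take it, and the other must find the predicate false again and go back to waiting. All names are hypothetical.

```c
#include <pthread.h>

#define NUM_ITEMS 1000

static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t condition_var = PTHREAD_COND_INITIALIZER;
static int items = 0;      /* items currently available */
static int done = 0;       /* set once the producer has finished */
static int underflows = 0; /* times items went negative (should stay 0) */

static void *consumer(void *arg)
{
    long *taken = arg;
    pthread_mutex_lock(&mutex);
    for (;;) {
        /* Re-check after every wakeup: the other consumer may already have
           taken the item, or the wakeup may be spurious. */
        while (items == 0 && !done)
            pthread_cond_wait(&condition_var, &mutex);
        if (items == 0 && done)
            break;
        items--;
        if (items < 0)
            underflows++;
        (*taken)++;
    }
    pthread_mutex_unlock(&mutex);
    return NULL;
}

static void *producer(void *arg)
{
    (void)arg;
    for (int i = 0; i < NUM_ITEMS; i++) {
        pthread_mutex_lock(&mutex);
        items++;
        pthread_cond_broadcast(&condition_var); /* wakes every waiting consumer */
        pthread_mutex_unlock(&mutex);
    }
    pthread_mutex_lock(&mutex);
    done = 1;
    pthread_cond_broadcast(&condition_var);
    pthread_mutex_unlock(&mutex);
    return NULL;
}

/* Returns the total number of items consumed, or -1 on underflow. */
long run_consumers(void)
{
    pthread_t c1, c2, p;
    long taken1 = 0, taken2 = 0;

    items = 0;
    done = 0;
    underflows = 0;
    pthread_create(&c1, NULL, consumer, &taken1);
    pthread_create(&c2, NULL, consumer, &taken2);
    pthread_create(&p, NULL, producer, NULL);
    pthread_join(p, NULL);
    pthread_join(c1, NULL);
    pthread_join(c2, NULL);
    return underflows ? -1 : taken1 + taken2;
}
```

With the inner while, the two consumers together take exactly NUM_ITEMS items. If that while were replaced by if, a consumer woken after the other one had already taken the item would decrement items below zero.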