[Ecls-list] [PATCH] Fix locks and condition variables.

Wed Sep 23 08:51:31 UTC 2009

On Wed, Sep 23, 2009 at 12:37 AM, Juan Jose Garcia-Ripoll
<juanjose.garciaripoll at googlemail.com> wrote:
> 2009/9/21 Alexander Gavrilov <angavrilov at gmail.com>:
>> 3) Waiting on a condition variable temporarily gives up the
>>   lock, and restores the ownership before returning. This
>>   should be reflected in the lock structure fields.
>>        Failure mode for 3:
>>         - Thread A waits on a condition (count left at 1)
>>         - Thread B acquires the lock (count = 2)
>>         - Thread B releases the lock (count = 1, holder not reset)
>>         - Thread B acquires the lock
>>           (doesn't actually, since get-lock thinks that it already has it)
>
> I committed as is, but I am still worried that a race condition may
> still happen between steps 1 and 2, if A marks the lock as released,
> but pthread_cond_wait() has not yet released it.

As far as I understand, there are only two properties of a lock that
are meaningful whether the current thread holds the underlying OS lock
or not:

(defun lock-mine-p (lock)
  (eql (lock-holder lock) *current-thread*))

(defun lock-count-mine (lock)
  (if (lock-mine-p lock) (lock-count lock) 0))

You can see that the code that I added to the condition wait functions
never changes these properties for other threads.

The current implementation of get-lock can be summarized as:

(defun get-lock (lock)
  (when (lock-mine-p lock)
    (incf (lock-count lock))
    (return-from get-lock t))
  (pthread:lock lock)
  (setf (lock-count lock) 1
        (lock-holder lock) *current-thread*)
  t)

Since it uses only safe predicates before calling the pthread library,
and a thread cannot have races with itself, its behavior won't be
affected. Of course, the threads should not be interrupted inside lock
managing functions.

>> 4) The lock counter is meaningless if the lock is not held
>>   by the current thread, so it is necessary to check both
>>   lock-count and lock-holder inside with-lock.
>>
>>        Failure mode for 4:
>>         - Thread A acquires the lock, count = 1
>>         - Thread B reads the count
>>         - Thread A releases the lock, count = 0
>>         - Thread B acquires the lock via with-lock, count = 1
>>         - with-lock does not release the lock
>
> Ok, this sounds reasonable. Also committed.

As an afterthought, it may be more elegant to implement the above
lock-count-mine function (maybe in C for efficiency), and use it
inside wait-lock.

Alexander