This seems to be a good compromise, using the underlying operating system for waiting and signaling and using a fast atomic path for detecting the lock-free case. First the simple mutex<div><div><br></div><div>cl_object</div>
<div>mp_get_lock_wait(cl_object lock)</div><div>{</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>if (ecl_atomic_queue_list(lock) != Cnil ||</div><div><span class="Apple-tab-span" style="white-space:pre"> </span> mp_get_lock_nowait(lock) == Cnil) {</div>
<div><span class="Apple-tab-span" style="white-space:pre"> </span>ecl_wait_on(get_lock_inner, lock);</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>}</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>@(return Ct)</div>
<div>}</div><div><br></div></div><div>Note that the first part of the conditional is what makes this a fair implementation: if there are others waiting, we respect their precedence. The nowait locking is very simple as well</div>
<div><br></div><div><div>static cl_object</div><div>get_lock_inner(cl_env_ptr env, cl_object lock)</div><div>{</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>cl_object output;</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>cl_object own_process = env->own_process;</div>
<div><span class="Apple-tab-span" style="white-space:pre"> </span>ecl_disable_interrupts_env(env);</div><div> if (AO_compare_and_swap_full((AO_t*)&(lock->lock.owner),</div><div><span class="Apple-tab-span" style="white-space:pre"> </span> (AO_t)Cnil, (AO_t)own_process)) {</div>
<div><span class="Apple-tab-span" style="white-space:pre"> </span>lock->lock.counter = 1;</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>output = Ct;</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>print_lock("acquiring\t", lock, lock);</div>
<div><span class="Apple-tab-span" style="white-space:pre"> </span>} else [...]</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>ecl_enable_interrupts_env(env);</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>return output;</div>
<div>}</div><div><br></div><div>Now comes the hard part, which is waiting for the lock (or condition variable or semaphore) to change. Apart from the polling version which is in CVS, I came up with this other version which relies on POSIX signals and seems to work reasonably fast -- at least as fast as the POSIX mutexes on the same platform.</div>
<div><br></div><div><div>void</div><div>ecl_wait_on(cl_object (*condition)(cl_env_ptr, cl_object), cl_object o)</div><div>{</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>const cl_env_ptr the_env = ecl_process_env();</div>
<div><span class="Apple-tab-span" style="white-space:pre"> </span>volatile cl_object own_process = the_env->own_process;</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>volatile cl_object record;</div>
<div><span class="Apple-tab-span" style="white-space:pre"> </span>volatile sigset_t original;</div><div><br></div><div><span class="Apple-tab-span" style="white-space:pre"> </span>/* 0) We reserve a record for the queue. In order to a void</div>
<div><span class="Apple-tab-span" style="white-space:pre"> </span> * using the garbage collector, we reuse records */</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>record = own_process->process.queue_record;</div>
<div><span class="Apple-tab-span" style="white-space:pre"> </span>unlikely_if (record == Cnil) {</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>record = ecl_list1(own_process);</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>} else {</div>
<div><span class="Apple-tab-span" style="white-space:pre"> </span>own_process->process.queue_record = Cnil;</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>}</div><div><br></div><div><span class="Apple-tab-span" style="white-space:pre"> </span>/* 1) First we block all signals. */</div>
<div><span class="Apple-tab-span" style="white-space:pre"> </span>{</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>sigset_t empty;</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>sigemptyset(&empty);</div>
<div><span class="Apple-tab-span" style="white-space:pre"> </span>pthread_sigmask(SIG_SETMASK, &original, &empty);</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>}</div><div><br></div><div>
<span class="Apple-tab-span" style="white-space:pre"> </span>/* 2) Now we add ourselves to the queue. In order to avoid a</div><div><span class="Apple-tab-span" style="white-space:pre"> </span> * call to the GC, we try to reuse records. */</div>
<div><span class="Apple-tab-span" style="white-space:pre"> </span>ecl_atomic_queue_nconc(the_env, o->lock.queue_list, record);</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>own_process->process.waiting_for = o;</div>
<div><br></div><div><span class="Apple-tab-span" style="white-space:pre"> </span>CL_UNWIND_PROTECT_BEGIN(the_env) {</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>/* 3) At this point we may receive signals, but we</div>
<div><span class="Apple-tab-span" style="white-space:pre"> </span> * might have missed a wakeup event if that happened</div><div><span class="Apple-tab-span" style="white-space:pre"> </span> * between 0) and 2), which is why we start with the</div>
<div><span class="Apple-tab-span" style="white-space:pre"> </span> * check*/</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>cl_object queue = ECL_CONS_CDR(o->lock.queue_list);</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>if (ECL_CONS_CAR(queue) != own_process ||</div>
<div><span class="Apple-tab-span" style="white-space:pre"> </span> condition(the_env, o) == Cnil)</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>{</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>do {</div>
<div><span class="Apple-tab-span" style="white-space:pre"> </span>/* This will wait until we get a signal that</div><div><span class="Apple-tab-span" style="white-space:pre"> </span> * demands some code being executed. Note that</div>
<div><span class="Apple-tab-span" style="white-space:pre"> </span> * this includes our communication signals and</div><div><span class="Apple-tab-span" style="white-space:pre"> </span> * the signals used by the GC. Note also that</div>
<div><span class="Apple-tab-span" style="white-space:pre"> </span> * as a consequence we might throw / return</div><div><span class="Apple-tab-span" style="white-space:pre"> </span> * which is why need to protect it all with</div>
<div><span class="Apple-tab-span" style="white-space:pre"> </span> * UNWIND-PROTECT. */</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>sigsuspend(&original);</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>} while (condition(the_env, o) == Cnil);</div>
<div><span class="Apple-tab-span" style="white-space:pre"> </span>}</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>} CL_UNWIND_PROTECT_EXIT {</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>/* 4) At this point we wrap up. We remove ourselves</div>
<div><span class="Apple-tab-span" style="white-space:pre"> </span> from the queue and restore signals, which were */</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>own_process->process.waiting_for = Cnil;</div>
<div><span class="Apple-tab-span" style="white-space:pre"> </span>ecl_atomic_queue_delete(the_env, o->lock.queue_list, own_process);</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>own_process->process.queue_record = record;</div>
<div><span class="Apple-tab-span" style="white-space:pre"> </span>ECL_RPLACD(record, Cnil);</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>pthread_sigmask(SIG_SETMASK, NULL, &original);</div>
<div><span class="Apple-tab-span" style="white-space:pre"> </span>} CL_UNWIND_PROTECT_END;</div><div>}</div><div><br></div><div>I am now working on small optimizations of this code and further testing. I believe the same code can be used for Windows, because there we use a type of signals that are not delivered until the code enters some specific functions. Thus we could get rid of pthread_sigmask() (which do not exist in Windows) and just replace sigsuspend() with a simple SleepEx().</div>
<div><br></div><div>Juanjo</div><div><br></div>-- <br>Instituto de Física Fundamental, CSIC<br>c/ Serrano, 113b, Madrid 28006 (Spain) <br><a href="http://juanjose.garciaripoll.googlepages.com" target="_blank">http://juanjose.garciaripoll.googlepages.com</a><br>
</div></div>