[Ecls-list] threading failures

James M. Lawrence llmjjmll at gmail.com
Tue Sep 1 15:15:30 UTC 2015


The version with a homemade semaphore is:

(defstruct sema
  (count 0)
  (lock (mp:make-lock :recursive nil))
  (cvar (mp:make-condition-variable)))

(defun inc-sema (sema)
  (mp:with-lock ((sema-lock sema))
    (incf (sema-count sema))
    (mp:condition-variable-signal (sema-cvar sema))))

(defun dec-sema (sema)
  (mp:with-lock ((sema-lock sema))
    (loop (cond ((plusp (sema-count sema))
                 (decf (sema-count sema))
                 (return))
                (t
                 (mp:condition-variable-wait
                  (sema-cvar sema) (sema-lock sema)))))))

(defun test (message-count worker-count)
  (let ((to-workers (make-sema))
        (from-workers (make-sema)))
    (loop repeat worker-count
          do (mp:process-run-function
              "test"
              (lambda ()
                (loop
                   (dec-sema to-workers)
                   (inc-sema from-workers)))))
    (loop
       (loop repeat message-count
             do (inc-sema to-workers))
       (loop repeat message-count
             do (dec-sema from-workers))
       (assert (zerop (sema-count to-workers)))
       (assert (zerop (sema-count from-workers)))
       (format t ".")
       (finish-output))))

(defun run ()
  (test 10000 64))

RUN fails with:

Condition of type: SIMPLE-ERROR
Attempted to recursively lock #<lock (nonrecursive) 0a4597f8> which is
already owned by #<process "test">

In the previous test case, by "hang" I meant that it hangs
indefinitely, as opposed to printing dots in spurts. Both of these
cases fail within seconds for me, sometimes immediately. The code
should be compiled before running. Increasing the number of threads
(the second argument to TEST) will typically cause a quicker failure
in these kinds of stress tests. This is on a 4-core machine:

Linux xi 3.2.0-24-generic-pae #39-Ubuntu SMP Mon May 21 18:54:21 UTC
2012 i686 i686 i386 GNU/Linux

(:NEW :LINUX :FORMATTER :ECL-WEAK-HASH :LITTLE-ENDIAN :ECL-READ-WRITE-LOCK
 :LONG-LONG :UINT64-T :UINT32-T :UINT16-T :RELATIVE-PACKAGE-NAMES :LONG-FLOAT
 :UNICODE :DFFI :CLOS-STREAMS :CMU-FORMAT :UNIX :ECL-PDE :DLOPEN :CLOS :THREADS
 :BOEHM-GC :ANSI-CL :COMMON-LISP :IEEE-FLOATING-POINT :PREFIXED-API :FFI
 :PENTIUM3 :COMMON :ECL)
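
The advice above about compiling the test and raising the thread count
can be sketched roughly as follows. This is only an illustration: the
file name "sema-test.lisp" and the helper RUN-MORE-WORKERS are
hypothetical and do not come from the original report, and the worker
count of 256 is an arbitrary choice.

```lisp
;; Assumption: the definitions above (SEMA, INC-SEMA, DEC-SEMA, TEST, RUN)
;; have been saved to a file named "sema-test.lisp" (hypothetical name).

;; COMPILE-FILE returns the pathname of the compiled output,
;; which LOAD then loads, so TEST runs as compiled code.
(load (compile-file "sema-test.lisp"))

;; Hypothetical helper: same message count as RUN, but with more
;; worker threads, which tends to trigger the failure sooner.
(defun run-more-workers (&optional (worker-count 256))
  (test 10000 worker-count))

;; Compile the helper as well, then start the stress test.
(compile 'run-more-workers)
(run-more-workers)
```

Like RUN, this never returns on its own: it prints dots until the
failure appears or the process is interrupted.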


On Tue, Sep 1, 2015 at 1:04 AM, Daniel Kochmański <daniel at turtleware.eu> wrote:
> Hello,
>
> that's probably my fault, sorry. I've migrated bugs manually and
> probably missed this one (I remember this bug, but can't find it
> anywhere).
>
> I'm adding it to the regression tests in the repository, thanks! Yes,
> old reports are unfortunately lost.
>
> As a side note, please use the ecl-devel at common-lisp.net mailing list – I'm
> closing the old one today. You can subscribe here
> https://mailman.common-lisp.net/listinfo/ecl-devel . All archives before
> 2015-08-10 are imported to the new one and gmane stream is redirected
> (if you use it).
>
> Regards,
> Daniel
>
> James M. Lawrence writes:
>
>> Hello, the threading bugs I reported a while ago appear not to have
>> survived the migration from sourceforge, and the old pages are now
>> 404'd. There were a number of test cases, including
>>
>> (defun test (message-count worker-count)
>>   (let ((to-workers (mp:make-semaphore))
>>         (from-workers (mp:make-semaphore)))
>>     (loop repeat worker-count
>>           do (mp:process-run-function
>>               "test"
>>               (lambda ()
>>                 (loop
>>                    (mp:wait-on-semaphore to-workers)
>>                    (mp:signal-semaphore from-workers)))))
>>     (loop
>>        (loop repeat message-count
>>              do (mp:signal-semaphore to-workers))
>>        (loop repeat message-count
>>              do (mp:wait-on-semaphore from-workers))
>>        (assert (zerop (mp:semaphore-count to-workers)))
>>        (assert (zerop (mp:semaphore-count from-workers)))
>>        (format t ".")
>>        (finish-output))))
>>
>> (defun run ()
>>   (test 10000 64))
>>
>> RUN will eventually hang on all versions of ECL I've tried, including
>> the latest. Another test case was a variant of the above using a
>> homemade semaphore. I can rewrite that and other test cases, but
>> before doing so I'd like to know whether the old reports are really
>> lost or have survived in some form.
>>
>> Best,
>> lmj
>>
>
> --
> Daniel Kochmański | Poznań, Poland
> ;; aka jackdaniel
>
> "Be the change that you wish to see in the world." - Mahatma Gandhi


