[Ecls-list] threading failures

James M. Lawrence llmjjmll at gmail.com
Wed Sep 2 18:27:14 UTC 2015

On Wed, Sep 2, 2015 at 6:09 AM, Daniel Kochmański <daniel at turtleware.eu> wrote:
> Hm, then I can't reproduce neither of them. Spawning too many threads
> blows the heap, but it's understandable. I think it might be that i have
> x86_64 and a new kernel, (maybe it happens only on x86, or linux 3.2 had
> some bug?).

64-bit systems are sometimes better at concealing such threading bugs.
The default assumption should be that 64-bit systems just haven't been
jiggled to the right tune yet. Over the course of several years, and
across several kernels, I haven't seen any version of ECL pass these
stress tests. This problem would need to be understood before even
thinking about kernel bugs, which seem quite unlikely in this case.

Moving up the ladder of complexity, the next test case requires a
blocking queue and involves repeatedly creating/destroying worker
threads. It has three different types of failures; they are sporadic
and not determined by the inputs. The following are just example runs
and shouldn't be considered representative:

(qtest 0 64) ;=> segfault
(qtest 1 64) ; => hang
(qtest 10000 64) ; => error "Attempted to recursively lock..."

;;;; raw-queue

(defstruct (raw-queue (:conc-name nil))
  (head nil)
  (tail nil))

(defun push-raw-queue (value queue)
  (let ((new (cons value nil)))
    (if (head queue)
        (setf (cdr (tail queue)) new)
        (setf (head queue) new))
    (setf (tail queue) new)))

(defun pop-raw-queue (queue)
  (let ((node (head queue)))
    (if node
        (multiple-value-prog1 (values (car node) t)
          (when (null (setf (head queue) (cdr node)))
            (setf (tail queue) nil))
          (setf (car node) nil
                (cdr node) nil))
        (values nil nil))))

;;;; queue

(defstruct queue
  (impl (make-raw-queue))
  (lock (mp:make-lock))
  (cvar (mp:make-condition-variable)))

(defun push-queue (object queue)
  (mp:with-lock ((queue-lock queue))
    (push-raw-queue object (queue-impl queue))
    (mp:condition-variable-signal (queue-cvar queue))))

(defun pop-queue (queue)
  (mp:with-lock ((queue-lock queue))
    (loop (multiple-value-bind (value presentp)
              (pop-raw-queue (queue-impl queue))
            (if presentp
                (return value)
                 (queue-cvar queue)
                 (queue-lock queue)))))))

;;;; qtest

(defun qtest (message-count worker-count)
  (loop (let ((to-workers (make-queue))
              (from-workers (make-queue)))
          (loop repeat worker-count
                do (mp:process-run-function
                    (lambda ()
                      (loop (let ((message (pop-queue to-workers)))
                              (push-queue message from-workers)
                              (unless message (return)))))))
          (loop repeat message-count do (push-queue t to-workers))
          (loop repeat message-count do (pop-queue from-workers))
          (loop repeat worker-count do (push-queue nil to-workers))
          (loop repeat worker-count do (pop-queue from-workers))
          (format t ".")

More information about the ecl-devel mailing list