[Ecls-list] Locking strategy (best so far)

Matthew Mondor mm_lists at pulsar-zone.net
Sat Mar 31 10:13:00 UTC 2012


On Sat, 31 Mar 2012 11:31:21 +0200
Juan Jose Garcia-Ripoll <juanjose.garciaripoll at googlemail.com> wrote:

> On Fri, Mar 30, 2012 at 3:06 AM, Matthew Mondor <mm_lists at pulsar-zone.net> wrote:
> 
> > With the new code (with my FILE-buffering changes or without it), the
> > test httpd reliably locks after load for me, then if I break it in
> > slime the process exits.  I'll try to come up with more details about
> > this soon.
> >
> 
> There were some problems with the interrupt handler -- I made some
> simplifications but in the process another issue popped up: the main
> process was never marked as active. I do not know whether this is the cause
> of the problem, but if it persists, please tell me the steps to reproduce.

Sorry that I couldn't perform other tests yet.  However, I just now
rebuilt the latest ECL HEAD and tested it again, and the same thing
happens.

It appears to lock up before all of the HTTPd's initial threads can be
started.  The REPL never comes back, and the main thread appears to wait
forever in pthread_cond_wait(3) (reached via sem_wait) without ever
being woken.  This happens inside the GC, which is invoked at some point
during HTTPD-INIT (which runs in the first/main thread):

#0  0x00007f7ff6a76eda in ___lwp_park50 () from /usr/lib/libc.so.12
#1  0x00007f7ff72088f4 in pthread_cond_timedwait (abstime=0x0, mutex=0x7f7ff7b010e8, cond=0x7f7ff7b01118) at /usr/src/lib/libpthread/pthread_cond.c:148
#2  pthread_cond_wait (cond=0x7f7ff7b01118, mutex=0x7f7ff7b010e8) at /usr/src/lib/libpthread/pthread_cond.c:193
#3  0x00007f7ff7205cc1 in sem_wait (sem=0x7f7ff60408f0) at /usr/src/lib/libpthread/sem.c:323
#4  0x00007f7ff5e187d8 in GC_stop_world () from /usr/pkg/lib/libgc.so.1
#5  0x00007f7ff5e0bbf0 in GC_stopped_mark () from /usr/pkg/lib/libgc.so.1
#6  0x00007f7ff5e0c2ea in GC_try_to_collect_inner () from /usr/pkg/lib/libgc.so.1
#7  0x00007f7ff5e0cc20 in GC_collect_or_expand () from /usr/pkg/lib/libgc.so.1
#8  0x00007f7ff5e1088c in GC_alloc_large () from /usr/pkg/lib/libgc.so.1
#9  0x00007f7ff5e11207 in GC_generic_malloc_ignore_off_page () from /usr/pkg/lib/libgc.so.1
#10 0x00007f7ff777261f in ecl_alloc_atomic (n=4491264) at /home/mmondor/work/ecl-git/ecl/src/c/alloc_2.d:712
#11 0x00007f7ff771c939 in ecl_stack_set_size (env=0x7f7ff7ed7000, tentative_new_size=<optimized out>) at /home/mmondor/work/ecl-git/ecl/src/c/interpreter.d:42
#12 0x00007f7ff7656f66 in ecl_init_env (env=0x7f7ff7ed7000) at /home/mmondor/work/ecl-git/ecl/src/c/main.d:141
#13 0x00007f7ff777088a in mp_process_enable (process=0x3c82bd0) at /home/mmondor/work/ecl-git/ecl/src/c/threads/process.d:477
#14 0x00007f7ff7770cbd in mp_process_run_function (narg=<optimized out>, name=<optimized out>, function=0x3c43f80)
    at /home/mmondor/work/ecl-git/ecl/src/c/threads/process.d:614
#15 0x00007f7fec019d17 in L35server_init (narg=1) at ecl-mp-server.c:2821
#16 0x00007f7fec4084c9 in L109httpd_init () at test-httpd.c:12904
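
For context, frames #3 and #4 above are consistent with a signal-based
stop-the-world handshake: the collecting thread signals the other
threads and then calls sem_wait() once per thread, until each one has
acknowledged from its signal handler.  The C sketch below illustrates
that general pattern only.  It is not the libgc source, and SIG_SUSPEND,
stop_world_sketch() and the other names in it are hypothetical.  It also
assumes POSIX unnamed semaphores are available.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <signal.h>
#include <stdatomic.h>
#include <string.h>
#include <time.h>

#define SIG_SUSPEND SIGUSR1      /* hypothetical stand-in for the collector's signal */
enum { MAX_THREADS = 16 };

static sem_t ack_sem;            /* posted once by each signalled thread */
static sem_t ready_sem;          /* posted once by each worker at startup */
static atomic_int stop_workers;  /* set to 1 to let the workers exit */

/* Runs in the signalled thread; sem_post() is async-signal-safe. */
static void suspend_handler(int sig)
{
    int saved_errno = errno;
    (void)sig;
    sem_post(&ack_sem);
    /* A real collector would park the thread here until it is restarted. */
    errno = saved_errno;
}

static void *worker(void *arg)
{
    struct timespec ts = { 0, 1000000 };   /* 1 ms */
    (void)arg;
    sem_post(&ready_sem);
    while (!atomic_load(&stop_workers))
        nanosleep(&ts, NULL);              /* may return early with EINTR */
    return NULL;
}

/* sem_wait() that retries when interrupted by a signal. */
static void wait_sem(sem_t *s)
{
    while (sem_wait(s) == -1 && errno == EINTR)
        ;
}

/* Signals nthreads workers and returns the number of acks collected. */
int stop_world_sketch(int nthreads)
{
    pthread_t tids[MAX_THREADS];
    struct sigaction sa;
    int i, rc, acks = 0;

    if (nthreads > MAX_THREADS)
        nthreads = MAX_THREADS;
    sem_init(&ack_sem, 0, 0);
    sem_init(&ready_sem, 0, 0);
    atomic_store(&stop_workers, 0);

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = suspend_handler;
    sigemptyset(&sa.sa_mask);
    rc = sigaction(SIG_SUSPEND, &sa, NULL);
    assert(rc == 0);

    for (i = 0; i < nthreads; i++) {
        rc = pthread_create(&tids[i], NULL, worker, NULL);
        assert(rc == 0);
    }
    for (i = 0; i < nthreads; i++)
        wait_sem(&ready_sem);              /* every worker is running */

    for (i = 0; i < nthreads; i++)
        pthread_kill(tids[i], SIG_SUSPEND);
    for (i = 0; i < nthreads; i++) {
        wait_sem(&ack_sem);                /* blocks forever if an ack is lost */
        acks++;
    }

    atomic_store(&stop_workers, 1);
    for (i = 0; i < nthreads; i++)
        pthread_join(tids[i], NULL);
    sem_destroy(&ack_sem);
    sem_destroy(&ready_sem);
    (void)rc;
    return acks;
}
```

Built with "cc -pthread" and called from a main(), stop_world_sketch(4)
returns once all four workers have acknowledged.  If even one signalled
thread never reaches its handler, the second wait_sem() loop never
finishes, which would look like the sem_wait frame in the trace above.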

As for the other threads that did start successfully, there are 9 of
them.  8 are waiting in sigsuspend(2) after unsuccessfully trying to
obtain the global accept-lock (the first thread had already acquired
it):

#0  0x00007f7ff6a38eca in _sys___sigsuspend14 () from /usr/lib/libc.so.12
#1  0x00007f7ff7206732 in __sigsuspend14 (sigmask=<optimized out>) at /usr/src/lib/libpthread/pthread_cancelstub.c:567
#2  0x00007f7ff777f341 in ecl_wait_on (env=0x7f7ff7ed8000, condition=0x7f7ff777105a <get_lock_inner>, o=0x28b5f90)
    at /home/mmondor/work/ecl-git/ecl/src/c/threads/queue.d:231
#3  0x00007f7ff7771497 in mp_get_lock_wait (lock=0x28b5f90) at /home/mmondor/work/ecl-git/ecl/src/c/threads/mutex.d:165
#4  0x00007f7ff7771528 in mp_get_lock (narg=<optimized out>, lock=<optimized out>) at /home/mmondor/work/ecl-git/ecl/src/c/threads/mutex.d:174
#5  0x00007f7fec01d622 in L57accept_loop_thread (V1=0x3c0e510) at ecl-mp-server.c:6174
#6  0x00007f7ff771c67d in cl_apply (narg=<optimized out>, fun=0x3c43f80, lastarg=<optimized out>) at /home/mmondor/work/ecl-git/ecl/src/c/eval.d:166
#7  0x00007f7ff7770147 in thread_entry_point (arg=0x3c82c40) at /home/mmondor/work/ecl-git/ecl/src/c/threads/process.d:253
#8  0x00007f7ff5e17fb5 in GC_inner_start_routine () from /usr/pkg/lib/libgc.so.1
#9  0x00007f7ff5e14b60 in GC_call_with_stack_base () from /usr/pkg/lib/libgc.so.1
#10 0x00007f7ff7209dbd in pthread__create_tramp (cookie=0x7f7fe7800000) at /usr/src/lib/libpthread/pthread.c:492
#11 0x00007f7ff6a76ef0 in ___lwp_park50 () from /usr/lib/libc.so.12

As for the first thread that acquired the global accept-lock, it seems
to have been waiting in accept(2) as expected, until entering gdb
caused a trap:

#0  0x00007f7ff6a38eca in _sys___sigsuspend14 () from /usr/lib/libc.so.12
#1  0x00007f7ff7206732 in __sigsuspend14 (sigmask=<optimized out>) at /usr/src/lib/libpthread/pthread_cancelstub.c:567
#2  0x00007f7ff5e184be in GC_suspend_handler_inner () from /usr/pkg/lib/libgc.so.1
#3  0x00007f7ff5e18529 in GC_suspend_handler () from /usr/pkg/lib/libgc.so.1
#4  <signal handler called>
#5  0x00007f7ff6a392e8 in accept () from /usr/lib/libc.so.12
#6  0x00007f7ff72061fa in accept (s=<optimized out>, addr=<optimized out>, addrlen=<optimized out>) at /usr/src/lib/libpthread/pthread_cancelstub.c:143
#7  0x00007f7ff3407f36 in LC22socket_accept (V1=0x3a98000) at ext/sockets.c:819
#8  0x00007f7ff771c7a5 in cl_apply (narg=2, fun=0x3329700, lastarg=0x7f7fe9ffea80) at /home/mmondor/work/ecl-git/ecl/src/c/eval.d:141
#9  0x00007f7ff76ced3d in LC2__g4 (narg=<optimized out>, V1=<optimized out>, V2=0x1) at clos/combin.c:130
#10 0x00007f7ff76cebbe in LC4__g5 (narg=<optimized out>, V1=0x7f7fe9ffea80, V2=<optimized out>) at clos/combin.c:164
#11 0x00007f7ff7726996 in _ecl_standard_dispatch (frame=0x7f7fe9ffea80, gf=0x332b3c0) at /home/mmondor/work/ecl-git/ecl/src/c/gfun.d:200
#12 0x00007f7ff7726ad3 in generic_function_dispatch_vararg (narg=<optimized out>) at /home/mmondor/work/ecl-git/ecl/src/c/gfun.d:214
#13 0x00007f7fec01d67e in L57accept_loop_thread (V1=0x3aa8ea0) at ecl-mp-server.c:6175
#14 0x00007f7ff771c67d in cl_apply (narg=<optimized out>, fun=0x3c43f80, lastarg=<optimized out>) at /home/mmondor/work/ecl-git/ecl/src/c/eval.d:166
#15 0x00007f7ff7770147 in thread_entry_point (arg=0x3c82d90) at /home/mmondor/work/ecl-git/ecl/src/c/threads/process.d:253
#16 0x00007f7ff5e17fb5 in GC_inner_start_routine () from /usr/pkg/lib/libgc.so.1
#17 0x00007f7ff5e14b60 in GC_call_with_stack_base () from /usr/pkg/lib/libgc.so.1
#18 0x00007f7ff7209dbd in pthread__create_tramp (cookie=0x7f7fe9800000) at /usr/src/lib/libpthread/pthread.c:492
#19 0x00007f7ff6a76ef0 in ___lwp_park50 () from /usr/lib/libc.so.12

Circumstances:

Very recent netbsd-6/amd64 with ECL built with options:

export CFLAGS="-O2 -g"
export LDFLAGS='-g'
./configure --prefix=/usr/local/ecl --enable-unicode --enable-threads --with-__thread=no --enable-rpath --with-system-boehm=yes --with-system-gmp=yes --with-gmp-prefix=/usr/pkg --with-dffi=system

System libraries (built from pkgsrc-2011Q4):
System GC:     boehm-gc-7.1nb4
System GMP:    gmp-5.0.2nb1
System libffi: libffi-3.0.9nb1

I should also test this on Linux, and will report on that soon.
-- 
Matt



