What is the best version of ECL to use right now?

Dennis Ogbe do at ogbe.net
Mon Jun 10 17:15:59 UTC 2019


Hello Daniel,

thanks for taking the time to look at this. After taking a deep dive debugging and finally being able to reproduce the problem, I quickly realized that I had caused all this havoc by mixing C++ and C semantics all in the pursuit of trying to save myself from writing one more line of code...

Long story short, I was piecing together cl_objects from other cl_objects which had already been "destroyed" by a transient std::vector I was using somewhere. Any code that worked with these objects worked most of the time (calling free(...) does not mean the things you are freeing are immediately gone from memory...), but crashed every once in a while since it was working with freed memory.
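
For illustration only, here is a minimal sketch of this class of bug and one way around it. It is not the actual code from my program: Handle is a hypothetical stand-in for an opaque handle such as cl_object, and all function names are made up for the example. The dangling variant is shown only as a comment, because running it would be undefined behavior.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for an opaque handle such as cl_object.
// This is NOT ECL's real type; it only makes the sketch self-contained.
struct Handle {
    int value;
};

// Buggy pattern, shown as a comment because executing it is undefined behavior:
//
//   const Handle* first_handle_dangling() {
//       std::vector<Handle> tmp{Handle{1}, Handle{2}, Handle{3}};
//       return &tmp[0];  // tmp's storage is released when the function returns
//   }
//
// Code using the returned pointer may appear to work for a while, because the
// freed memory often still holds the old bytes, and then crash sporadically.

// Safer pattern 1: copy the element out by value while the vector is alive.
Handle first_handle_copy() {
    std::vector<Handle> tmp{Handle{1}, Handle{2}, Handle{3}};
    return tmp[0];  // the copy outlives tmp
}

// Safer pattern 2: hand ownership of the whole container to the caller,
// so the elements live exactly as long as the caller keeps the vector.
std::vector<Handle> make_handles() {
    return std::vector<Handle>{Handle{1}, Handle{2}, Handle{3}};
}

// Works on a vector that the caller still owns; no pointer escapes.
int sum_values(const std::vector<Handle>& handles) {
    int total = 0;
    for (const Handle& h : handles) {
        total += h.value;
    }
    return total;
}
```

Calling sum_values(make_handles()) is safe because the temporary vector lives until the end of the full expression. Storing a pointer to one of its elements past that point would reintroduce the problem.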

Since I fixed my main problem, I haven't had the time to investigate the issue with the single-threaded builds I have. Maybe it'll come up some other day; it seems to be working just fine outside of my program.

Anyways, thanks for your help!

Dennis

Daniel Kochmański <daniel at turtleware.eu> writes:

>> [...] there are multiple more calls to Lxxstore_object() methods below this
>>
>> I am having problems debugging this because I highly doubt that the generic function dispatch mechanism is broken (otherwise *nothing ever* would work, right?). So I think something else is causing this confusion in fill_spec_vector.
>
> It is hard to tell anything without a reproducible test case I could use. Please replace the if/else in fill_spec_vector with:
>
> <<<EOF
>
>     if (ECL_LISTP(spec_type) &&
>         !Null(eql_spec = ecl_memql(args[spec_position], spec_type))) {
>       argtype[spec_no++] = eql_spec;
>     } else {
>       printf("XXX: args: %p, spec-pos: %d, args[sp]: %p\n", args, spec_position, args[spec_position]);
>       printf("XXX: printing argument\n");
>       ecl_print(args[spec_position], ECL_T);
>       ecl_terpri(ECL_T);
>       printf("XXX: printing argument type\n");
>       ecl_print(ecl_type_to_symbol(ecl_t_of(args[spec_position])), ECL_T);
>       ecl_terpri(ECL_T);
>       printf("XXX: debug information done\n");
>       argtype[spec_no++] = cl_class_of(args[spec_position]);
>     }
>
> EOF
>
> It could be that the dispatch mechanism misses one particular type, or that you have a dangling pointer; I wouldn't be so sure that everything works correctly. Please compile ECL with this debug information and, when you reproduce the issue, send the console output that precedes the error. Note that this may crash before reaching argtype[spec_no++] (because we dereference some pointers in the meantime). If it is too verbose, comment out the 'printing argument' part; it may be a big array or something.
>
>> I've compiled it with only the --disable-threads flag now and I still get the same crash in the call to GC_init() in cl_boot(). However, starting the ECL interpreter works fine and embedding ECL into a single-threaded, small example program also works.
>
> Regarding working with threads enabled: the ECL environment must be imported on each "C++ world" thread (see the examples for how to do that). That is not necessary with a single-threaded ECL build.
>
> Regarding GC_init: are you certain you do not call it twice for some reason? Or that cl_boot is not called twice? I vaguely remember someone had a similar problem and it was due to calling GC_init separately before cl_boot (or immediately after).
>>
>> Could it be that I am missing something when trying to embed ECL in a large C++ codebase? Do I have to worry about the Boehm GC not functioning when most of the program is not designed to use GC_MALLOC? I am also statically linking my lisp code, would that make a difference here?
>
> No, bdwgc should work fine with code which is not libgc aware. You may want to try using libgc shipped with your system. I don't know what your OS is, but OpenBSD has some heavy restrictions for what you can do with memory.
>>
>
>
> Regards,
> Daniel


