Hello Daniel,

thanks for taking the time to look at this. After taking a deep dive debugging 
and finally being able to reproduce the problem, I quickly realized that I had 
caused all this havoc by mixing C++ and C semantics all in the pursuit of 
trying to save myself from writing one more line of code...

Long story short, I was somehow piecing together cl_objects from other 
cl_objects which had already been "destroyed" by a transient std::vector I was 
using somewhere. Any code that worked with these objects worked most of the 
time (calling free(...) does not mean the things you are free-ing are not in 
memory anymore...), but crashed every once in a while since it was working with 
free'd memory.
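
For the archive, here is a minimal sketch of the kind of mistake described 
above. It does not use ECL at all: Node, Owner, node_new, node_free, is_live 
and leak_raw_pointer are hypothetical names made up for illustration, and the 
"live" registry only exists so the example can report a dangling pointer 
without ever dereferencing freed memory. My real code was more convoluted, so 
treat this as an approximation of the pattern, not a reproduction.

```cpp
#include <cassert>
#include <cstdlib>
#include <set>
#include <vector>

// Hypothetical stand-in for a C-style heap object (not ECL's cl_object).
struct Node { int value; };

// Registry of live allocations. It lets the sketch detect a dangling
// pointer by address only, without touching freed memory.
static std::set<const void*> live;

static Node* node_new(int v) {
    Node* n = static_cast<Node*>(std::malloc(sizeof(Node)));
    if (n == nullptr) std::abort();
    n->value = v;
    live.insert(n);
    return n;
}

static bool is_live(const Node* n) { return live.count(n) != 0; }

static void node_free(Node* n) {
    assert(is_live(n));  // catches a double free in this sketch
    live.erase(static_cast<const void*>(n));
    std::free(n);
}

// C++-style owner: frees the node when the owner is destroyed.
class Owner {
public:
    explicit Owner(int v) : node_(node_new(v)) {}
    ~Owner() { if (node_ != nullptr) node_free(node_); }
    Owner(Owner&& other) noexcept : node_(other.node_) { other.node_ = nullptr; }
    Owner(const Owner&) = delete;
    Node* raw() const { return node_; }  // C-style escape hatch
private:
    Node* node_;
};

// Hands out a raw pointer taken from a transient vector of owners.
// The vector is destroyed on return, which frees the node, so the
// caller receives a pointer to memory that is no longer owned.
static Node* leak_raw_pointer() {
    std::vector<Owner> transient;
    transient.emplace_back(42);
    return transient.back().raw();
}
```

Calling is_live(leak_raw_pointer()) returns false here, even though reading 
the value through that pointer would often still "work" in practice, which 
is exactly why the crashes were so sporadic.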

Since I fixed my main problem I haven't had the time to investigate the issue 
with the single-threaded builds I have. Maybe it'll come up some other day; 
for now it seems to be working just fine outside of my program.

Anyways, thanks for your help!

Dennis

Daniel Kochmański <dan...@turtleware.eu> writes:

>> [...] there are multiple more calls to Lxxstore_object() methods below this
>>
>> I am having problems debugging this because I highly doubt that the generic 
>> function dispatch mechanism is broken (otherwise *nothing ever* would work, 
>> right?) So I think something else is causing this confusion in 
>> fill_spec_vector.
>
> It is hard to tell anything without a reproducible test case I could use. 
> Please replace the if/else in fill_spec_vector with:
>
> <<<EOF
>
>     if (ECL_LISTP(spec_type) &&
>         !Null(eql_spec = ecl_memql(args[spec_position], spec_type))) {
>       argtype[spec_no++] = eql_spec;
>     } else {
>       printf("XXX: args: %p, spec-pos: %d, args[sp]: %p\n",
>              args, spec_position, args[spec_position]);
>       printf("XXX: printing argument\n");
>       ecl_print(args[spec_position], ECL_T);
>       ecl_terpri(ECL_T);
>       printf("XXX: printing argument type\n");
>       ecl_print(ecl_type_to_symbol(ecl_t_of(args[spec_position])), ECL_T);
>       ecl_terpri(ECL_T);
>       printf("XXX: debug information done\n");
>       argtype[spec_no++] = cl_class_of(args[spec_position]);
>     }
>
> EOF
>
> It could be that the dispatch mechanism misses one particular type, or that 
> you have a dangling pointer; I wouldn't be so sure that everything works 
> correctly. Please compile ECL with this debug information and, when you 
> reproduce the issue, send the console output before the error. Note that 
> this may crash before reaching argtype[spec_no++], because we dereference 
> some pointers in the meantime. If it is too verbose, comment out the 
> 'printing argument' part; it may be a big array or something.
>
>> I've compiled it with only the --disable-threads flag now and I still get 
>> the same crash in the call to GC_init() in cl_boot(). However, starting the 
>> ECL interpreter works fine and embedding ECL into a single-threaded, small 
>> example program also works.
>
> Regarding working with threads enabled: the ECL environment must be imported 
> on each "C++ world" thread (see examples for how to do that). That is not 
> necessary with a single-threaded ECL build.
>
> Regarding GC_init – are you certain you do not call it twice for some reason? 
> Or that cl_boot is not called twice? I vaguely remember someone had a similar 
> problem and it was due to calling GC_init separately before cl_boot (or 
> immediately after).
>>
>> Could it be that I am missing something when trying to embed ECL in a large 
>> C++ codebase? Do I have to worry about the Boehm GC not functioning when 
>> most of the program is not designed to use GC_MALLOC? I am also statically 
>> linking my lisp code, would that make a difference here?
>
> No, bdwgc should work fine with code which is not libgc aware. You may want 
> to try using libgc shipped with your system. I don't know what your OS is, 
> but OpenBSD has some heavy restrictions for what you can do with memory.
>>
>
>
> Regards,
> Daniel
