On Fri, Oct 16, 2015 at 4:39 PM, Ludovic Courtès <l...@gnu.org> wrote: > I’m reporting the problem and (hopefully) the solution, but I think we’d > better double-check this. > > The problem: Running the test below in a loop sometimes gets a SIGSEGV > in the child process (on x86_64, libc 2.22.) > > --8<---------------cut here---------------start------------->8--- > (use-modules (guix build syscalls) (ice-9 match)) > > (match (clone (logior CLONE_NEWUSER > CLONE_CHILD_SETTID > CLONE_CHILD_CLEARTID > SIGCHLD)) > (0 > (throw 'x)) ;XXX: sometimes segfaults > (pid > (match (waitpid pid) > ((_ . status) > (pk 'status status) > (exit (not (status:term-sig status))))))) > --8<---------------cut here---------------end--------------->8--- > > Looking at (guix build syscalls) though, I see an ABI mismatch between > our definition and the actual ‘syscall’ C function, and between our > ‘clone’ definition and the actual C function. > > This leads to the attached patch, which also fixes the above problem for me. > > Could you test this patch?
The patch looks good. Thanks for catching this! > Now, there remains the question of CLONE_CHILD_SETTID and > CLONE_CHILD_CLEARTID. Since we’re passing NULL for ‘ctid’, I expect > that these flags have no effect at all. I added those flags in commit ee78d02 because they solved a real issue I ran into. Adding those flags made 'clone' look like a 'primitive-fork' call when examined with strace. > Conversely, libc uses these flags to update the thread ID in the child > process (x86_64/arch-fork.h): > > --8<---------------cut here---------------start------------->8--- > #define ARCH_FORK() \ > INLINE_SYSCALL (clone, 4, > \ > CLONE_CHILD_SETTID | CLONE_CHILD_CLEARTID | SIGCHLD, 0, > \ > NULL, &THREAD_SELF->tid) > --8<---------------cut here---------------end--------------->8--- > > This is certainly useful, but we’d have troubles doing it from the FFI… > It may that this is fine if the process doesn’t use threads. Right, so here's what 'primitive-fork' does: clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fc5398cea10) = 13247 Here's what 'clone' does: clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0) = 14038 In practice it may not be a problem since most of the time you'd 'exec' after cloning. Is there any reliable way to get a hold of whatever THREAD_SELF is? I wish the libc 'clone' function didn't have that silly callback and behaved like 'fork', then we could have avoided these issues altogether. - Dave