Hi,

> Basically my program looks like this:
> 
> static volatile sig_atomic_t child_terminated=0;
> 
> void sigchld_handler(int sig) {
>     int copy_errno=errno;
>     debug("Received SIGCHLD");
>     child_terminated=1;
>     signal(SIGCHLD,sigchld_handler);
>     errno=copy_errno;
> }
> 
> int main() {
>     signal(SIGCHLD,sigchld_handler);
>     for(;;) {
>         /* do some heavy weight stuff */
>         /* check for child_terminated and perform waitpid */
>     }
> }

You may also use sigaction() instead of signal(). It is more precise,
mainly because of the sa_flags and sa_mask fields that you can set in
struct sigaction.
It is not necessary to reinstall the handler (the
signal(SIGCHLD,sigchld_handler); call in your handler code): with
sigaction(), unless you set the SA_RESETHAND flag (SA_ONESHOT is its
older name), the signal disposition is left untouched after delivery.
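
A minimal sketch of the sigaction() variant, in case it helps. The
helper name install_sigchld_handler and the choice of
SA_RESTART | SA_NOCLDSTOP are assumptions for illustration, not taken
from your program:

```c
#define _POSIX_C_SOURCE 200809L  /* ask for the POSIX sigaction() API */
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t child_terminated = 0;

/* The handler only records the event; it does not reinstall itself. */
static void sigchld_handler(int sig)
{
    (void)sig;
    child_terminated = 1;
}

/* Hypothetical helper: install the handler once.
 * Returns 0 on success, -1 on error (errno set by sigaction()). */
static int install_sigchld_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = sigchld_handler;
    sigemptyset(&sa.sa_mask);   /* block no additional signals in the handler */
    /* Assumed flags: restart interrupted system calls, and ignore
     * children that merely stop. SA_RESETHAND is deliberately not set,
     * so the disposition stays in place after each delivery. */
    sa.sa_flags = SA_RESTART | SA_NOCLDSTOP;

    return sigaction(SIGCHLD, &sa, NULL);
}
```

Calling install_sigchld_handler() once at the start of main() would
replace both signal() calls.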

Anyway, I do agree with you: most likely the debug() call is the cause
of your trouble, as nothing else in your handler can produce it.

The program I wrote years ago that showed the same symptom you are
experiencing is a massive forker, and as such has lots of children
running concurrently. It worked fine for years, then the troubles
started ...
The first ones I saw were due to the pthread dynamic library being
pulled in at run time (see ldd(1)). I had to change some code to avoid
functions that call other functions (that call ...) that finally
needed pthread.
Some time later, I got SIGSEGV when using sprintf(3), and finally hit
the FUTEX_WAIT syndrome.
As far as I remember (beware, this was some years ago), it began with a
change of the libc shipped with the Linux distro. So I suppose the
cause lies mainly in some user-space implementation detail.
Anyway, the fact is that removing fprintf()-like calls from the handler
solved all my troubles.
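
A minimal sketch of that idea, assuming the logging can move out of the
handler and into the main loop. reap_children is a hypothetical helper
name, and the message format is only an example:

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>

static volatile sig_atomic_t child_terminated = 0;

/* No debug(), fprintf() or syslog() here: the handler only sets a flag. */
static void sigchld_handler(int sig)
{
    (void)sig;
    child_terminated = 1;
}

/* Hypothetical helper, called from the main loop (never from the
 * handler). Returns the number of children reaped by this call. */
static int reap_children(void)
{
    int reaped = 0;
    int status;
    pid_t pid;

    if (!child_terminated)
        return 0;
    /* Clear the flag first, so a SIGCHLD arriving while we loop sets it
     * again and is picked up by the next call. */
    child_terminated = 0;

    /* One SIGCHLD may stand for several exited children, so loop until
     * waitpid() reports nothing more to collect. */
    while ((pid = waitpid(-1, &status, WNOHANG)) > 0) {
        /* Logging is fine here: we are in normal program context. */
        fprintf(stderr, "child %ld terminated (raw status %d)\n",
                (long)pid, status);
        reaped++;
    }
    return reaped;
}
```

The for(;;) loop would then call reap_children() on each iteration,
after the handler has been installed for SIGCHLD.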

You can change my old survfutex to peek the data and decrement it
before poking it back. Then you will know whether poking a "1" has any
effect.
If you do this, please let me know whether it worked for you too.

Regards.

-Rogers

> 
> Maybe the debug call is the reason. It sends the string to the local
> syslog daemon, using sockets and therefore a bunch of system calls. When I
> look at the strace output, I see the arrival of SIGCHLD and the futex call
> directly after it.
> When the futex call is performed, the 3rd argument is a "2". I verified it
> using PTRACE_PEEKDATA on the 1st argument (which is the address of the futex
> value). It really is a "2". Do you know what this "2" exactly means? Does it
> mean one process holds a resource and another one is now suspended? What
> happens if I write a "1" to the futex address?
> 
> Thanks a lot for answers ;-)
>

-- 
Futex hang when exiting using the window close button
https://bugs.launchpad.net/bugs/57731
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
