I'm seeing an infinite loop that can be traced to a signal handler in the uthread module. I'm using a snapshot of CURRENT from 2002-01-09.
Repro: Write the classic hello world program and link it against libc_r. Then use a shell loop to run it over and over. This works on my box (using zsh):

  # echo 'main() { printf("Hello World!\\n"); }' > hello.c
  # gcc -o hello hello.c -lc_r
  # while [ 1 ]; do ./hello; done

Then hold down Ctrl-T at the console. Within a few seconds, the "Hello World!" lines stop being printed and CPU usage climbs to around 98%.

At that point, you can attach a debugger and see that the SIGINFO was caught by _thread_sig_handler(). You can also see that _thread_init() had not finished yet when the signal was raised. Most of the stack doesn't look correct to me, but I think that _thread_dump_info() gets called, which calls snprintf(3), which has a helper that calls _thread_init() again. Somewhere inside this nested _thread_init() call, the process might end up spinning on a spin lock that it already holds.

Ha ha! Hello World gets into an infinite loop! Obviously this bug can be reproduced with any program that uses the uthread module.

My own tests show that the attached patch to _thread_init() fixes the problem. I just moved the registration of the signal handler to a spot _after_ the point where the data used by the handler has been initialized.

I don't know what the repercussions are of messing with this part of the thread library. Does this patch look safe to anybody else? (I'm not suggesting it get committed, but I would like to know what might go wrong if I use it on my own source.) Do you know what the correct fix is? (I know 4.3-STABLE didn't have this bug, and the registration of the signal handler hasn't changed since then.)

-- chad
--- src/lib/libc_r/uthread/uthread_init.c.orig	Mon Nov  4 17:21:24 2002
+++ src/lib/libc_r/uthread/uthread_init.c	Tue Nov  5 10:59:49 2002
@@ -349,6 +349,59 @@
 		TAILQ_INSERT_HEAD(&_thread_list, _thread_initial, tle);
 		_set_curthread(_thread_initial);
 
+		/* Get the kernel clockrate: */
+		mib[0] = CTL_KERN;
+		mib[1] = KERN_CLOCKRATE;
+		len = sizeof (struct clockinfo);
+		if (sysctl(mib, 2, &clockinfo, &len, NULL, 0) == 0)
+			_clock_res_usec = clockinfo.tick > CLOCK_RES_USEC_MIN ?
+			    clockinfo.tick : CLOCK_RES_USEC_MIN;
+
+		/* Get the table size: */
+		if ((_thread_dtablesize = getdtablesize()) < 0) {
+			/*
+			 * Cannot get the system defined table size, so abort
+			 * this process.
+			 */
+			PANIC("Cannot get dtablesize");
+		}
+		/* Allocate memory for the file descriptor table: */
+		if ((_thread_fd_table = (struct fd_table_entry **) malloc(sizeof(struct fd_table_entry *) * _thread_dtablesize)) == NULL) {
+			/* Avoid accesses to file descriptor table on exit: */
+			_thread_dtablesize = 0;
+
+			/*
+			 * Cannot allocate memory for the file descriptor
+			 * table, so abort this process.
+			 */
+			PANIC("Cannot allocate memory for file descriptor table");
+		}
+		/* Allocate memory for the pollfd table: */
+		if ((_thread_pfd_table = (struct pollfd *) malloc(sizeof(struct pollfd) * _thread_dtablesize)) == NULL) {
+			/*
+			 * Cannot allocate memory for the file descriptor
+			 * table, so abort this process.
+			 */
+			PANIC("Cannot allocate memory for pollfd table");
+		} else {
+			/*
+			 * Enter a loop to initialise the file descriptor
+			 * table:
+			 */
+			for (i = 0; i < _thread_dtablesize; i++) {
+				/* Initialise the file descriptor table: */
+				_thread_fd_table[i] = NULL;
+			}
+
+			/* Initialize stdio file descriptor table entries: */
+			for (i = 0; i < 3; i++) {
+				if ((_thread_fd_table_init(i) != 0) &&
+				    (errno != EBADF))
+					PANIC("Cannot initialize stdio file "
+					    "descriptor table entry");
+			}
+		}
+
 		/* Initialise the global signal action structure: */
 		sigfillset(&act.sa_mask);
 		act.sa_handler = (void (*) ()) _thread_sig_handler;
@@ -410,59 +463,6 @@
 
 		/* Get the process signal mask: */
 		__sys_sigprocmask(SIG_SETMASK, NULL, &_process_sigmask);
-
-		/* Get the kernel clockrate: */
-		mib[0] = CTL_KERN;
-		mib[1] = KERN_CLOCKRATE;
-		len = sizeof (struct clockinfo);
-		if (sysctl(mib, 2, &clockinfo, &len, NULL, 0) == 0)
-			_clock_res_usec = clockinfo.tick > CLOCK_RES_USEC_MIN ?
-			    clockinfo.tick : CLOCK_RES_USEC_MIN;
-
-		/* Get the table size: */
-		if ((_thread_dtablesize = getdtablesize()) < 0) {
-			/*
-			 * Cannot get the system defined table size, so abort
-			 * this process.
-			 */
-			PANIC("Cannot get dtablesize");
-		}
-		/* Allocate memory for the file descriptor table: */
-		if ((_thread_fd_table = (struct fd_table_entry **) malloc(sizeof(struct fd_table_entry *) * _thread_dtablesize)) == NULL) {
-			/* Avoid accesses to file descriptor table on exit: */
-			_thread_dtablesize = 0;
-
-			/*
-			 * Cannot allocate memory for the file descriptor
-			 * table, so abort this process.
-			 */
-			PANIC("Cannot allocate memory for file descriptor table");
-		}
-		/* Allocate memory for the pollfd table: */
-		if ((_thread_pfd_table = (struct pollfd *) malloc(sizeof(struct pollfd) * _thread_dtablesize)) == NULL) {
-			/*
-			 * Cannot allocate memory for the file descriptor
-			 * table, so abort this process.
-			 */
-			PANIC("Cannot allocate memory for pollfd table");
-		} else {
-			/*
-			 * Enter a loop to initialise the file descriptor
-			 * table:
-			 */
-			for (i = 0; i < _thread_dtablesize; i++) {
-				/* Initialise the file descriptor table: */
-				_thread_fd_table[i] = NULL;
-			}
-
-			/* Initialize stdio file descriptor table entries: */
-			for (i = 0; i < 3; i++) {
-				if ((_thread_fd_table_init(i) != 0) &&
-				    (errno != EBADF))
-					PANIC("Cannot initialize stdio file "
-					    "descriptor table entry");
-			}
-		}
 	}
 
 	/* Initialise the garbage collector mutex and condition variable. */