summary: there seems to be a bug in RLIMIT_SIGPENDING accounting that can cause it to go negative. in connection with this, the affected process may get stuck forever trying to enter the 'clone' syscall.

long version:

- several people have experienced Xorg hanging forever (100% cpu usage)
  while trying to enter the 'clone' syscall to execute xkbcomp.

- the syscall is aborted with ERESTARTNOINTR because there is a SIGALRM
  signal pending. status shows:

      SigQ:   1/18446744073709551615
      SigPnd: 0000000000000000
      ShdPnd: 0000000000002000
      SigBlk: 0000000000000000
      SigIgn: 0000000000301000
      SigCgt: 0000000061c06ecb

  ShdPnd has bit 13 set, i.e. signal 14 (SIGALRM). note the odd SigQ
  value: the RLIMIT_SIGPENDING field reads as -1 in 64 bits.

- the signal handler is executed (as confirmed under gdb).

- the kernel then forces the syscall to be re-entered by means of the
  following code in handle_signal():

      case -ERESTARTNOINTR:
              regs->rax = regs->orig_rax;
              regs->rip -= 2;
              break;

- this effectively leaves user space spinning in a loop that never ends:
  the syscall is restarted, aborted again by the next pending SIGALRM,
  and restarted once more.

- the code that sets the signal handler, quoted from Xorg gitweb:

      #define SMART_SCHEDULE_SIGNAL SIGALRM
      (...)
          bzero ((char *) &act, sizeof(struct sigaction));

          /* Set up the timer signal function */
          act.sa_handler = SmartScheduleTimer;
          sigemptyset (&act.sa_mask);
          sigaddset (&act.sa_mask, SMART_SCHEDULE_SIGNAL);
          if (sigaction (SMART_SCHEDULE_SIGNAL, &act, 0) < 0)
          {
              perror ("sigaction for smart scheduler");
              return FALSE;
          }

- the code that sets the timer, quoted from Xorg gitweb:

      Bool
      SmartScheduleStartTimer (void)
      {
      #ifdef SMART_SCHEDULE_POSSIBLE
          struct itimerval timer;

          SmartScheduleTimerStopped = FALSE;
          timer.it_interval.tv_sec = 0;
          timer.it_interval.tv_usec = SmartScheduleInterval * 1000;
          timer.it_value.tv_sec = 0;
          timer.it_value.tv_usec = SmartScheduleInterval * 1000;
          return setitimer (ITIMER_REAL, &timer, 0) >= 0;
      #endif
          return FALSE;
      }

- having this negative rlimit may cause problems in the
  __sigqueue_alloc() kernel function.
  however, as far as i can see, that would at most prevent new signals
  from being enqueued; it would not stop existing ones from being
  dequeued or cleared.

- the bugzilla entry with the complete investigation is here:
  https://bugs.freedesktop.org/show_bug.cgi?id=10525

thanks,
Miguel
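
for anyone who wants to check the SigQ field on their own system, here is a minimal sketch in C that decodes a "SigQ: queued/limit" line as it appears in /proc/<pid>/status. the helper names (parse_sigq, sigq_limit_is_infinity, describe_sigq) are made up for illustration and are not taken from the kernel or from Xorg. one assumption worth stating: on 64-bit Linux, RLIM_INFINITY has the same all-ones bit pattern as a 64-bit -1, so the sketch reports that case explicitly rather than deciding which reading is the right one.

```c
#include <limits.h>
#include <stdio.h>
#include <sys/resource.h>

/* Hypothetical helper: parse a "SigQ: queued/limit" line as printed in
 * /proc/<pid>/status. The space in the format also matches the tab the
 * kernel emits. Returns 0 on success, -1 if the line does not match. */
static int parse_sigq(const char *line, unsigned long long *queued,
                      unsigned long long *limit)
{
    if (sscanf(line, "SigQ: %llu/%llu", queued, limit) != 2)
        return -1;
    return 0;
}

/* Hypothetical helper: nonzero if the limit equals the "unlimited"
 * sentinel exported by <sys/resource.h>. */
static int sigq_limit_is_infinity(unsigned long long limit)
{
    return limit == (unsigned long long)RLIM_INFINITY;
}

/* Print the limit both as unsigned and reinterpreted as signed, so a
 * value such as 18446744073709551615 can be recognised as "-1". */
static void describe_sigq(const char *line)
{
    unsigned long long queued, limit;

    if (parse_sigq(line, &queued, &limit) != 0) {
        printf("not a SigQ line\n");
        return;
    }
    /* The unsigned-to-signed conversion is implementation-defined in C;
     * gcc and clang wrap it to the two's-complement value. */
    printf("queued=%llu limit=%llu (as signed: %lld)%s\n",
           queued, limit, (long long)limit,
           sigq_limit_is_infinity(limit) ? " [equals RLIM_INFINITY]" : "");
}
```

calling describe_sigq() on each line read from /proc/<pid>/status of the stuck process (lines that do not start with "SigQ:" are reported as not matching) shows whether the limit field is the all-ones value and whether that coincides with RLIM_INFINITY on the machine in question.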