On Wed, 24 Feb 2010, 14:20 +0200, Kostik Belousov wrote:
> On Wed, Feb 24, 2010 at 03:44:41AM -0800, Jeremy Chadwick wrote:
> > On Wed, Feb 24, 2010 at 01:21:39PM +0200, Kostik Belousov wrote:
> > > On Wed, Feb 24, 2010 at 06:53:59PM +1100, Peter Jeremy wrote:
> > > > Updates following some off-line discussions and debugging with John on
> > > > IRC.  I've cc'd gshapiro@ because the problem appears to be sendmail,
> > > > rather than the FreeBSD kernel.
> > > >
> > > > On 2010-Feb-23 12:35:22 +1100, John Marshall
> > > > <john.marsh...@riverwillow.com.au> wrote:
> > > > >Environment: sendmail 8.14.4 on FreeBSD 8.0-RELEASE-p2
> > > >
> > > > Note that this is stock ISC sendmail, not the sendmail in either the
> > > > base system or the port.
> > > >
> > > > >I posted about this in comp.mail.sendmail and was told...
> > > > >
> > > > >> sleep() should be one of these calls:
> > > > >>
> > > > >>     if (njobs == 0 && WorkGrp[wgrp].wg_lowqintvl < MIN_SLEEP_TIME)
> > > > >>         sleep(MIN_SLEEP_TIME);
> > > > >>     else if (WorkGrp[wgrp].wg_lowqintvl <= 0)
> > > > >>         sleep(QueueIntvl > 0 ? QueueIntvl : MIN_SLEEP_TIME);
> > > > >>     else
> > > > >>         sleep(WorkGrp[wgrp].wg_lowqintvl);
> > > >
> > > > Whilst it's true that the code calls sleep(), it's not calling
> > > > sleep(3) in the FreeBSD libc.  Instead it's calling a sleep() defined
> > > > in libsm/clock.c - which is a horrible maze of #ifdefs.
> > > >
> > > > John has pre-processed that code and the result is at:
> > > > http://www.riverwillow.net.au/~john/sm/clock.preprocessed
> > > >
> > > > At a quick look, the code is broken: sm_seteventm() generates a
> > > > one-off timer using setitimer(2), which will send SIGALRM when it
> > > > expires.  sm_releasesignal() then unblocks SIGALRM.  In theory, the
> > > > SIGALRM could be delivered anywhere after the (!SmSleepDone) test and
> > > > before pause() is called - in which case, the signal is lost and
> > > > pause() will sleep forever.
> > > >
> > > > On 2010-Feb-24 08:13:06 +1100, John Marshall
> > > > <john.marsh...@riverwillow.com.au> wrote:
> > > > >My ktrace file was created with 'ktrace -g 48501'.  I have the result of
> > > > >'kdump -R -p 48504' available at:
> > > > >
> > > > > <http://www.riverwillow.net.au/~john/8_0/rwsrv04_201002240725.kdump.gz>

> Regarding sigsuspend() returning EINTR without delivering any signal,
> could it be that the sendmail process was debugged?

No.  I didn't touch the process with anything this time.  There was no
debugger in use on the system.  That was how I found the process first
thing this morning so I sent off the kdump output.  The process stayed
in the same state until I rebooted the system this afternoon to install
a kernel with debug symbols and options.  I have done the same on the
other two servers, so I can dig deeper for you next time.  I am running
ktrace on the sendmail process group on all three servers waiting to
catch the next one.

By the way, all three are i386 with SMP.

-- 
John Marshall