(I originally found this problem on a very busy FreeBSD 4.3 system
running with the bigtodo patch - it is much less likely to occur with
a standard qmail.)

First some background regarding trigger. When qmail-queue has a mail
for qmail-send it opens the named pipe, trigger, writes a byte to it,
closes trigger and exits.
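
For concreteness, the qmail-queue side amounts to something like the
following C sketch. This is not qmail's own source: pull_trigger and
its path argument are names made up for illustration, and the
non-blocking open is an assumption so that the writer never stalls
when no reader has the pipe open.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper, not qmail's code: wake whoever is select()ing on
   the FIFO at path by writing a single byte to it.  Returns 0 on success,
   -1 if the FIFO cannot be opened (for example, no reader has it open)
   or the write fails. */
int pull_trigger(const char *path)
{
  ssize_t w;
  int fd = open(path, O_WRONLY | O_NONBLOCK);

  if (fd == -1) return -1;
  w = write(fd, "", 1);   /* one byte; its value does not matter */
  close(fd);
  return w == 1 ? 0 : -1;
}
```

The byte itself carries no information; all that matters to the reader
is that the pipe becomes readable.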

qmail-send notices this trigger in the following loop:

        open trigger
        select: is trigger readable?
        ...
        todo_do()
        ...
        close trigger
        open trigger
        ...
        select: is trigger readable?
        etc.

A couple of notes on this loop:

o The todo_do() involves a potentially expensive directory scan,
  particularly if lots of injections are occurring or if you use the
  bigtodo patch.

o The idea behind closing and opening trigger is to flush the byte
  written by qmail-queue so that next time around the loop the select
  blocks until another qmail-queue comes along.

The problem I've found relates to when the flush occurs on a named
pipe. At least on FreeBSD, a named pipe is only flushed when no other
process has the pipe open.
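
A small experiment along the following lines should show the effect.
It is only a sketch: byte_survives_reopen and readable_now are
hypothetical names, the FIFO at path is assumed to exist already, and
the outcome depends on the kernel's FIFO semantics, so treat the result
as something to check on your own system rather than a guarantee.

```c
#include <fcntl.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Poll with a zero timeout: 1 if fd is readable right now, else 0. */
static int readable_now(int fd)
{
  fd_set rfds;
  struct timeval tv = {0, 0};

  FD_ZERO(&rfds);
  FD_SET(fd, &rfds);
  return select(fd + 1, &rfds, (fd_set *) 0, (fd_set *) 0, &tv) == 1;
}

/* Hypothetical experiment, not part of qmail.  A writer leaves one unread
   byte in the FIFO at path, then the reader closes and reopens it the way
   trigger_set() does.  If hold_open is nonzero, a second writer (standing
   in for a lingering qmail-queue) keeps the FIFO open across the reopen.
   Returns 1 if the FIFO is still readable afterwards, 0 if not, -1 on
   error. */
int byte_survives_reopen(const char *path, int hold_open)
{
  int held = -1, result = -1;
  int rfd = open(path, O_RDONLY | O_NONBLOCK);
  int wfd;

  if (rfd == -1) return -1;
  if (hold_open) held = open(path, O_WRONLY | O_NONBLOCK);
  wfd = open(path, O_WRONLY | O_NONBLOCK);
  if (wfd != -1 && (!hold_open || held != -1) && write(wfd, "", 1) == 1) {
    close(wfd);                               /* the injecting writer exits */
    wfd = -1;
    close(rfd);                               /* the attempted flush ...    */
    rfd = open(path, O_RDONLY | O_NONBLOCK);  /* ... and the re-arm         */
    if (rfd != -1) result = readable_now(rfd);
  }
  if (wfd != -1) close(wfd);
  if (held != -1) close(held);
  if (rfd != -1) close(rfd);
  return result;
}
```

If the behaviour is as described above, the call with hold_open set
reports the stale byte as still readable after the reopen, which is
exactly the state that sends qmail-send around the loop again.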

On a very busy system the chance of this occurring shrinks, as there is
almost always one or more qmail-queue processes running. Furthermore,
the code order of qmail-send is such that the window in which the pipe
can be flushed, that is, in which no qmail-queue process may have it
open, is very small: it is the gap between the close and the open that
immediately follows it in trigger_set().

The degenerate case I see is that qmail-send starts spinning on the
select()--todo_do() loop as select() always indicates that the trigger
is readable. This spin involves a directory scan of todo which slows
the qmail-queue processes as they too are writing to the same
directory/file system. Since the qmail-queue processes are further
slowed, qmail-send continues to spin on a readable trigger.

In other words, in the tiny window that qmail-send leaves for the
kernel to flush the pipe, there is always at least one qmail-queue
process with the trigger open. The result is a resource-burning spin
that gets worse when the injection rate is high and regular (exactly
the situation on the servers where I noticed this).

Returning to the bigtodo patch, that of course exacerbates the
situation as the window between the close and open in trigger_set
forms an even smaller part of the loop.

Fortunately there are a couple of remedies.

At the very least, the flush window can be made substantially larger
by closing trigger as soon as the select returns.

A second and more defensive measure is to issue a non-blocking read on
the pipe to drain all qmail-queue bytes *prior* to the todo
scan. Perhaps both of these could be done in the trigger_pulled
routine. I've appended a patch that gives the idea in code (it's
untested).
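
Taken on its own, the draining step might look like the sketch below.
drain_fd is a hypothetical name, and the explicit EINTR and EAGAIN
handling is an addition for illustration that the appended patch does
not have.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: discard everything currently buffered on fd
   without ever blocking.  Returns the number of bytes discarded, or -1
   on a real error.  Note that it leaves fd in non-blocking mode. */
long drain_fd(int fd)
{
  char buf[64];
  long total = 0;
  int flags = fcntl(fd, F_GETFL);

  if (flags == -1) return -1;
  if (fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1) return -1;

  for (;;) {
    ssize_t n = read(fd, buf, sizeof buf);
    if (n > 0) { total += n; continue; }
    if (n == 0) break;                                   /* EOF: no writers left */
    if (errno == EINTR) continue;                        /* interrupted, retry   */
    if (errno == EAGAIN || errno == EWOULDBLOCK) break;  /* nothing buffered     */
    return -1;
  }
  return total;
}
```

Draining does not depend on the kernel discarding the data at close
time, so it should help even when a qmail-queue process still has the
pipe open.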

Question: has anyone else seen this? You most likely will only see it
on a very busy system that has bigtodo.


Regards.

*** trigger.orig.c      Mon Jun 15 03:53:16 1998
--- trigger.c   Wed Jul 25 16:50:40 2001
***************
*** 1,4 ****
--- 1,5 ----
  #include "select.h"
+ #include "ndelay.h"
  #include "open.h"
  #include "trigger.h"
  #include "hasnpbg1.h"
***************
*** 36,41 ****
  int trigger_pulled(rfds)
  fd_set *rfds;
  {
!  if (fd != -1) if (FD_ISSET(fd,rfds)) return 1;
   return 0;
  }
--- 37,55 ----
  int trigger_pulled(rfds)
  fd_set *rfds;
  {
!  char buf[64];
! 
!  if ((fd != -1) && FD_ISSET(fd,rfds))
!   {
!    ndelay_on(fd);
!    while (read(fd,buf,sizeof(buf)) > 0) ;
!    close(fd);
!    fd = -1;
! #ifdef HASNAMEDPIPEBUG1
!    if (fdw != -1)
!      { close(fdw); fdw = -1; }
! #endif
!    return 1;
!   }
   return 0;
  }
