I just happened to notice I'm getting a 3-4 minute delay on occasion between SA 
scanning a msg and it being piped back into Postfix.  Here's one such complete 
transaction.  Note spamd processing takes 10 seconds, then we have a ~4 minute 
delay.  For 95% of these SA deliveries the delay is less than 5 seconds.  I see 
maybe a dozen or so out of ~1000 with delays between 80s and 226s.

Feb  8 22:10:40 greer postfix/smtpd[20324]: connect from 
vger.kernel.org[209.132.180.67]
Feb  8 22:10:40 greer postfix/smtpd[20324]: A20A86C0A7: 
client=vger.kernel.org[209.132.180.67]
Feb  8 22:10:40 greer postfix/cleanup[20315]: A20A86C0A7: 
message-id=<5115cc02.2010...@hardwarefreak.com>
Feb  8 22:10:40 greer postfix/qmgr[748]: A20A86C0A7: 
from=<linux-raid-ow...@vger.kernel.org>, size=22590, nrcpt=1 (queue active)
Feb  8 22:10:40 greer spamd[17319]: spamd: connection from localhost 
[127.0.0.1] at port 44140
Feb  8 22:10:40 greer spamd[17319]: spamd: setuid to nobody succeeded
Feb  8 22:10:40 greer postfix/smtpd[20324]: disconnect from 
vger.kernel.org[209.132.180.67]
Feb  8 22:10:40 greer spamd[17319]: spamd: processing message 
<5115cc02.2010...@hardwarefreak.com> for nobody:65534
Feb  8 22:10:51 greer spamd[17319]: spamd: clean message (0.8/4.2) for 
nobody:65534 in 10.6 seconds, 22153 bytes.
Feb  8 22:10:51 greer spamd[17319]: spamd: result: . 0 - BAYES_50 
scantime=10.6,size=22153,user=nobody,uid=65534,required_score=4.2,rhost=localhost,raddr=127.0.0.1,rport=44140,mid=<5115cc02.2010...@hardwarefreak.com>,bayes=0.500000,autolearn=disabled
Feb  8 22:14:26 greer postfix/pipe[20325]: A20A86C0A7: 
to=<s...@hardwarefreak.com>, relay=spamassassin, delay=226, 
delays=0.47/0.02/0/226, dsn=2.0.0, status=sent (delivered via spamassassin 
service)
Feb  8 22:14:26 greer postfix/qmgr[748]: A20A86C0A7: removed
Feb  8 22:14:26 greer postfix/pickup[20224]: 6A4F56C0A7: uid=65534 
from=<linux-raid-ow...@vger.kernel.org>
Feb  8 22:14:26 greer postfix/cleanup[20348]: 6A4F56C0A7: 
message-id=<5115cc02.2010...@hardwarefreak.com>
Feb  8 22:14:26 greer postfix/qmgr[748]: 6A4F56C0A7: 
from=<linux-raid-ow...@vger.kernel.org>, size=22907, nrcpt=1 (queue active)
Feb  8 22:14:26 greer dovecot: deliver(stan): sieve: 
msgid=<5115cc02.2010...@hardwarefreak.com>: stored mail into mailbox 
'1-Linux-RAID'
Feb  8 22:14:26 greer postfix/local[20350]: 6A4F56C0A7: 
to=<s...@hardwarefreak.com>, relay=local, delay=0.19, delays=0.1/0.03/0/0.06, 
dsn=2.0.0, status=sent (delivered to command: /usr/lib/dovecot/deliver)
Feb  8 22:14:26 greer postfix/qmgr[748]: 6A4F56C0A7: removed

/etc/postfix/master.cf setup:

# ==========================================================================
# service type  private unpriv  chroot  wakeup  maxproc command + args
#               (yes)   (yes)   (yes)   (never) (100)
# ==========================================================================
smtp      inet  n       -       -       -       20      smtpd
        -o content_filter=spamassassin
pickup    fifo  n       -       -       60      1       pickup
        -o content_filter=
...
spamassassin unix -     n       n       -       -       pipe
        user=nobody argv=/usr/bin/spamc -f -e
        /usr/sbin/sendmail -oi -f ${sender} ${recipient}


There was nothing fired via cron when this msg came through, nothing eating CPU 
or disk at the time that I could find.  This host is idle 99% of the time, less 
than 2000 smtp connects/day on average, most of those rejected outright.  Due 
to the low load I only run two spamd processes, a master and one child.  So I 
have only one spamd that actively processes msgs:

22396 root      35  15 35908  32m 1208  30m 2592 S    0  8.6   3:54.43 spamd
22395 root      35  15 33904  30m 1208  28m 2576 S    0  8.1   0:23.41 spamd

Could someone kindly point to where I need to look next to identify the cause 
of these intermittent huge delays?  On an idle box nothing should take 4 
minutes.  It's likely an SA problem, but I wanted to ask here first as I'm no 
SA guru.  I installed it once, within the past year, tweaked the setup to fit 
my needs, and haven't touched it since.  Until discovering this it's been 
running without issue, though y catch rate isn't what I'd like, but that's 
another issue entirely.

Thanks.

-- 
Stan

Reply via email to