on Fri, Oct 25, 2002 at 10:44 PM +0200, Jan Korger ([EMAIL PROTECTED]) wrote:
> Hi,
> I installed SA on Debian woody to filter my POP3 email. Therefore
> fetchmail is used, forwarding fetched mail to localhost:smtp. Thus,
> exim will be run through inetd. Exim will then find .procmailrc in my
> home directory and run procmail. Procmail will (besides some very
> basic filtering) invoke SA and pgpenvelope_decrypt (for verifying PGP
> sigs with gpg) on any message received.
> 
> I understand that this is not an optimal setup for receiving huge amounts
> of mail, but

>    a) my mail traffic is very low (100/day) -- just that all of them will be
>       received after booting the machine all at once
>
>    b) this is the default setup for debian *stable* and what everyone
>    will try first

> Please do always consider that people using SA cannot know SA is likely to
> overload their machine, and no one has any control over the amount of
> email received.

This isn't SA's job.  SA filters mail thrown at it.  You decide how much
mail to toss.

I *would* consider filing a bug report against fetchmail (which I *know*
has throttling capabilities) and/or exim (which should, though I haven't
experimented with it yet), requesting that reasonable throttling values be
chosen by default.  Not sure the suggestion will be adopted, but it fits
with the "it just works" attitude of Debian.  xinetd (a replacement for
inetd) may also provide throttling capabilities, though I'm not overly
familiar with it.



The salient portion of your post actually comes later, so I'll move the
answers forward.

My own processing situations are somewhat similar to yours.  Home use is
56k dialup on a PIII-600 386MB system, with typical mail loads of
100-500 messages when initiating a dialup connection.  Work is a T1 LAN
with ~1000+ messages daily, usually polled every few minutes, but with
occasional far larger batch fetches following service interruptions.

> Some further thoughts on running SA:
> 
> 1. Is there a way to limit the number of messages processed by an MTA
>    or fetchmail? (This is not what you want on a mail hub but this is
>    exactly what you want on a machine whose main purpose is NOT mail
>    processing)

RTFM:

  - In exim, run in daemon mode (rather than via inetd).  You'll want to
    modify the following value in /etc/exim/exim.conf:

       smtp_accept_queue_per_connection = NNN

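    ...where NNN is some value.  On an Athlon 1900/512MB system, I set
    this to '50'.  Once a single SMTP connection has handed over more
    than NNN messages, exim stops starting a delivery process for each
    one and simply leaves the rest on the queue for the next queue
    runner.  The configuration file is extensively documented, else see
    /usr/share/doc/exim.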

  - In fetchmail, the fetchlimit, batchlimit, and expunge options
    respectively limit the number of messages fetched in a single poll,
    the number of messages forwarded to your MTA over one SMTP
    connection, and the number of messages fetched between expunging
    (deleting) mail from your server.  I have the following values in
    /etc/fetchmailrc (equivalent options exist for ~/.fetchmailrc):

        defaults:
          antispam -1 
          batchlimit 100
          fetchlimit 200
          expunge 50

    This is covered extensively in the fetchmail manpage.
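
For a one-off run, the same limits can also be given on the command
line.  A minimal sketch, assuming your fetchmail accepts the long option
names --fetchlimit, --batchlimit and --expunge (check the manpage for
your version); FETCHMAIL_OPTS is only an illustrative variable name:

```shell
# Same limits as the rc file above, expressed as command-line options.
FETCHMAIL_OPTS="--fetchlimit 200 --batchlimit 100 --expunge 50"

# Print the command instead of running it, so nothing is fetched by accident.
# Drop the "echo" to poll the server for real.
echo fetchmail $FETCHMAIL_OPTS
```

Command-line options are handy for testing values before committing
them to the rc file.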


> 2. Limit the number of SA processes (and therefore the number of mails
>    processed) to 1 [or very low]. This could be done by writing any
>    mail to a named pipe (or maybe regular file mailbox; called
>    "to_be_checked") and make sure one instance of a program (called
>    "spamd2") runs. 

IMO this is counterproductive (high-volume sites will want multiple
spamcs running) and the wrong place to apply a throttle.  If you
observe your process table, I suspect you'll find that this isn't where
the bulk of process entries are anyway.
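
One way to see where the process entries pile up is to count processes
by command name.  A rough sketch, assuming a POSIX ps that understands
-e and -o comm=; summarize_procs is a hypothetical helper name:

```shell
# Read one command name per line on stdin and print the ten most common,
# as "count name", busiest first.
summarize_procs() {
    sort | uniq -c | sort -rn | awk '{ print $1, $2 }' | head -n 10
}

# Typical use while a big fetch is in progress:
#   ps -e -o comm= | summarize_procs
```

Run it a few times during a large fetch and compare the counts for
exim, procmail and spamc (or spamassassin).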


> 3. spamd can be configured to limit the number of messages processed
>    at a time (GOOD) but only a certain number of messages can be
>    queued (VERY BAD) due to some system restrictions concerning
>    sockets. Think of a different way of queueing?



<...>

> So, what happened?  Well, I started my machine after ~20h of not using
> it nor reading mail through any other box. 

...lots of mail downloaded via fetchmail.  Lots of exim processes
started.  Lots of spamc processes kicked off.  I've seen > 800
processes, and load > 20 on a single-user workstation under similar
configurations.  Downloading some 3800 messages the other day was
particularly annoying, especially as fetchmail wasn't expunging
already-fetched messages, leading to a local queue considerably _larger_
than 3800 messages.

While I feel your pain, you're pinning the blame on the wrong party if
you expect the answer to come from SA.  It is an inline filtering tool,
_not_ a mail queue management system.  That's the job of your local MTA
(which spools messages for local delivery) and your mail retrieval agent
(fetchmail -- which grabs scads o' messages from a remote server for
local delivery).

> fetchmail flushed the messages on the POP3 server. Is there any chance
> in restoring the lost mail or finding out _who_ emailed me? 

No.  If the mail got lost, it got lost.

Process logs may indicate headers, including sender.  Check
/var/log/mail.info.

It's also possible your mail got dumped to /var/spool/mail/$USER, where
$USER is your username.  Poke around there.
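
If the MTA log survived, the envelope senders can be pulled out of it.
A sketch only, assuming exim-style log lines in which a received message
is marked with "<=" followed by the sender address; the log path on your
system may differ, and list_senders is a hypothetical helper:

```shell
# Print each distinct envelope sender found in an exim-style log file.
# Assumes arrival lines look like: DATE TIME MSGID <= sender@host ...
list_senders() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "<=") print $(i + 1) }' "$1" | sort -u
}

# Example (the path is an assumption, adjust to wherever exim logs):
#   list_senders /var/log/exim/mainlog
```

This only recovers who wrote to you, not the messages themselves.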

> I guess they got lost by this strange "killing processes" "feature" of
> linux as fetchmail does only flush messages successfully forwarded.
> This "feature" is not your fault but you need to take it into
> consideration.



> I understand that I can significantly reduce the load by using
> spamc/spamd but for small sites (especially single user workstations)
> that kind of a setup is sub-optimal (and BTW not recommended by any SA
> documentation) as it requires an additional daemon to be running and
> wasting (virtual) memory, which won't have a lot to do.

Wasted virtual memory sits on your swap partition and doesn't hurt
anything.  The performance enhancement is significant.  I'd recommend
spamc/spamd.
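
On the procmail side the switch is small.  A minimal sketch, assuming
spamd is already running on localhost with its default settings and
spamc is in procmail's PATH:

```
# ~/.procmailrc fragment: filter each message through spamc
# (f = treat the pipe as a filter, w = wait for it and check its exit code)
:0fw
| spamc
```

Compared with a recipe that pipes to spamassassin, only the command
after the pipe changes.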

> Besides, running spamc will probably cause the same problems as
> spamassassin 

...in that it dumps mail more rapidly onto your MDA?  Possibly, but I
suspect the bottleneck effect of spamassassin (rather than spamd/spamc)
is actually *increasing* your system resource utilization (file handles,
process table entries, memory).  This is a case where faster is better.


> Using nice priorities does not help at all, I did run procmail (nice
> -n 5) and fetchmail (6) nicely.

Priority isn't going to save your process table or memory, which is
where your problems were.  In fact this will likely make things worse as
slower spam processing means more queued up exim and procmail processes.


Cheers.

-- 
Karsten M. Self                                          [EMAIL PROTECTED]
FreeRun Technologies                               Sr. Systems Administrator
vox 707.265.1836 x121
http://www.freeruntech.com

  "Gort, klaatu nikto barada."
  -- The Day the Earth Stood Still


_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
