>
>
>> -----Original Message-----
>> From: Simon Byrnand [mailto:[EMAIL PROTECTED]
>> A 0.5 second delay scanning a message means exactly that - that your email
>> will arrive 0.5 seconds later. No sooner, no later. How you figure that you
>> have to multiply that by number of users they have I have no idea... :)
> ---
>       That's .5 seconds of delay once I get access to a free processor
> to run the filter.  If the filter is processing 100,000 other emails
> simultaneously and each is taking .5 seconds to process, that's 50,000
> seconds, or a butt-load of time.  Now if they have 100,000 CPUs each
> with its own disk, then you're right, all of those 100,000 messages
> will be processed in parallel in the same .5 seconds.  But if you only
> have 10 servers processing 50,000 seconds worth of email, that (assuming
> perfect queuing and such) means it will take 5000 seconds minimum --
> best case time.
>
>       That .5 seconds isn't seconds delayed from email in to email
> out, it's .5 seconds for the filter to process my 1 email.  That's
> why total average delay = (ave delay/email)*(#emails to process)/#CPU's.
>
> It's easy to see with swapping overhead and queuing algorithms that it
> would be at least 5000 seconds (~1 hour, 23 minutes total delay for each
> email).

Well, it's clear that you don't run a mail server yourself, as there are a
lot of false assumptions in your theory, and I'm sure that any one of a
number of people on the list here who run SA on high-volume mail servers
will agree with me.

First of all, I dare you to find *any* ISP that is processing 100,000
emails *simultaneously* (i.e. at the very same instant) on one server. Can't
happen. In fact it's so ludicrously impossible and exaggerated that your
example is nonsensical. Even across a dozen machines, 100,000 at one
instant isn't practical, simply due to process and memory limits.

Secondly, your assumption that because a single message takes 0.5 seconds
to process, ten simultaneous emails would take 5 seconds and a
hundred would take 50 seconds is just that - an assumption, and an
incorrect one to boot.

It assumes that the scanning process is CPU bound for the whole time it
takes, and in fact on a fast machine it's not. The majority of that time is
taken up by network tests - the RBL checks and razor checks in particular.

I find on our server that the scanning time when using spamc with all
network tests turned off is under 0.1 seconds, and it increases to between
0.6 and 2 seconds with network tests enabled. This means that the CPU
bound part of the scanning was completed in less than 0.1 seconds and for
the rest of the time the process just slept waiting for the network test
responses to come back.

CPU time != real time.
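
As a rough illustration of that distinction, here is a minimal Python sketch. It is not SpamAssassin code: `timed_wait` is a hypothetical helper, and the sleep merely stands in for a process waiting on an RBL or razor reply.

```python
import time

def timed_wait(wait_seconds):
    """Return (cpu_seconds, wall_seconds) spent while blocked in a sleep."""
    cpu_start = time.process_time()   # CPU time charged to this process
    wall_start = time.perf_counter()  # elapsed real time
    time.sleep(wait_seconds)          # assumed stand-in for a network test
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return cpu, wall

if __name__ == "__main__":
    cpu, wall = timed_wait(0.5)
    print(f"cpu={cpu:.3f}s wall={wall:.3f}s")
```

The wall-clock figure should come out close to the sleep time while the CPU figure stays near zero, which is the situation described above for a scan that spends most of its time waiting on the network.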

This means that if the actual CPU utilization was 0.1 seconds per scan, then
10 simultaneous scans started at the same time would take about 1.5 seconds -
i.e. 0.1 * 10 plus the typical latency of the network tests of about 0.5
seconds. Not 5 seconds as your theory suggests.
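
The two estimates can be put side by side in a few lines of Python. This is only a sketch of the arithmetic, assuming a single CPU and network waits that overlap completely; both function names are made up for the example.

```python
def serial_estimate(n_scans, total_per_scan):
    """The quoted theory: each scan's full time is paid one after another."""
    return n_scans * total_per_scan

def overlapped_estimate(n_scans, cpu_per_scan, network_latency):
    """CPU work queues up on one CPU; the network waits all overlap."""
    return n_scans * cpu_per_scan + network_latency

if __name__ == "__main__":
    print(serial_estimate(10, 0.5))           # 5.0
    print(overlapped_estimate(10, 0.1, 0.5))  # 1.5
```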

Of course, as you increase the number of simultaneous scans you will
reach a point where you *are* CPU bound, and there is also a small
overhead in the task switching, but that all depends on the speed of
the machine. There is an optimum number of simultaneous scans
to get maximum messages/day throughput, depending on the specs of the
machine.
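
One way to picture that optimum is a deliberately simplified model, sketched here in Python. It is an assumption rather than a measurement: it ignores task-switching overhead and memory limits, and `throughput` and `saturation_point` are hypothetical names.

```python
def throughput(concurrency, cpu_per_scan, network_latency, cpus=1):
    """Messages per second under a simple two-limit model (an assumption).

    Each slot finishes one scan every (cpu + latency) seconds, and the
    CPUs can never complete more than cpus / cpu_per_scan scans a second.
    """
    slot_limit = concurrency / (cpu_per_scan + network_latency)
    cpu_limit = cpus / cpu_per_scan
    return min(slot_limit, cpu_limit)

def saturation_point(cpu_per_scan, network_latency, cpus=1):
    """Smallest concurrency at which the CPU limit is reached."""
    cpu_limit = cpus / cpu_per_scan
    n = 1
    while throughput(n, cpu_per_scan, network_latency, cpus) < cpu_limit:
        n += 1
    return n
```

With the figures above (0.1 seconds of CPU and 0.5 seconds of network latency on one CPU), this model stops gaining throughput at around six simultaneous scans. On a real machine the task-switching overhead and memory use mentioned above would move that point, so it has to be found by measurement.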

And if your server is going into swap then you're in serious trouble. This
is not a normal situation, so it doesn't really count....

With most setups, if the total scanning time is longer than 5 to 10
seconds the machine can get into serious trouble as the number of
processes for a given incoming message rate builds up to dangerous
levels, as some on this list (including me) have found out the hard way.
(The secret to achieving maximum throughput is limiting the concurrency to
a level that matches the server's CPU speed and physical memory, and
starting a new job as each one finishes.)
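
The idea of capping concurrency and starting a new job as each one finishes can be sketched with a fixed-size worker pool. This is a generic Python illustration, not how any particular mail setup does it; `run_limited` and `fake_scan` are hypothetical, and the sleep stands in for a real scan.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

def run_limited(jobs, max_concurrent):
    """Run callables with at most max_concurrent in flight at once.

    Returns (results, peak), where peak is the highest number of jobs
    observed running at the same time.
    """
    lock = threading.Lock()
    running = 0
    peak = 0

    def tracked(job):
        nonlocal running, peak
        with lock:
            running += 1
            peak = max(peak, running)
        try:
            return job()
        finally:
            with lock:
                running -= 1

    # The pool hands a queued job to a worker as soon as that worker is free.
    with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
        results = list(pool.map(tracked, jobs))
    return results, peak

def fake_scan():
    """Hypothetical stand-in for a scan: just waits briefly."""
    time.sleep(0.05)
    return "scanned"

if __name__ == "__main__":
    results, peak = run_limited([fake_scan] * 20, max_concurrent=4)
    print(len(results), "jobs done, peak concurrency", peak)
```

However many jobs are queued, the number running at once never exceeds the cap, so a burst of incoming mail lengthens the queue instead of piling up processes.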

When most people on the list here are concerned about scans taking as long
as 5 seconds to complete, your suggestion that people could be waiting
hours for the messages to arrive due to spam filtering is laughable.

Regards,
Simon



_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
