On Tue, Mar 24, 2009 at 08:54:17AM -0400, Brandon Hilkert wrote:

>> MEASURE! Find out what is slowing it down. When you know what that
>> is, ask the question again.
>
> Here's a snapshot of top during a peak testing period
>
> top - 08:45:55 up 4 days, 23:47,  1 user,  load average: 2.75, 1.70, 0.73
> Tasks:  93 total,   1 running,  92 sleeping,   0 stopped,   0 zombie
> Cpu(s): 32.2%us, 10.4%sy,  0.0%ni, 47.3%id,  6.8%wa,  0.5%hi,  2.8%si, 0.0%st
> Mem:   2075040k total,   810116k used,  1264924k free,    35368k buffers
> Swap:  2650684k total,        0k used,  2650684k free,   612544k cached
>
>  PID USER          PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+     COMMAND
> 2191 dkim-fil      20   0 98472 3420 1516 S   19   0.2 3:29.90 dkim-filter
> 2182 dk-filte      20   0 72536 2936 1236 S   11   0.1 2:33.44   dk-filter

Is there no combined milter that will handle both signatures?

> From atop:
>
> CPU | sys   19% | user   64% | irq     5% | idle     96% | wait 16% |
> cpu | sys   11% | user   31% | irq     3% | idle     45% | cpu001 w 9% |
> cpu | sys    8% | user   33% | irq     2% | idle     50% | cpu000 w 7% |

You may have spare CPU cycles. What happens when you further raise the
input concurrency? If you count log events of each of these 4 types:

        ... smtpd[<pid>]: <queueid>: client=...
        ... cleanup[<pid>]: <queueid>: message-id=...
        ... qmgr[<pid>]: <queueid>: from=...
        ... <delivery-agent>[<pid>]: <queueid>: to=...

over consecutive 10s intervals during a test run, what are the raw
numbers:

           time    smtpd   cleanup         qmgr    agent
        ------------------------------------------------
        HH:MM:00   ?????     ?????        ?????    ?????
        HH:MM:10   ?????     ?????        ?????    ?????
        HH:MM:20   ?????     ?????        ?????    ?????
        HH:MM:30   ?????     ?????        ?????    ?????
        HH:MM:40   ?????     ?????        ?????    ?????
        HH:MM:50   ?????     ?????        ?????    ?????
        HH:MN:00   ?????     ?????        ?????    ?????
        ....
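
One way to fill in that table is to bucket the log lines by timestamp.
The following is only a rough sketch, not a tested tool. It assumes
traditional syslog timestamps ("Mar 24 08:45:55 host postfix/smtpd[pid]: ...")
and a single day of logs, and the set of delivery agent names
(smtp, lmtp, local, virtual, pipe) is a guess that you should adjust
to match your master.cf. The function names are made up for illustration.

```python
import re
from collections import Counter, defaultdict

# Assumed line shape: "Mon DD HH:MM:SS host postfix/<service>[pid]: <queueid>: key=..."
LINE_RE = re.compile(
    r"^\w{3}\s+\d+\s+(\d\d):(\d\d):(\d\d)\s+\S+\s+"
    r"postfix(?:-[^/\s]+)?/([\w-]+)\[\d+\]:\s+[0-9A-Za-z]+:\s+"
    r"(client|message-id|from|to)="
)

# Assumption: delivery agents in use; edit to match your configuration.
AGENTS = {"smtp", "lmtp", "local", "virtual", "pipe"}
COLUMNS = ("smtpd", "cleanup", "qmgr", "agent")


def classify(service, key):
    """Map a (service, key) pair to one of the four table columns, or None."""
    if service == "smtpd" and key == "client":
        return "smtpd"
    if service == "cleanup" and key == "message-id":
        return "cleanup"
    if service == "qmgr" and key == "from":
        return "qmgr"
    if service in AGENTS and key == "to":
        return "agent"
    return None


def count_events(lines, interval=10):
    """Count the four event types per `interval`-second bucket."""
    buckets = defaultdict(Counter)
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue
        hh, mm, ss, service, key = m.groups()
        column = classify(service, key)
        if column is None:
            continue
        start = int(ss) // interval * interval  # round down to bucket start
        buckets[f"{hh}:{mm}:{start:02d}"][column] += 1
    return dict(sorted(buckets.items()))


def format_table(buckets):
    """Render the buckets in the same layout as the table above."""
    rows = [f"{'time':<10}" + "".join(f"{c:>9}" for c in COLUMNS)]
    for stamp, counts in buckets.items():
        rows.append(f"{stamp:<10}" + "".join(f"{counts[c]:>9}" for c in COLUMNS))
    return "\n".join(rows)
```

To use it, pass an open log file (or any iterable of lines) to
count_events() and print the result of format_table(). Intervals with
no matching events are simply absent from the output.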

Examine the same table with different (input) concurrency levels
in the "injector" (smtp-source?) and different combinations of
no-milters/milter-A/milter-B/milter-A+B.

Capture I/O ops for each 10s interval during this time. How do they
compare across the various cases, and over time in each case?

How big are the messages sent with and without DKIM signatures?
Look at the "qmgr" log entry which reports the message size.
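
A quick way to pull those sizes out of the log, again only as a sketch:
it assumes the usual qmgr entry of the form
"<queueid>: from=<sender>, size=NNNN, nrcpt=N (queue active)", and the
helper names are hypothetical. Run it once on the log from a run with
signing and once on a run without, then compare the summaries.

```python
import re
from statistics import mean

# Assumed qmgr entry: "postfix/qmgr[pid]: <queueid>: from=<addr>, size=NNNN, ..."
SIZE_RE = re.compile(
    r"postfix(?:-[^/\s]+)?/qmgr\[\d+\]:\s+[0-9A-Za-z]+:\s+"
    r"from=<[^>]*>,\s+size=(\d+)"
)


def message_sizes(lines):
    """Return the size= value (bytes) from every matching qmgr line."""
    return [int(m.group(1)) for line in lines if (m := SIZE_RE.search(line))]


def summarize(sizes):
    """Basic statistics for a list of sizes, or None if the list is empty."""
    if not sizes:
        return None
    return {
        "count": len(sizes),
        "min": min(sizes),
        "mean": mean(sizes),
        "max": max(sizes),
    }
```

The difference in mean size between the two runs gives a rough idea of
how many bytes the signature headers add per message.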

Is the incoming queue growing? The active queue? Or is there just
insufficient throughput via smtpd+cleanup into the incoming queue?

Is your smtp-source getting starved of CPU slots? Consider running
the source off-box.

"Measure" means start getting a quantitative feel for the performance
under a variety of conditions.

You need to start getting detailed data, not crude aggregate CPU
numbers... Your Postfix logs are the best source for this, but
running a parallel "iostat -x 5" or similar to capture disk events
is a good idea.

-- 
        Viktor.

Disclaimer: off-list followups get on-list replies or get ignored.
Please do not ignore the "Reply-To" header.

If my response solves your problem, the best way to thank me is to not
send an "it worked, thanks" follow-up. If you must respond, please put
"It worked, thanks" in the "Subject" so I can delete these quickly.
