Jesus Climent wrote:

> Anyone's comments about the upload of the c code in the cvs?

It's interesting, but given how quickly spammers adapt and change, I
think trying to keep up with them using C as the implementation
language will be problematic.  It seems like any solution that loses
the flexibility of Perl won't be worth it except for high volume
sites.

I'm biased because I'm not at any high volume sites, but I think it
would be better to focus on general speed enhancements.  Restructure
things to be faster, call out to C code from Perl when necessary for
performance reasons, profile the code more, generate equivalent data
fewer times (when it makes sense), fewer passes, etc.  But, keep the
focus on killing as much spam as possible (without false positives).

A few things off the top of my head:

  - Limit the maximum amount of data going into body tests (max
    number of lines and/or max number of characters).  I've had some
    large legitimate messages take forever (something like 2 minutes
    for a 2 MB message) to go through SA.  Maybe snip the middle
    so you always get the beginning and end of a message.
  - Order tests better.  Do all tests of one type together (better for
    cache usage).  (Beats me how much this will help, but I'm curious.)
  - More work on early exit.  Stop testing once you know enough, and
    run the expensive tests last.
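
To make the first and last ideas concrete, here is a rough sketch in Python rather than Perl, just to show the shape. Nothing in it is SpamAssassin code: snip_middle, score_message, the (cost, score, predicate) tuples, and the 200-line and 5.0 defaults are all made up for illustration. It also assumes every test has a non-negative score, which is what makes stopping early safe.

```python
def snip_middle(lines, max_lines=200):
    """Keep the head and tail of an over-long body and drop the middle.

    max_lines is a hypothetical cap, not a SpamAssassin setting.
    """
    if len(lines) <= max_lines:
        return lines
    head = max_lines // 2
    tail = max_lines - head
    # Slice from an explicit index so tail == 0 yields an empty tail.
    return lines[:head] + lines[len(lines) - tail:]


def score_message(body_lines, tests, threshold=5.0, max_lines=200):
    """Run tests cheapest first and stop once the threshold is reached.

    tests is a list of (cost, score, predicate) tuples, a hypothetical
    layout. Scores are assumed to be non-negative, so the running total
    can only grow and an early exit cannot change the verdict.
    Returns (total_score, number_of_tests_run).
    """
    body = snip_middle(body_lines, max_lines)
    total = 0.0
    ran = 0
    for cost, score, predicate in sorted(tests, key=lambda t: t[0]):
        ran += 1
        if predicate(body):
            total += score
        if total >= threshold:
            break  # we already know enough; skip the expensive tests
    return total, ran


if __name__ == "__main__":
    body = ["line %d" % i for i in range(1000)]
    tests = [
        (10, 3.0, lambda b: any("999" in line for line in b)),  # expensive
        (1, 5.0, lambda b: len(b) > 100),                       # cheap
    ]
    print(score_message(body, tests))
```

If some tests carry negative scores, the early exit would have to wait until the remaining tests could no longer pull the total back under the threshold, so treat this as a starting point only.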

Dan

_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
