RE: SARE_FRAUD vs SURBLs (Was: RE: Mass-check errors)

Smart,Dan 9 Sep 2004 16:40:00 -0000

Chris, I followed the process documented at
http://wiki.apache.org/spamassassin/ProfilingRulesWithDprof


I used DProf with the spamassassin script itself, because I couldn't get
DProf to work with mass-check without a segmentation fault.

For testing, I created a Maildir of messages that took longer than 30
seconds to scan.  My goal was to figure out why these were so slow.
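
A minimal sketch of how such a test set could be collected, assuming one
message per file and a scanner that reads the message on stdin. The
function name collect_slow, the SCANNER variable, and the directory names
are hypothetical and only serve the illustration; this is not necessarily
how the original test Maildir was built.

```shell
#!/bin/sh
# Sketch: copy every message that takes longer than a threshold to scan
# into a separate directory, so it can be profiled later.
# collect_slow and SCANNER are hypothetical names, not SpamAssassin tools.

# Command that reads one message on stdin (assumed default: spamassassin).
SCANNER="${SCANNER:-spamassassin}"

collect_slow() {
    src_dir=$1      # directory holding one message per file
    dest_dir=$2     # directory that receives the slow messages
    threshold=$3    # whole seconds; keep messages slower than this

    mkdir -p "$dest_dir"
    for msg in "$src_dir"/*; do
        [ -f "$msg" ] || continue
        start=$(date +%s)
        # SCANNER is deliberately unquoted so it may carry arguments.
        $SCANNER < "$msg" > /dev/null 2>&1
        end=$(date +%s)
        elapsed=$((end - start))
        if [ "$elapsed" -gt "$threshold" ]; then
            cp "$msg" "$dest_dir"/
            echo "$elapsed $msg"    # report elapsed seconds and the file
        fi
    done
}

# Example call (paths are placeholders):
#   collect_slow "$HOME/Maildir/cur" ./slow-msgs 30
```

Timing uses whole seconds from date, which is coarse but adequate for a
30-second cutoff.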

I then ran
perl -d:DProf /usr/bin/spamassassin < testfile
and then ran the profiler as described on the wiki.
dprofpp complained about a garbled profile, which forced me to use the -F
option to make it run.
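
The two steps above can be wrapped in a small helper. This is a sketch
under assumptions: the function name profile_message is hypothetical, the
default script path is the one used in this message, and Devel::DProf
(which provides dprofpp) is assumed to be installed. The profiler writes
its data to tmon.out in the current directory.

```shell
#!/bin/sh
# Sketch: profile one message with Devel::DProf, then summarize the result.
# profile_message is a hypothetical helper, not part of SpamAssassin.

profile_message() {
    msg=$1                              # message file to scan
    sa=${2:-/usr/bin/spamassassin}      # script to profile (assumed path)

    if [ ! -r "$msg" ]; then
        echo "cannot read message: $msg" >&2
        return 2
    fi
    if ! command -v perl > /dev/null 2>&1 ||
       ! command -v dprofpp > /dev/null 2>&1; then
        echo "perl or dprofpp not found" >&2
        return 3
    fi

    # Run the scan under the profiler; the data lands in ./tmon.out.
    perl -d:DProf "$sa" < "$msg" > /dev/null || return 1

    # -F forces dprofpp to proceed when it reports a garbled profile.
    dprofpp -F tmon.out
}

# Example call (the file name is a placeholder):
#   profile_message testfile
```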

I have a tool called RegexBuddy (www.regexbuddy.com) that I will use for
testing.  I would be happy to furnish any results.

Jeff:
Thanks for the feedback.  I think I will take a look at SARE_FRAUD to see
why it's slow.


<<Dan>>


 

>  -----Original Message-----
>  From: Chris Santerre [mailto:[EMAIL PROTECTED] 
>  Sent: Thursday, September 09, 2004 10:27 AM
>  To: [EMAIL PROTECTED]
>  Subject: RE: SARE_FRAUD vs SURBLs (Was: RE: Mass-check errors)
>  
>  
>  
>  >-----Original Message-----
>  >From: Jeff Chan [mailto:[EMAIL PROTECTED]
>  >Sent: Thursday, September 09, 2004 3:07 AM
>  >To: [EMAIL PROTECTED]
>  >Subject: SARE_FRAUD vs SURBLs (Was: RE: Mass-check errors)
>  >
>  >
>  >On Wednesday, September 8, 2004, 7:12:26 AM, Smart,Dan Smart,Dan wrote:
>  >> What I found was that the Textcat language rules was main time-sink,
>  >> followed by the SARE_FRAUD ruleset.  Since SURBL now has the PH list, I
>  >> removed the FRAUD ruleset too.
>  >
>  >Dan,
>  >SARE_FRAUD has rules to catch text patterns in messages.  It does not
>  >look for phishing URI domains and IP addresses.  Therefore PH and
>  >SARE_FRAUD are not equivalent, and you may want to keep using the SARE
>  >rule, even if you are using PH in multi.surbl.org.
>  >
>  
>  
>  Ahhhh I missed this thread some how. So something in 
>  SARE_FRAUD is causing a slowdown? I've sent this to the 
>  ninjas. I will also look at this. I'm not familiar with 
>  Dprof at all. Time for another project I guess :) 
>  
>  We are also still working on some eval things. Just throwing 
>  around ideas.
>  Is it generally better to take a ruleset of say 30 avg size 
>  rules, and turn it into an eval? Does it gain, lose, or make 
>  no difference on performance?
>  
>  Also Dan, if you would be interested in doing performance 
>  testing on SARE stuff.....your Kung Fu looks pretty goood ;)
>  
>  Back on this topic, I think Dan is doing a trade off. 
>  Knowing that SARE_FRAUD and PH.surbl hit different things, 
>  yet same type spam, he is opting for the faster SURBL. 
>  
>  --Chris
>  
>
