> Hi,
> I am losing confidence in SA: the training process is
> either very slow or it doesn't seem to be learning at all.
> I am training SA with around 30-50 manually identified
> spam messages (I move spam into a spam folder created in
> squirrelmail, and cron runs the sa-train command on that
> folder every hour to train on the messages and delete them).
> 
> I tested the script from the shell and it worked before I
> put it in cron.
> 
> However, I found that the learning process is either not
> right or it is rather slow. 
> 
> I went through the headers of the spam and found that
> even almost identical (in content) messages always get a
> score of 0.1, and these were received on separate
> occasions across several days. This has made me lose
> confidence in SA.
> 
> I wonder whether I have it set up correctly to detect and
> learn spam. I am using the default setup from qmail-toaster
> cnt50; do I need more filters to harden my defense? Any
> recommendations will be appreciated.
> 
> Here are samples I took from my mailbox on this
> server
> (e.g., sample spam 1 and 8 are almost identical in content
> but both scored only 0.1). : (
> 
> http://www.keac.com/id3303/spam-egs.txt

Mail #1 scores as follows here:

Content preview:  == US Drugstore == Voted as No.1 US pharmacy on Internet
  Over 80 meds on our online store We accept Visa, Master Card, JCB, Dinner
  & eCheck [...]

Content analysis details:   (17.4 points, 5.0 required)

 pts rule name              description
---- ---------------------- --------------------------------------------------
 3.0 RCVD_IN_XBL            RBL: Received via a relay in Spamhaus XBL
                            [68.243.81.116 listed in zen.spamhaus.org]
 0.9 RCVD_IN_PBL            RBL: Received via a relay in Spamhaus PBL
 2.0 RCVD_IN_BL_SPAMCOP_NET RBL: Received via a relay in bl.spamcop.net
               [Blocked - see <http://www.spamcop.net/bl.shtml?68.243.81.116>]
 1.2 INVALID_DATE           Invalid Date: header (not RFC 2822)
 1.2 TO_MALFORMED           To: has a malformed address
 4.0 BOTNET                 Relay might be a spambot or virusbot
[botnet0.8,ip=68.243.81.116,rdns=68-243-81-116.area7.spcsdns.net,maildomain=mediafutures.org,client,ipinhostname]
 1.0 BAYES_60               BODY: Bayesian spam probability is 60 to 80%
                            [score: 0.6572]
 0.1 RDNS_NONE              Delivered to trusted network by a host with no rDNS
 4.0 JM_SOUGHT_2            JM_SOUGHT_2
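
The total in that report is just the sum of the per-rule points, compared against the required score. A minimal Python sketch of that arithmetic, assuming the report layout shown above; the regex and the names REPORT, RULE_LINE and parse_rules are my own illustration, not part of SpamAssassin:

```python
import re

# Rule lines copied from the report above; bracketed detail lines are
# included only to show that the parser skips them.
REPORT = """\
 3.0 RCVD_IN_XBL            RBL: Received via a relay in Spamhaus XBL
                            [68.243.81.116 listed in zen.spamhaus.org]
 0.9 RCVD_IN_PBL            RBL: Received via a relay in Spamhaus PBL
 2.0 RCVD_IN_BL_SPAMCOP_NET RBL: Received via a relay in bl.spamcop.net
 1.2 INVALID_DATE           Invalid Date: header (not RFC 2822)
 1.2 TO_MALFORMED           To: has a malformed address
 4.0 BOTNET                 Relay might be a spambot or virusbot
 1.0 BAYES_60               BODY: Bayesian spam probability is 60 to 80%
                            [score: 0.6572]
 0.1 RDNS_NONE              Delivered to trusted network by a host with no rDNS
 4.0 JM_SOUGHT_2            JM_SOUGHT_2
"""

REQUIRED = 5.0  # "5.0 required" in the report header

# Hypothetical pattern: a points value, then an upper-case rule name.
RULE_LINE = re.compile(r"^\s*(-?\d+\.\d+)\s+([A-Z][A-Z0-9_]+)\s")


def parse_rules(report):
    """Return (rule_name, points) pairs for every rule line in the report."""
    rules = []
    for line in report.splitlines():
        match = RULE_LINE.match(line)
        if match:
            rules.append((match.group(2), float(match.group(1))))
    return rules


rules = parse_rules(REPORT)
# Round to one decimal to hide floating-point noise in the sum.
total = round(sum(points for _, points in rules), 1)
is_spam = total >= REQUIRED

print(f"total={total:.1f} required={REQUIRED:.1f} spam={is_spam}")
```

The nine rules add up to the 17.4 points reported, and the network tests (RCVD_IN_*, BOTNET) and JM_SOUGHT_2 account for most of it; BAYES_60 contributes only 1.0.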

