Re: sa-learn and modern spam sizes

Duane Hill Tue, 20 Dec 2011 07:39:33 -0800

On Tuesday, December 20, 2011 at 15:26:06 UTC, bowie_bai...@buc.com 
confabulated:


> On 12/19/2011 4:03 AM, Jonas wrote:
>>>> I've never seen spam larger than 3 MB.
>>> which is much bigger than the 256 kB limit in sa-learn that the OP is 
>>> having a
>>> problem with.
>> Indeed, of course I agree the avg. spam size is much much lower.
>>
>> But a lot of the "manual" spam, typically originating in asia where people 
>> send out spam through Hotmail/gmail can be 1-3MB in size. Most of these are 
>> electronics or textile oriented "business" offers.
>>
>> And my problem remains, our setup is based on MailScanner (a daemon like 
>> amavis-new) which doesn't use spamc/spamd so I'm unable to train my bayes on 
>> these 1MB+ size spams, which is a problem.
>>
>> So can I conclude that there's no real solution to this besides code change?
>>
>> Should I open a bug about it?

> If you're using mailscanner, why not ask on the mailscanner list?

> http://www.mailscanner.info/support.html#mailing

I don't believe this is a mailscanner issue. I have also found that
sa-learn does not learn anything from messages over a certain size. It's
been a while (a month) since I've seen large spam, so I never bothered
to look into it.

mailhost# ls -l notspam.msg
-rw-r--r--  1 duane  duane  363516 Dec 20 15:35 notspam.msg

mailhost# sa-learn --ham --username=du...@virtualnetwork.com notspam.msg
Learned tokens from 0 message(s) (0 message(s) examined)

If I chop everything out of the file except the message headers and rerun
the same command:

mailhost# sa-learn --ham --username=du...@virtualnetwork.com notspam.msg
Learned tokens from 1 message(s) (1 message(s) examined)
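
The file size above fits the 256 kB limit mentioned earlier in the thread:
363516 bytes is more than 256 * 1024 = 262144 bytes. A rough sketch for
checking whether a message would be over that limit, assuming the cutoff is
exactly 262144 bytes (that figure comes from this thread, I have not
confirmed it in sa-learn itself; check_size is just a helper name I made up):

```shell
#!/bin/sh
# Assumed cutoff: 256 kB = 256 * 1024 bytes. This value is taken from the
# thread, not verified against sa-learn.
LIMIT=$((256 * 1024))

# check_size FILE
# Prints "skip" if FILE is larger than LIMIT bytes, "learn" otherwise.
check_size() {
    # wc -c pads its output with spaces on some systems; strip them.
    size=$(wc -c < "$1" | tr -d ' ')
    if [ "$size" -gt "$LIMIT" ]; then
        echo "skip"
    else
        echo "learn"
    fi
}
```

With the function loaded, "check_size notspam.msg" would print "skip" for
the 363516-byte file above.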

-- 
If at first you don't succeed...
...so much for skydiving.
