On Tuesday, December 20, 2011 at 15:26:06 UTC, bowie_bai...@buc.com confabulated:
> On 12/19/2011 4:03 AM, Jonas wrote:
>>>> I've never seen spam larger than 3 MB.
>>> which is much bigger than the 256 kB limit in sa-learn that the OP is
>>> having a problem with.
>> Indeed, of course I agree the avg. spam size is much much lower.
>>
>> But a lot of the "manual" spam, typically originating in Asia where people
>> send out spam through Hotmail/gmail can be 1-3MB in size. Most of these are
>> electronics or textile oriented "business" offers.
>>
>> And my problem remains, our setup is based on MailScanner (a daemon like
>> amavis-new) which doesn't use spamc/spamd so I'm unable to train my bayes on
>> these 1MB+ size spams, which is a problem.
>>
>> So can I conclude that there's no real solution to this besides code change?
>>
>> Should I open a bug about it?
> If you're using mailscanner, why not ask on the mailscanner list?
> http://www.mailscanner.info/support.html#mailing

I don't believe this is a mailscanner issue. I have also found that sa-learn
does not learn anything from messages over a certain size. It's been a while
(a month) since I've seen large spam, so I never bothered.

mailhost# ls -l notspam.msg
-rw-r--r-- 1 duane duane 363516 Dec 20 15:35 notspam.msg
mailhost# sa-learn --ham --username=du...@virtualnetwork.com notspam.msg
Learned tokens from 0 message(s) (0 message(s) examined)

If I chop everything out except the message headers:

mailhost# sa-learn --ham --username=du...@virtualnetwork.com notspam.msg
Learned tokens from 1 message(s) (1 message(s) examined)

--
If at first you don't succeed...
...so much for skydiving.
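
For anyone who wants to check which messages in a training folder are likely to be skipped this way, here is a minimal sketch in Python. It only compares file sizes against the 256 kB figure mentioned earlier in the thread, read here as 256 * 1024 bytes. That exact threshold, and the names ASSUMED_LIMIT_BYTES and oversized_messages, are assumptions made for illustration and are not taken from sa-learn itself.

```python
import os
import tempfile

# Assumed cutoff: the "256 kB" figure quoted in the thread, read as 256 * 1024 bytes.
# The exact value and comparison sa-learn uses are not confirmed here.
ASSUMED_LIMIT_BYTES = 256 * 1024


def oversized_messages(paths, limit=ASSUMED_LIMIT_BYTES):
    """Return (path, size) pairs for files whose size in bytes exceeds limit."""
    flagged = []
    for path in paths:
        size = os.path.getsize(path)
        if size > limit:
            flagged.append((path, size))
    return flagged


if __name__ == "__main__":
    # Demo with two throwaway files: one the size of notspam.msg from the
    # transcript (363516 bytes) and one small message.
    with tempfile.TemporaryDirectory() as workdir:
        big = os.path.join(workdir, "big.msg")
        small = os.path.join(workdir, "small.msg")
        with open(big, "wb") as handle:
            handle.write(b"x" * 363516)
        with open(small, "wb") as handle:
            handle.write(b"x" * 1000)
        for path, size in oversized_messages([big, small]):
            print(f"{os.path.basename(path)}: {size} bytes exceeds {ASSUMED_LIMIT_BYTES}")
```

Under that assumption, the 363516-byte notspam.msg above would be flagged, since 256 * 1024 is 262144 bytes. That is consistent with the "0 message(s) examined" output, though it does not prove the size check is the cause.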