On Wed, 2010-10-06 at 14:38 -0700, durwood wrote:
> > Because it *is* filed already. Please first search bugzilla, then open
> > a bug report.
>
> Pinging this thread to see if there's been any progress or decision on this
> bug.

Wow, that thread's more than a year old. :)  A lot of folks are likely to
miss this follow-up, somewhere deep in the mail folder.

> I too am starting to see quite a bit of spam that's *just* over the 500k
> threshold due to ~4K-sized image attached to the spam. It almost makes me
> wonder if they are doing this just to get it over the standard SpamAssassin
> threshold.

If it's just over the threshold, why not just raise it? It's a default,
but configurable for a reason.

To reiterate some related statements from that thread and others since:

With a higher threshold, overall memory consumption by SA will rise as
well. However, with attachments like that, only a very few rules will be
affected at all, namely rawbody rules. Anything else will never even get
to see that data blob.

IMHO, it should be perfectly safe in most cases to raise the threshold,
if you're seeing spam just over it. Like, say, 600k? 750k?

It's a trade-off, and you have to decide. However, think about it. How
many messages (both spam and ham) do you receive over 500k a day? Over
1M? Would the overhead be worth it to catch that spam, or would taking
care of them manually be ok?

What *is* that overhead anyway? Even if you are wasting cycles on one
larger ham per user per day (those between old and new threshold) after
raising the threshold, would you even notice the additional load? Would
you notice 10 of them? Is your system that maxed out?

> It seems like the size limit should be applied to the searchable parts of
> the email, not any attached images.

This is rather unlikely to happen. There is *no* size limit in SA. There
is, however, one in the lightweight spamc client.

Not taking binary attachments into account would require spamc to
understand and parse the MIME structure. Granted, there are good libs
for that out there -- but the overhead, code-wise and as a build
dependency, is non-trivial.
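
To put rough numbers on the "how many messages over 500k a day" question,
a quick sketch like the one below might help. It is only an illustration:
the function name, the sample sizes and the 750k candidate limit are all
made up, 1k is assumed to be 1024 bytes, and it assumes you can already
pull one day's message sizes (in bytes) out of your logs somehow.

```python
# Rough sketch: estimate what raising the spamc size limit would change.
# Everything here (names, sample sizes, limits) is hypothetical.

def threshold_impact(sizes, old_limit, new_limit):
    """Return (newly_scanned, still_skipped, extra_bytes).

    A message is assumed to be skipped when it is larger than the limit,
    so messages with old_limit < size <= new_limit are the ones that
    would newly be handed to spamd.
    """
    newly_scanned = [s for s in sizes if old_limit < s <= new_limit]
    still_skipped = [s for s in sizes if s > new_limit]
    return len(newly_scanned), len(still_skipped), sum(newly_scanned)


if __name__ == "__main__":
    KB = 1024  # assumption: "k" means 1024 bytes
    # Made-up sizes for one day of mail, in bytes.
    sample = [12 * KB, 480 * KB, 504 * KB, 510 * KB, 700 * KB, 2048 * KB]
    scanned, skipped, extra = threshold_impact(sample, 500 * KB, 750 * KB)
    print(f"newly scanned: {scanned}, still skipped: {skipped}, "
          f"extra bytes per day: {extra}")
```

If the "newly scanned" count turns out to be a handful per day, the extra
load is probably not worth worrying about. The limit itself is a spamc
setting (the -s / max-size option, if memory serves), so check the spamc
man page for your version before changing it.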

-- 
char *t="\10pse\0r\0dtu...@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}