On Wed, 2009-09-09 at 02:21 +0200, Mark Martinec wrote:
> On Tuesday September 8 2009 21:23:42 Jason Haar wrote:
> > Actually, it's HAM - not spam. In the end it's really become clear it
> > shows limitations in perl's parsing power - so either we get gruntier
> > boxes - or increase the timeout. We've gone with the latter.
> 
> Some regexps do perform terribly when given a large chunk of text
> with multiple matching opportunities. Some monolithic HTML with

Ah, good point, Mark -- that reminds me of the infamous issue of
unbounded or nested quantifiers in RE rules. In some pathological cases,
I've debugged these and found them to be the culprit that brought SA to
its knees.
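
To illustrate what I mean by nested quantifiers, here is a minimal sketch.
It uses Python's backtracking `re` engine as a stand-in, which is an
assumption: Perl's engine has its own optimizations and may behave
differently on this exact pattern. The patterns and the helper name
`time_search` are made up for the example and are not taken from any real
SA rule.

```python
import re
import time

# Hypothetical rule body with a nested quantifier. When the trailing 'b'
# is missing, the engine tries many ways of splitting the run of 'a's
# between the inner and outer '+' before giving up.
NESTED = re.compile(r"(a+)+b")

# Same set of matching strings, but without the nesting.
FLAT = re.compile(r"a+b")


def time_search(pattern, text):
    """Return (match_found, elapsed_seconds) for one search over text."""
    start = time.perf_counter()
    found = pattern.search(text) is not None
    return found, time.perf_counter() - start


if __name__ == "__main__":
    for n in (12, 14, 16, 18):
        text = "a" * n  # no 'b' anywhere, so both patterns must fail
        _, nested_secs = time_search(NESTED, text)
        _, flat_secs = time_search(FLAT, text)
        print(f"n={n:2d}  nested={nested_secs:.6f}s  flat={flat_secs:.6f}s")
```

On a backtracking engine the time for the nested pattern typically grows
much faster with n than the flat one, which is why a single such rule can
dominate scan time on a large message.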

Any custom rules? Do you still see the same timing when disabling them
temporarily? Might be worth a shot.


> nested tables is one such example. If you still have the sample message,
> it would be interesting to try it on the current 3.3 code. The main
> difference could come from the fact that the current code splits
> mail text into smaller chunks and does not allow a rule regexp to
> work on an entire mail block. In some corner cases this brings a
> significant speedup, while on most of the rest it makes no difference.
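
The chunking idea Mark describes can be sketched roughly as follows. This
is only an illustration in Python, not the actual 3.3 code: the chunk size
of 2048, the preference for breaking at whitespace, and the names
`split_into_chunks` and `rule_hits` are all assumptions of mine.

```python
import re


def split_into_chunks(text, max_len=2048):
    """Split text into pieces of at most max_len characters.

    Prefers to break just after the last space or newline inside each
    window; falls back to a hard cut when the window has no whitespace.
    """
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + max_len, len(text))
        if end < len(text):
            cut = max(text.rfind(" ", start, end), text.rfind("\n", start, end))
            if cut > start:
                end = cut + 1  # keep the whitespace with the earlier chunk
        chunks.append(text[start:end])
        start = end
    return chunks


def rule_hits(pattern, text, max_len=2048):
    """Return True if the compiled pattern matches within any single chunk."""
    return any(pattern.search(chunk) for chunk in split_into_chunks(text, max_len))
```

Because each search only ever sees a bounded amount of text, a badly
behaved regexp has far less room to backtrack. The trade-off in this sketch
is that a match spanning a chunk boundary is missed.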

-- 
char *t="\10pse\0r\0dtu...@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}
