On Tuesday September 8 2009 21:23:42 Jason Haar wrote:
> Actually, it's HAM - not spam. In the end it's really become clear it
> shows limitations in perl's parsing power - so either we get gruntier
> boxes - or increase the timeout. We've gone with the latter.

Some regexps do perform terribly when given a large chunk of text with multiple matching opportunities. Monolithic HTML with nested tables is one such example.

If you still have the sample message, it would be interesting to try it on the current 3.3 code. The main difference could come from the fact that the current code splits mail text into smaller chunks and does not allow a rule regexp to work on an entire mail block. In some corner cases this brings a significant speedup, while in most other cases it makes no difference.

  http://people.apache.org/~jm/mcsnapshot.tgz

Mark
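
To illustrate the chunking idea mentioned above, here is a rough sketch in Python. It is not the actual SpamAssassin code (which is Perl). The function names `split_into_chunks` and `rule_hits` and the 1024-character limit are made up for illustration. The sketch cuts the text into bounded pieces, preferring to break at whitespace, and runs the rule regexp on each piece separately, so no single match attempt ever sees the whole mail body.

```python
import re


def split_into_chunks(text, max_len=1024):
    """Split text into pieces of at most max_len characters.

    Breaks at the last space or newline inside each window when one
    exists, so words are not cut in half. max_len is a hypothetical
    value chosen for this sketch, not a SpamAssassin setting.
    """
    chunks = []
    start = 0
    n = len(text)
    while start < n:
        end = min(start + max_len, n)
        if end < n:
            # Find the last whitespace character inside the window.
            cut = max(text.rfind(" ", start, end), text.rfind("\n", start, end))
            if cut > start:
                end = cut + 1  # keep the whitespace with the current chunk
        chunks.append(text[start:end])
        start = end
    return chunks


def rule_hits(pattern, text, max_len=1024):
    """Return True if the pattern matches inside any single chunk."""
    rx = re.compile(pattern)
    return any(rx.search(chunk) for chunk in split_into_chunks(text, max_len))
```

The trade-off in this sketch is that a match straddling a chunk boundary is missed. In exchange, the amount of text any one regexp attempt can backtrack over is capped by the chunk size.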