I had been thinking about a "multiple-rule" format for rules, where a
rule matches only if every regex in a sequence matches, e.g.:

rawbody ASCII_FORM_ENTRY       /_{30,}/
and rawbody ASCII_FORM_ENTRY  /[^<][A-Za-z][A-Za-z]+.{1,15}?\s+_{30,}/

The "and" prefix on a rule means the regex is added as a further
requirement instead of overriding the earlier definition, and evaluation
would of course short-circuit: once one regex fails, the rest are
skipped.  I'll file this in Bugzilla too.
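
A rough sketch of how the "and" prefix could behave, written in Python
only for illustration (SpamAssassin itself is Perl). The parse_rules and
rule_matches names are hypothetical, and regex flags after the closing
slash are not handled:

```python
import re

# The two example rules from this message, kept verbatim.
RULE_TEXT = r"""
rawbody ASCII_FORM_ENTRY       /_{30,}/
and rawbody ASCII_FORM_ENTRY  /[^<][A-Za-z][A-Za-z]+.{1,15}?\s+_{30,}/
"""

def parse_rules(text):
    """Map each rule name to the list of regexes that must all match."""
    rules = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        chained = line.startswith("and ")
        if chained:
            line = line[4:].lstrip()
        _kind, name, pattern = line.split(None, 2)
        regex = re.compile(pattern.strip().strip("/"))
        if chained:
            # "and": add a requirement to the existing rule
            rules.setdefault(name, []).append(regex)
        else:
            # no prefix: a new definition overrides any earlier one
            rules[name] = [regex]
    return rules

def rule_matches(rules, name, body):
    # all() stops at the first regex that fails, so the expensive
    # second pattern only runs when the cheap first one has matched.
    return all(rx.search(body) for rx in rules[name])
```

With these rules, a body containing "Your name: " followed by 30
underscores matches, while a body with no long underscore run is
rejected by the cheap first regex alone.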

C

On Thu, 2002-02-21 at 12:37, Arpi wrote:
> Hi,
> 
> > > > rawbody ASCII_FORM_ENTRY        /[^<][A-Za-z][A-Za-z]+.{1,15}?\s+_{30,}/
> 
> > > [^<] means "any character except '<'".
> 
> > anyway, it explains why this regexp is so slow :(
> > it partially matches at every character position of the text, and only
> > at the end (_{30,}) does the match turn out to fail...
> 
> ok. i've developed a solution that gives 3 times faster (!!!) execution!
> (and much more will come when i finish it)
> 
> the trick is very simple: assign a single fixed word to the few regexps
> where it can help. currently i'm using:
> 
> strstr ASCII_FORM_ENTRY         ____________________
> strstr COMMUNIGATE              CommuniGate
> strstr WANTS_CREDIT_CARD        credit
> strstr ASKS_BILLING_ADDRESS     billing
> strstr CYBER_FIRE_POWER         FirePower
> strstr HR_3113                  3113
> strstr WORK_AT_HOME             HOME
> strstr MAILTO_LINK              mailto
> strstr YOUR_INCOME              income
> strstr BE_AMAZED                amazed
> strstr ITS_EFFECTIVE            effective
> 
> my code executes the (slow) regexp matching ONLY if the input text
> (header/body/rawbody...) contains the assigned word.
> it's very useful for regexps that contain a fixed word but don't begin
> with a rare fixed char. for example, it reduced the execution time
> of ASCII_FORM_ENTRY from >1ms (sometimes >3s) to <0.1ms!
> 
> I don't know how usable/possible this is in the perl version, but i want
> to use such acceleration in the C version. So, would you accept a patch
> to the ruleset adding such fields (i called it 'strstr', though it's not
> a good name) and commit it to CVS?
> 
> anyway a different syntax should be introduced to distinguish between
> case sensitive and insensitive word matching.
> (strstr and stristr or maybe strstr /word/i ?)
> 
> 
> A'rpi / Astral & ESP-team
> 
> --
> Developer of MPlayer, the Movie Player for Linux - http://www.MPlayerHQ.hu
> 
> _______________________________________________
> Spamassassin-talk mailing list
> [EMAIL PROTECTED]
> https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
> 
> 
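
The word pre-check described in the quoted message could look roughly
like this. It is a sketch in Python, not Arpi's C code; the PREFILTERED
table, the prefiltered_match name and the ignore_case flag are
assumptions made for illustration. Only ASCII_FORM_ENTRY is shown,
because its regex is the only one given in this thread:

```python
import re

# rule name -> (fixed word that must be present, slow regex)
PREFILTERED = {
    "ASCII_FORM_ENTRY": (
        "____________________",
        re.compile(r"[^<][A-Za-z][A-Za-z]+.{1,15}?\s+_{30,}"),
    ),
}

def prefiltered_match(name, text, ignore_case=False):
    """Run the slow regex only if the cheap substring test passes."""
    word, regex = PREFILTERED[name]
    if ignore_case:
        # stands in for the proposed 'stristr' variant; the regex
        # itself is left untouched
        found = word.lower() in text.lower()
    else:
        found = word in text
    if not found:
        return False  # regex is never executed for this text
    return regex.search(text) is not None
```

The substring test is a necessary condition only: a text made of
nothing but underscores passes the pre-check and is then rejected by
the regex, so results are the same as running the regex alone.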

