On 21 Feb 2002, Craig Hughes wrote:
> > could someone please explain what [^<] matches?
> > afaik ^ means beginning-of-line but it's strange in [] character array.
> > so, what does ^ mean there? begin-of-line or '^' character?
> > i think it's beg-of-line, as PCRE couldn't optimize this re
On Thu, 2002-02-21 at 20:53, Craig Hughes wrote:
> On Thu, 2002-02-21 at 10:22, Arpi wrote:
[Original regexp]
> > > FOR_INSTANT_ACCESS:
> > > /(?:CLICK HERE|).{0,20}\s+INSTANT\s+ACCESS.{0,20}\s+(?:|CLICK HERE)/i
> I think
> body FOR_INSTANT_ACCESS /INSTANT ACCESS/i
> is fine by itself. I
> I don't want to spend much time making the patch, unless it goes immediately
> into CVS, as keeping it sync with CVS for weeks/months is a nightmare...
> If I have to do the fork&sync way, i'll fork everything and redesign ruleset
> syntax to better fit my needs for the C version...
Rules tend
Hi,
> On Thu, 2002-02-21 at 13:42, Arpi wrote:
> > when will it be implemented, or better: when will you accept such patch for
> > ruleset? (i cannot modify the perl code, as i don't know the perl language
> > nor the spamassassin core enough, but i could help making this optimization
> > to th
Ok, so this thread got me to go read through man perlre in a little more
detail. I've found the following as a result:
PerMsgStatus.pm uses $& and $', which apparently will cause *all* regex
matching to be much slower program wide. I'll try to rewrite the one
line on which that occurs; we shoul
Heh, yeah. My syntax would make it seem that it would allow that. And
I agree that allowing that would be better. But allowing that would
mean more coding ;) I'll probably do it anyway...
C
On Thu, 2002-02-21 at 14:20, Arpi wrote:
> Hi,
>
> > On 21 February 2002, Craig Hughes said:
> > > I
This syntax makes the rule parse more complicated, given the way it
works now. Though it is a little nicer because it makes it clearer that
something like:
rawbody A /rule1/
and header A /rule2/
will not work as expected.
C
On Thu, 2002-02-21 at 13:40, Greg Ward wrote:
> On 21 February 200
On Thu, 2002-02-21 at 13:42, Arpi wrote:
> when will it be implemented, or better: when will you accept such patch for
> ruleset? (i cannot modify the perl code, as i don't know the perl language
> nor the spamassassin core enough, but i could help making this optimization
> to the ruleset)
You ca
On 21 February 2002, Arpi said:
> anyway, i have a request:
> could you add a new rule type, for plain text matches?
> searching for a text string is always simpler and faster than for regexps,
> and many of your regexps are such strings (/some words/i) and there will be
> much more when start add
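
A minimal sketch of the plain-text rule idea, in Python rather than the Perl or C discussed in the thread. The helper name `contains_plain` is hypothetical, and the example assumes ASCII text; whether a substring search actually beats a literal regex depends on the engine, so no timing is claimed here.

```python
import re


def contains_plain(body, phrase):
    """Case-insensitive substring test; no regex engine involved."""
    return phrase.lower() in body.lower()


body = "Click here for INSTANT ACCESS to your account"

# For a literal phrase, the substring test and the /.../i regex agree.
print(contains_plain(body, "instant access"))
print(bool(re.search(r"instant access", body, re.IGNORECASE)))
```

A rule such as /some words/i contains no regex operators, so it could be routed to a check like this instead of the regex engine.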
Hi,
> On 21 February 2002, Craig Hughes said:
> > I had been thinking about creating a "multiple-rule" format for rules,
> > where in order to match a rule, you would have to match a sequence of
> > regexes, eg:
> >
> > rawbody ASCII_FORM_ENTRY /_{30,}/
> > and rawbody ASCII_FORM_ENTRY /[
On 21 February 2002, Craig Hughes said:
> I had been thinking about creating a "multiple-rule" format for rules,
> where in order to match a rule, you would have to match a sequence of
> regexes, eg:
>
> rawbody ASCII_FORM_ENTRY /_{30,}/
> and rawbody ASCII_FORM_ENTRY /[^<][A-Za-z][A-Za-z]
Hi,
> I had been thinking about creating a "multiple-rule" format for rules,
> where in order to match a rule, you would have to match a sequence of
> regexes, eg:
>
> rawbody ASCII_FORM_ENTRY /_{30,}/
> and rawbody ASCII_FORM_ENTRY /[^<][A-Za-z][A-Za-z]+.{1,15}?\s+_{30,}/
>
> the "and"
I had been thinking about creating a "multiple-rule" format for rules,
where in order to match a rule, you would have to match a sequence of
regexes, eg:
rawbody ASCII_FORM_ENTRY /_{30,}/
and rawbody ASCII_FORM_ENTRY /[^<][A-Za-z][A-Za-z]+.{1,15}?\s+_{30,}/
the "and" prefix on a rule mean
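
One way the proposed format could be evaluated is sketched below in Python. This is not SpamAssassin's implementation: the `MULTI_RULES` table and the `rule_matches` helper are hypothetical, and the sketch assumes a rule fires only when every regex in its sequence matches.

```python
import re

# Hypothetical in-memory form of the two-line rule quoted above:
# the rule fires only if every regex in its list matches the body.
MULTI_RULES = {
    "ASCII_FORM_ENTRY": [
        re.compile(r"_{30,}"),  # cheap pre-check
        re.compile(r"[^<][A-Za-z][A-Za-z]+.{1,15}?\s+_{30,}"),  # expensive check
    ],
}


def rule_matches(name, body):
    """Return True if all regexes of the named rule match; stops at the first miss."""
    return all(rx.search(body) for rx in MULTI_RULES[name])


form = "Name: " + "_" * 40
no_underscores = "X" * 200  # long run of one character, no underscores

print(rule_matches("ASCII_FORM_ENTRY", form))
print(rule_matches("ASCII_FORM_ENTRY", no_underscores))
```

Because `all()` stops at the first regex that fails, a body with no run of 30 underscores never reaches the expensive second pattern.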
On Thu, 2002-02-21 at 10:22, Arpi wrote:
> Hi,
>
> > I've run my C version through your really big spam collection at night, and
> > filtered out 'slow' messages. Then I've checked which regexps make them so
> > slow (slow means 5..25 secs/mail on p4 1.8ghz).
>
> more on this...
>
> > FOR_INSTA
Hi,
> > > rawbody ASCII_FORM_ENTRY/[^<][A-Za-z][A-Za-z]+.{1,15}?\s+_{30,}/
> > [^<] means "any character except '<'".
> anyway, it explains why this regexp is so slow :(
> it partially matches at every character position of text, and only at the
> end (_{30,}) does it turn out to be a bad match..
Slightly more accurately, ^ as the *first* character inside [] means
"not". Later in the [] it means a literal ^.
C
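
The three meanings of ^ can be checked quickly. The sketch below uses Python's re module as a stand-in for Perl/PCRE, on the assumption that all three treat ^ the same way for these simple patterns; the variable names are made up for illustration.

```python
import re

# '^' as the first character inside [] negates the class.
not_lt = re.compile(r"[^<]")
# '^' anywhere else inside [] is a literal caret.
lt_or_caret = re.compile(r"[<^]")
# '^' outside [] anchors the match at the start of the string.
starts_with_a = re.compile(r"^a")

print(bool(not_lt.fullmatch("x")))       # any single character other than '<'
print(bool(not_lt.fullmatch("<")))
print(bool(lt_or_caret.fullmatch("^")))
print(bool(lt_or_caret.fullmatch("x")))
print(bool(starts_with_a.search("ba")))
```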
On Thu, 2002-02-21 at 10:42, Charlie Watts wrote:
> On Thu, 21 Feb 2002, Arpi wrote:
>
> > rawbody ASCII_FORM_ENTRY/[^<][A-Za-z][A-Za-z]+.{1,15}?\s+_{30,}/
> >
> > could someone pleas
Hi,
> On Thu, 21 Feb 2002, Arpi wrote:
>
> > rawbody ASCII_FORM_ENTRY/[^<][A-Za-z][A-Za-z]+.{1,15}?\s+_{30,}/
> >
> > could someone please explain what [^<] matches?
> > afaik ^ means beginning-of-line but it's strange in [] character array.
> > so, what does ^ mean there? begin-of
On Thu, 21 Feb 2002, Arpi wrote:
> rawbody ASCII_FORM_ENTRY/[^<][A-Za-z][A-Za-z]+.{1,15}?\s+_{30,}/
>
> could someone please explain what [^<] matches?
> afaik ^ means beginning-of-line but it's strange in [] character array.
> so, what does ^ mean there? begin-of-line or '^' char
Hi,
> I've run my C version through your really big spam collection at night, and
> filtered out 'slow' messages. Then I've checked which regexps make them so
> slow (slow means 5..25 secs/mail on p4 1.8ghz).
more on this...
> FOR_INSTANT_ACCESS:
> /(?:CLICK HERE|).{0,20}\s+INSTANT\s+ACCESS.{0,
Hi,
I've run my C version through your really big spam collection at night, and
filtered out 'slow' messages. Then I've checked which regexps make them so
slow (slow means 5..25 secs/mail on p4 1.8ghz).
Most 'slow' mails have many (>1000) repeats of a single char
(X...XXX