Hello Bob,

Saturday, October 18, 2003, 9:04:18 AM, you responded to my query:

>> > spamassassin --lint
>> returns with no errors on the two 2.55 systems. However, starting with
>> the migration to 2.60 on the third system, lint with the same user_prefs
>> gives:
>> > Quantifier unexpected on zero-length expression before HERE mark in regex
>> > m/free\b?of\b?charge << HERE / at (no file), rule RM_bp_FreeOfCharge, line 1.
>> The rule which this applies to seems to be:
>> > body     RM_bp_FreeOfCharge  /free\b?of\b?charge/i
>> > describe RM_bp_FreeOfCharge  Body mentions something free of charge
>> > score    RM_bp_FreeOfCharge  0.10

BA> Do you mean /free\s?of\s?charge/i instead?

BA> \b is a word boundary; like ^ and $, it's a "zero-width assertion",
BA> looking for "a spot between two characters that has a '\w' on one
BA> side of it and a '\W' on the other side of it" (from `perldoc perlre`).

As a word boundary, wouldn't \b also match ".", ",", and "/"?

My email client's regular expression help offers:
> Regular-Expression Assertions
> Assertion  Matches            Example   Matches     Doesn't Match
> ^          Start of string    ^fool     foolish     Tomfoolery
> $          End of string      fool$     April fool  Foolish
> \b         Word boundary      be\bside  be side     Beside
> \B         Nonword boundary   be\Bside  beside      be side
This documentation may simply be wrong, but going by it, my
understanding is that e\b?o should match "e o", "e.o", "e-o", and
"eo" (the last when there is no word boundary). Am I reading this
wrong?
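
One way to check this reading is a small sketch in Python, whose `re`
module treats \b the same way Perl does for these ASCII cases. Python is
used here only for illustration, and the sample string is made up. The
sketch marks every position where \b succeeds, then tries the help
table's be\bside example.

```python
import re

# Mark each position where \b succeeds. The assertion matches a position
# between characters; it never consumes the "." or " " next to it.
marked = re.sub(r"\b", "|", "free.of charge")
print(marked)  # |free|.|of| |charge|

# The help table's example: "be", then a boundary, then "side" right away.
# In "be side" the boundary after "be" is followed by a space, not "s",
# and in "beside" there is no boundary between "e" and "s" at all.
table_example = re.compile(r"be\bside")
print(bool(table_example.search("be side")))  # False
print(bool(table_example.search("beside")))   # False
```

Under these semantics the table's "be side" entry cannot match, so
either the client uses a different engine or its help text is in error.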

BA> '\b' doesn't make sense here, with or without the '?', since 'e\bo'
BA> and 'f\bc' can never match (e, o, f, c are all matched by \w, so you
BA> have \w on both sides of the \b). Someone please correct me on this,
BA> but it looks like this regex is equivalent to /freeofcharge/i -- it
BA> looks like lint is doing the right thing here.
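
The claim that 'e\bo' can never match is easy to test. The following is
a minimal sketch in Python rather than Perl (the two agree on \b for
these ASCII strings); the sample strings and variable names are
illustrative assumptions, not taken from any rule file.

```python
import re

# "e", then a word boundary, then "o" with nothing in between. A boundary
# needs \w on one side and \W (or a string edge) on the other, but here
# both neighbours are word characters.
between_letters = re.compile(r"e\bo")

samples = ["eo", "e o", "e.o", "e-o"]
results = {s: bool(between_letters.search(s)) for s in samples}
print(results)  # every value is False

# \b does its job at the edge of a word instead:
whole_word = re.compile(r"\bfree\b", re.I)
print(bool(whole_word.search("Free of charge")))  # True
print(bool(whole_word.search("freedom")))         # False
```

In "e o", "e.o", and "e-o" there is a boundary after the "e", but the
separator character still sits between "e" and "o", and \b cannot
consume it.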

BA> You probably want
BA>   /free\s+of\s+charge/i
BA> or
BA>   /free\s{0,10}of\s{0,10}charge/i  (similar to /free\s*of\s*charge/i)
BA> The difference between \b and \s tripped me up for a long time because I
BA> kept thinking 'blank' not 'boundary' when I saw '\b'.

The problem is that I don't want to match just blanks; I also want the
other characters that can separate words: periods, commas, dashes, etc.
If \b is wrong, then maybe what I want is /free\W?of\W?charge/i or
/free\W{0,10}of\W{0,10}charge/i?
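
To compare the candidates, here is a rough sketch in Python (used as a
stand-in for Perl, which handles \s and \W the same way on these ASCII
strings). The dictionary keys, the helper function, and the sample
phrases are all made up for illustration.

```python
import re

# Candidate bodies for the rule, keyed by hypothetical labels.
candidates = {
    "one_space": r"free\s?of\s?charge",
    "one_nonword": r"free\W?of\W?charge",
    "upto10_nonword": r"free\W{0,10}of\W{0,10}charge",
}

samples = [
    "free of charge",
    "free-of-charge",
    "free.of.charge",
    "freeofcharge",
    "free  of  charge",    # two spaces in each gap
    "free - of - charge",  # three separator characters in each gap
    "free_of_charge",      # underscore counts as \w, not \W
]

def matching_samples(pattern):
    """Return the samples that the pattern matches, ignoring case."""
    return [s for s in samples if re.search(pattern, s, re.I)]

hits = {name: matching_samples(p) for name, p in candidates.items()}
for name, matched in hits.items():
    print(name, matched)
```

On these samples \W? covers single punctuation separators but misses
gaps of more than one character, while \W{0,10} also covers the longer
gaps. Neither matches "free_of_charge", because "_" belongs to \w; a
class such as [\W_] would be needed to include it.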

Thanks.

Bob Menschel



_______________________________________________
Spamassassin-talk mailing list
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk