> -----Original Message-----
> From: Jason Tiller [mailto:[EMAIL PROTECTED]]
> Sent: Friday, September 14, 2001 5:41 PM
> To: '[EMAIL PROTECTED]'
> Subject: RE: Matching a pattern only once
>
>
> Hello, Again, Bob, :)
>
> On Fri, 14 Sep 2001, Bob Showalter wrote:
>
> > You can use look-behind assertion:
> >
> > /(?<!~)~$/
>
> > Which means, match a tilde, not preceded by a tilde, anchored to the
> > end of the string. This will match:
>
> > foo~
> > ~
> >
> > But not:
> >
> > foo~~
> > ~~
>
> I'm trying to understand what this means (I'm a beginner!). I've gone
> through the perl RE tutorial (perldoc perlretut) and my head is now
> spinning like a top out of control. Sheesh, these things are
> complicated!!
>
> So, I'm trying to figure out the difference between:
>
> /[^~]~$/
>
> and
>
> /(?<!~)~$/
Ok, but remember that I suggested /(?:^|[^~])~$/ as an alternate to
the look-behind assertion. /[^~]~$/ is *not* equivalent.
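
To make the difference concrete, here is a rough sketch that runs all
three patterns over the same four test strings. The helper name
matching_strings is made up for illustration; it is not part of the
original script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original thread): returns the
# strings from the list that match the given compiled pattern.
sub matching_strings {
    my ($re, @strings) = @_;
    return grep { $_ =~ $re } @strings;
}

my @tests = ( "a~", "a~~", "~", "~~" );

# Label/pattern pairs, kept in an array so the output order is fixed.
my @patterns = (
    [ '[^~]~$',       qr/[^~]~$/       ],
    [ '(?<!~)~$',     qr/(?<!~)~$/     ],
    [ '(?:^|[^~])~$', qr/(?:^|[^~])~$/ ],
);

for my $p (@patterns) {
    my ($label, $re) = @$p;
    my @hits = matching_strings($re, @tests);
    printf "%-14s matches: %s\n", $label, join(", ", @hits);
}
```

On these four strings the first pattern should report only "a~", while
the other two should each report "a~" and "~".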
>
> From what I've read from the perlretut, (?<!~) is a zero-length
> assertion, right?
Yes.
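
One way to see the "zero-length" part is to look at the text each
pattern actually consumes. This is only a sketch; matched_text is a
hypothetical helper that uses the built-in @- and @+ offset arrays to
pull out the matched portion.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns the portion of $string consumed by the
# match, or undef if the pattern does not match at all.
sub matched_text {
    my ($re, $string) = @_;
    return undef unless $string =~ $re;
    # $-[0] and $+[0] are the start and end offsets of the last match.
    return substr($string, $-[0], $+[0] - $-[0]);
}

for my $p ( [ '[^~]~$', qr/[^~]~$/ ], [ '(?<!~)~$', qr/(?<!~)~$/ ] ) {
    my ($label, $re) = @$p;
    printf "%-10s consumed '%s'\n", $label, matched_text($re, 'a~');
}
```

The character class has to consume the "a" as part of the match, so it
reports 'a~'. The look-behind only inspects the preceding character
without consuming it, so it reports just '~'.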
>
> To test these, I ran the following script:
>
> #!/usr/bin/perl
>
> @a = ( "a~", "a~~", "~", "~~" );
>
> foreach ( @a ) {
> print "[^~] match: $_\n" if /[^~]~$/;
> print "(?<!~) match: $_\n" if /(?<!~)~$/;
> }
>
> The output is:
>
> [^~] match: a~
> (?<!~) match: a~
> (?<!~) match: ~
>
> The first regexp has three parts to it:
>
> [^~]~$
> ^ ^^
> | ||
> 1 23
>
> When matching with this regexp, perl walks through the string looking
> for part 1 (non-tilde); when it finds a non-tilde character, then it
> looks to match part 2 by ensuring that the next character *is* a
> tilde. Finally, if part 2 matches, then it looks to make sure there
> are no characters following, which is part 3 ($ - end string anchor).
Yes, essentially.
>
> So, "~" and "~~" don't match because perl can't match part 1, which is
> the non-tilde character - there *aren't* any non-tilde characters in
> the string. However, I gather that in this case we *want* "~~" to
> match, so this regexp doesn't suit our needs.
I assume you mean that you want '~' to match, which this regex
does not do. That is why I suggested /(?:^|[^~])~$/ as an alternative,
and not /[^~]~$/.
>
> The second regexp has three parts to it as well:
>
> (?<!~)~$
> ^ ^^
> | ||
> 1 23
>
> This is very similar, but part 1 is different. "(?<!~)" is a "negated
> lookbehind zero-length assertion", which is a kind of anchor (right?).
> In this case, when walking through the string looking for a match,
> perl looks for part *2* first, *not* part 1. In other words, perl
> first looks for a tilde. When it matches part 2 (finds a tilde), it
> looks to see that there are no more characters in the string (part 3).
> *Then* perl looks to match part 1, which says that the character
> before the tilde matched in part 2 must *not* be a tilde.
Yes, something like that.
>
> Thus "a~" and "~" match, but "~~" does not.
Correct.
>
> Bob, is my summary correct? I'm just trying to get a handle on this.
> It's obvious that Perl RE's are *incredibly* powerful but there seem
> to be so many things to remember...
Well, I've been making my share of mistakes here lately, but as far
as I can tell, your analysis is right on. The exact algorithm that Perl
uses is not known to me, but the results are as you described.
Now, can you tell why '~' is matched by
/(?:^|[^~])~$/ ?