Jan Eden wrote:

John W. Krahn wrote on 23.03.2005:

This should work (untested):

while ($content =~ m#<h1>(.+?)</h1>(.+?)(?=<h1>|\z)#gs) {

and thanks. I tried Offer Kaye's first guess, too, and I think I can explain why it does not work.

If you make the lookahead optional, the regex will try to match as
few characters as possible for the second pair of parentheses - and
since the lookahead is optional, this will be only a single character.

You have to force a positive lookahead assertion to make sure $2
receives everything up to either the next <h1> or the end of the
string.
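The difference can be seen in a small sketch. The sample string is made up for illustration, and since the earlier optional-lookahead pattern is not quoted in this thread, it is approximated here by a lookahead with an empty alternative, (?=<h1>|), which is likewise allowed to match nothing:

```perl
use strict;
use warnings;

# Hypothetical sample input, not taken from the original thread.
my $content = "<h1>One</h1>alpha text<h1>Two</h1>beta text";

my (@forced, @optional);

# Forced lookahead: $2 must grow until the next <h1> or the end of the string.
while ($content =~ m#<h1>(.+?)</h1>(.+?)(?=<h1>|\z)#gs) {
    push @forced, "$1: $2";
}

# Lookahead that may match nothing (assumed stand-in for the optional one):
# the non-greedy (.+?) stops after a single character.
while ($content =~ m#<h1>(.+?)</h1>(.+?)(?=<h1>|)#gs) {
    push @optional, "$1: $2";
}

print "forced:   $_\n" for @forced;
print "optional: $_\n" for @optional;
```

With the forced version, each heading is paired with its full body text; with the other, each heading is paired with only the first character after it.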

So the other suggestion works. Thank you! The reason I had not tried
that was the wrong assumption that alternations in
lookahead/lookbehind assertions had to be of the same length, like
in (?=abc|def), but not (?=abc|defg). But now I remember that the
whole lookahead/lookbehind has to be of a fixed length, so you cannot
use quantifiers.

Lookahead CAN use quantifiers, but lookbehind CANNOT.
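A rough sketch of the distinction, using made-up sample strings. The lookbehind pattern is kept in a string and compiled at run time, so that the compile failure can be trapped with eval instead of stopping the script:

```perl
use strict;
use warnings;

# Lookahead containing quantifiers (\s* and \d+): this is accepted.
my ($label) = "price: 100 USD" =~ /(\w+)(?=:\s*\d+)/;

# Fixed-length lookbehind: this is accepted as well.
my ($amount) = "USD100" =~ /(?<=USD)(\d+)/;

# Lookbehind with an unbounded quantifier: rejected when the pattern is
# compiled. The exact error text differs between Perl versions.
my $pattern  = '(?<=\d+)x';
my $compiled = eval { qr/$pattern/ };
my $error    = $@;

print "label:  $label\n";
print "amount: $amount\n";
print defined $compiled
    ? "lookbehind compiled\n"
    : "lookbehind rejected: $error";
```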


John -- use Perl; program fulfillment
