Offer Kaye wrote on 23.03.2005:

>Change your RE to: m#<h1>(.+?)</h1>(.+?)(?=<h1>|$)#gs
>
>In other words, look ahead to either a <h1> or the end of the string
>("$"). I have to admit this problem wasn't as simple as I initially
>thought - I still have no idea why my first guess didn't work:
>m#<h1>(.+?)</h1>(.+?)(?=<h1>)?#gs
>
>Maybe someone with more knowledge of REs can answer?

John W. Krahn wrote on 23.03.2005:

>This should work (untested)
>
>while ($content =~ m#<h1>(.+?)</h1>(.+?)(?=<h1>|\z)#gs) {

Hi, and thanks.

I tried Offer Kaye's first guess, too, and I think I can explain why it does not work. If you make the lookahead optional, the regex tries to match as few characters as possible for the second pair of parentheses. Since (.+?) is non-greedy and the optional lookahead is allowed to match zero times, the overall match succeeds as soon as $2 holds a single character. You have to force the lookahead assertion to make sure $2 receives everything up to either the next <h1> or the end of the string.

So the other suggestion works. Thank you! The reason I had not tried that was the wrong assumption that alternatives in lookahead/lookbehind assertions had to be of the same length, as in (?=abc|def) but not (?=abc|defg). But now I remember that this restriction applies only to lookbehind, which has to be of a fixed length, so you cannot use variable quantifiers there; a lookahead can contain alternatives of different lengths.

Thanks again,
Jan
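
P.S. To make the difference concrete, here is a minimal sketch that runs both patterns from this thread over the same input. The sample string in $content and the helper sections() are made up for illustration; they are not from the original script.

```perl
use strict;
use warnings;

# Hypothetical sample input; not from the original thread.
my $content = "<h1>One</h1>first body<h1>Two</h1>second body";

# Hypothetical helper: collect [heading, body] pairs for a given pattern.
sub sections {
    my ($re, $text) = @_;
    my @found;
    while ($text =~ /$re/g) {
        push @found, [$1, $2];
    }
    return @found;
}

# First guess: the lookahead is optional, so the lazy (.+?) stops early.
my $optional = qr#<h1>(.+?)</h1>(.+?)(?=<h1>)?#s;

# Working version: the lookahead must succeed at <h1> or at end of string.
my $forced = qr#<h1>(.+?)</h1>(.+?)(?=<h1>|\z)#s;

for my $case (['optional lookahead', $optional], ['forced lookahead', $forced]) {
    my ($label, $re) = @$case;
    print "$label:\n";
    print "  [$_->[0]] [$_->[1]]\n" for sections($re, $content);
}

# Lookahead accepts alternatives of different lengths and quantifiers.
print "variable-width lookahead ok\n"
    if "xdefg" =~ /x(?=abc|defg)/ and "x123y" =~ /x(?=\d+y)/;
```

With this input, the optional version should leave only "f" and "s" in $2, while the forced version captures "first body" and "second body". Note that "$" and "\z" are not quite the same: without /m, "$" also matches just before a newline at the very end of the string, so with "$" a trailing newline would be left out of the last $2.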