On Jan 26, Mark Maunder said:

>I'm matching html using regex and use something like this to grab a
>chunk of text up to the next html tag:
>
><font>([^<]+)</font>
>
>But I'd like to say "match everything that does not include the string
><br>" rather than "match everything that does not include a "<"
>character". Anyone got any suggestions?

First, I don't suggest using regexes to parse HTML.

What you want, though, is (note the /x modifier, which lets the pattern
contain whitespace for readability):

  m{
    <font>
    ( (?: [^<]+ | < (?!/font>) )* )
    </font>
  }x

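The middle part of that regex says "match either 'one or more non-<
characters' or 'a < that is not followed by /font>', zero or more times".
So the capture may contain other tags, such as <br>, but it stops at the
first </font>.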
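As a rough illustration of how that pattern behaves, here is a minimal sketch. The sample string in $html and the variable names are made up for this example and are not from the original question. The last statement applies the same negative-lookahead idea with <br> as the stop string, which is one possible reading of what was asked.

```perl
use strict;
use warnings;

# Hypothetical sample input, invented for illustration only.
my $html = '<font>first line<br>second <b>bold</b> line</font><font>next</font>';

# The pattern from above, compiled once with /x so whitespace is ignored.
my $re = qr{
    <font>
    ( (?: [^<]+ | < (?!/font>) )* )   # text, or a '<' that does not start </font>
    </font>
}x;

# In list context with /g, the match returns every captured chunk.
my @chunks = $html =~ /$re/g;
print "$_\n" for @chunks;

# Same technique with a different stop string (an assumption about the
# original goal): capture the text after <font> up to the next <br>.
my ($before_br) = $html =~ m{ <font> ( (?: [^<]+ | < (?!br>) )* ) }x;
print "$before_br\n";
```

With this input, the first match keeps the inner <br> and <b> tags in the capture, while the second stops just before <br>. It is still only a sketch: attributes on the font tag, nested font tags, or different letter case would all need extra handling, which is part of why a real HTML parser is the safer choice.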

-- 
Jeff "japhy" Pinyan      [EMAIL PROTECTED]      http://www.pobox.com/~japhy/
RPI Acacia brother #734   http://www.perlmonks.org/   http://www.cpan.org/
<stu> what does y/// stand for?  <tenderpuss> why, yansliterate of course.
[  I'm looking for programming work.  If you like my work, let me know.  ]

