Re: regexp & html-tags

John W. Krahn Sat, 05 Jul 2003 12:54:49 -0700

Michele Marcionelli wrote:
> 
> Hello Beginners,

Hello,


> I was looking for a mailing list about REGEXP but I didn't find it. Maybe
> there is somebody here that can help me...
> 
> Suppose you have the following string
> 
>     $str = "... <b>John <i>Smith</i> (male)</b> and <b>Elisabeth <i>Jones</i> (female)</b> ..."
> 
> and that you want to find out the container of the first B-tag, that is you
> want to get the string "John <i>Smith</i> (male)". Now, if I use the
> following regexp
> 
>     $str =~ m/<b>(.*)<\/b>/i;
> 
> I get as result, i.e. $1:
> 
>     "John <i>Smith</i> (male)</b> and <b>Elisabeth <i>Jones</i> (female)"
> 
> and not what I wanted!!

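That is because the quantifiers ?, *, + and {n,m} are greedy: .* matches
as much as it can, so the match runs on to the last </b> in the string.
Append a ? to the quantifier to make it non-greedy (??, *?, +? and
{n,m}?), so that it stops at the first </b>: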

      $str =~ m!<b>(.*?)</b>!i;
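
To see the difference side by side, here is a small sketch that runs both
patterns against the string from the question. The variable names
($greedy, $lazy, @all) are mine, and the sample string is written on one
line; treat it as an illustration rather than a finished solution.

```perl
use strict;
use warnings;

# Sample string from the question, written on one line.
my $str = '... <b>John <i>Smith</i> (male)</b> and '
        . '<b>Elisabeth <i>Jones</i> (female)</b> ...';

# Greedy: .* grabs everything, then backtracks to the LAST </b>.
my ($greedy) = $str =~ m!<b>(.*)</b>!i;

# Non-greedy: .*? stops at the FIRST </b> that lets the match succeed.
my ($lazy) = $str =~ m!<b>(.*?)</b>!i;

# With /g in list context, every <b>...</b> container is returned.
my @all = $str =~ m!<b>(.*?)</b>!gi;

print "greedy: $greedy\n";
print "lazy:   $lazy\n";
print "all:    $_\n" for @all;
```

Note that . does not match a newline by default, so if the text between the
tags can span several lines you would also want the /s modifier.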

See perldoc perlre for the details.

If you need to parse HTML you should really install a module designed to
parse HTML, because regular expressions do not cope well with nested
tags, attributes or comments.

http://search.cpan.org/search?query=parse+HTML&mode=module
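
As a rough illustration of where the non-greedy pattern breaks down, the
sketch below feeds it two made-up inputs (both are my own examples, not
from the original question): a <b> nested inside another <b>, and a <b>
tag that carries an attribute.

```perl
use strict;
use warnings;

# Hypothetical input: a <b> nested inside another <b>.
my $nested = '<b>outer <b>inner</b> tail</b>';
my ($got) = $nested =~ m!<b>(.*?)</b>!i;

# The match ends at the inner </b>, so the container is cut short.
print "nested: $got\n";

# Hypothetical input: the opening tag carries an attribute.
my $attr = '<b class="name">John</b>';
my $matched = ($attr =~ m!<b>(.*?)</b>!i) ? 1 : 0;

# The literal <b> in the pattern does not match <b class="name">.
print "attribute matched: $matched\n";
```

You can patch the pattern for each case, but a real HTML parser handles
these without special treatment.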


John
-- 
use Perl;
program
fulfillment
