Hi,

Don't use regular expressions to match HTML. Use an HTML parser instead.

use HTML::TreeBuilder;

# $html_content holds the full text of the HTML page.
my $tree = HTML::TreeBuilder->new_from_content($html_content);

# Find the <div id="product" class="product"> container.
my $div = $tree->look_down( _tag => 'div', id => 'product', class => 'product' );

# Find the table with class "prodc" inside that div.
my $table = $div->look_down( _tag => 'table', class => 'prodc' );

# Here you can get the table components like:
my @tr = $table->look_down( _tag => 'tr' );

for my $tr ( @tr ) {
    my @td = $tr->look_down( _tag => 'td' );
    # Skip rows that contain no <td> cells (for example, header rows).
    print $td[0]->as_text, "\n" if @td;
}

You can also do much more complex searches for HTML elements using HTML::TreeBuilder.

Read:

perldoc HTML::TreeBuilder
perldoc HTML::Element

--Octavian

----- Original Message -----
From: mimic...@gmail.com
To: beginners@perl.org
Sent: Tuesday, November 18, 2014 10:22 PM
Subject: Match HTML <div> ...... </dv> string over multiple

I am trying to extract a table (<table class="xxxx"><tr><td>...... until </table>) and its content from an HTML file.

With the file I have something like this:

<div id="product" class="product">
<table border="0" cellspacing="0" cellpadding="0" class="prodc" title="Product ">
.
.
.
</table>
</div>

There could be more than one table in the file. However, I am only interested in the table within <div id="product" class="product"> </div>.

/^.*<div id="product" class="product">.+?(<table border="0".+?\s+<\/table>)\s*<\/div>.*$/ims

The above and the various variations I tried do not match. I am able to easily match this using sed; however, I need to try using Perl.

This sed works just fine:

sed -n '/<div id="product" class="product">/,/<\/table>/p' thelo826.html |sed -n '/<table border.*/,/<\/table>/p'| sed -e 's/class=".*"//g'

Thanks
Mimi
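
P.S. Since the original question asks for the table itself rather than the text of its cells, here is a minimal sketch of how the HTML::TreeBuilder approach from the reply could return the table markup. It assumes HTML::TreeBuilder (from the HTML-Tree distribution on CPAN) is installed. The helper name extract_product_table and the sample page are made up for illustration; they are not part of the module or of the original file.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Return the HTML of the "prodc" table inside <div id="product">,
# or undef when the div or the table is missing.
# (extract_product_table is a hypothetical helper name.)
sub extract_product_table {
    my ($html) = @_;

    my $tree = HTML::TreeBuilder->new_from_content($html);

    # In scalar context look_down returns the first match, or undef.
    my $div   = $tree->look_down( _tag => 'div', id => 'product' );
    my $table = $div ? $div->look_down( _tag => 'table', class => 'prodc' ) : undef;

    # Serialize the table before the tree is deleted.
    my $out = $table ? $table->as_HTML : undef;

    $tree->delete;    # free the parse tree explicitly
    return $out;
}

# Made-up sample page: one unrelated table and the wanted one.
my $sample = <<'HTML';
<html><body>
<table class="other"><tr><td>ignored</td></tr></table>
<div id="product" class="product">
<table border="0" class="prodc" title="Product ">
<tr><td>Widget</td><td>9.99</td></tr>
</table>
</div>
</body></html>
HTML

my $table_html = extract_product_table($sample);
print defined $table_html ? "$table_html\n" : "No product table found\n";
```

Checking each look_down result before using it avoids a "Can't call method on an undefined value" error when a page has no product div or no matching table. The markup returned by as_HTML is re-serialized by the parser, so it may differ from the original source in whitespace and attribute order.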