[EMAIL PROTECTED] wrote:
> ################ TEXT FILE ##################
> <td class="PhorumTableRowAlt thread" style="padding-left: 0px">
>
> <a href="http://mysite.com/link/here_goes?id=239">LINK</a>
>
> <span class="PhorumNewFlag"></span></td>
>
> <td class="PhorumTableRowAlt" nowrap="nowrap" width="150">
> <a href="http://mysite.com/link/here_goes?id=239">LINK</a> </td>
> <td class="PhorumTableRowAlt PhorumSmallFont" nowrap="nowrap"
> width="150">06/11/2007 12:29AM
> </td>
> </tr>
> ############################################
>
> The text file contains hundreds of tds structure like above. All I need is to
> extract the td with class "PhorumTableRowAlt thread". I have tried every
> possible option, but finally I am coming to you for any Regex for it? TIA.
>
> HERE IS WHAT I AM DOING:
>
> open(TXT, "links.txt") or die "Unable to open file";
> my @links = <TXT>;
> close (TXT);
> foreach my $link(@links) {
> if ($link =~ m|<td class="PhorumTableRow thread" style="padding-left:
> 0px">(.*?)</td>|gsi) {
> print "$1";}
> }
>
> But NOTHING coming up. No results.
>
> Thanks for any help.
>
> Sara.

Parsing HTML with regexes is a bad idea. Try a module from CPAN instead.
I've had good luck with HTML::TokeParser::Simple:

http://search.cpan.org/perldoc?HTML::TokeParser::Simple

http://danconia.org
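
For illustration, here is a minimal sketch of how the suggested module might be used for this task. The helper name extract_cells and the inline sample string are assumptions made for this example, not part of the original post, and the sketch assumes the target cells do not contain nested tables. It walks the token stream and collects everything between a matching <td> start tag and the next </td>.

```perl
use strict;
use warnings;
use HTML::TokeParser::Simple;

# Hypothetical helper: return the inner HTML of every <td> whose
# class attribute is exactly $class. Nested <td> elements are not handled.
sub extract_cells {
    my ($html, $class) = @_;

    # A scalar reference tells the parser to read from the string itself.
    my $parser = HTML::TokeParser::Simple->new(\$html);

    my (@cells, $inside, $buffer);
    while (my $token = $parser->get_token) {
        if (!$inside
            && $token->is_start_tag('td')
            && ($token->get_attr('class') // '') eq $class) {
            # Start collecting at the opening tag of a matching cell.
            ($inside, $buffer) = (1, '');
        }
        elsif ($inside && $token->is_end_tag('td')) {
            # The closing tag ends the cell; keep what was collected.
            push @cells, $buffer;
            $inside = 0;
        }
        elsif ($inside) {
            # as_is returns the token exactly as it appeared in the input.
            $buffer .= $token->as_is;
        }
    }
    return @cells;
}

# Sample input modelled on the quoted file (assumed, shortened).
my $sample = <<'HTML';
<tr>
<td class="PhorumTableRowAlt thread" style="padding-left: 0px">
<a href="http://mysite.com/link/here_goes?id=239">LINK</a>
<span class="PhorumNewFlag"></span></td>
<td class="PhorumTableRowAlt" nowrap="nowrap" width="150">
<a href="http://mysite.com/link/here_goes?id=239">LINK</a> </td>
</tr>
HTML

print "$_\n" for extract_cells($sample, 'PhorumTableRowAlt thread');
```

To run it against the real file, the whole of links.txt would be read into one string first and passed in place of $sample. As a side note, the posted pattern looks for class "PhorumTableRow thread" while the file shows "PhorumTableRowAlt thread", and the loop tests one line at a time, so a cell that spans several lines would not match in any case.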