--- Adriano Allora <[EMAIL PROTECTED]> wrote:
> I didn't understand how to use the module HTML, but I need to count how
> many tags of several types appear in a web page and so I wrote this
> script.
>
> Someone can tell me why this one doesn't work?
>
> %tags = ("paragraph" => "p",
>          "list_o" => "ol",
>          "list_no" => "ul",
>          "title" => "h1",
>          "ltl_title" => "h2|3|4|5",
>          "link" => "href");
>
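One possible snag in the quoted hash is the "h2|3|4|5" value. The rest of the original script is not shown, so this is only a sketch under the assumption that the value was interpolated into a pattern match. In a Perl regex, alternation binds loosely, so that string means "h2" or "3" or "4" or "5" anywhere in the name, not "h followed by 2 through 5". The sample names and the %result hash below are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample tag names (illustrative only, not taken from the original script).
my @names = qw(h1 h2 h3 h5 3 col3);

# The string from the quoted hash used as a regex: it reads as
# "h2" OR "3" OR "4" OR "5", matched anywhere in the name.
my $loose = qr/h2|3|4|5/;

# An anchored character class: exactly "h" followed by one digit from 2 to 5.
my $strict = qr/^h[2-5]$/;

my %result;
for my $name (@names) {
    $result{$name} = {
        loose  => ( $name =~ $loose  ? 1 : 0 ),
        strict => ( $name =~ $strict ? 1 : 0 ),
    };
    printf "%-5s loose=%d strict=%d\n",
        $name, $result{$name}{loose}, $result{$name}{strict};
}
```

The loose pattern does accept h2 through h5, but it also accepts any name that merely contains a 3, 4, or 5, which is why an anchored pattern is the safer choice when matching tag names.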
First, I would suggest that you're trying to count two different things,
tags and attributes. You may wish to separate them.

The following code will do what you want. It uses the
HTML::TokeParser::Simple module to make this relatively easy to read.
Whether or not the data structures are the best way to handle this is
another story.

    #!/usr/bin/perl
    use strict;
    use warnings;

    use HTML::TokeParser::Simple 3.13;

    # Read the sample HTML from the __DATA__ section at the end of the file.
    my $parser = HTML::TokeParser::Simple->new( handle => \*DATA );

    # Tags to count. A name may be a plain string or a compiled regex.
    my %tag_for = (
        "paragraph" => { name => "p",          count => 0 },
        "list_o"    => { name => "ol",         count => 0 },
        "list_no"   => { name => "ul",         count => 0 },
        "title"     => { name => "h1",         count => 0 },
        "ltl_title" => { name => qr/h[2345]/,  count => 0 },
    );

    # Attributes to count, kept separate from the tags.
    my %attribute_for = ( "link" => { name => "href", count => 0 } );

    # get_tag skips text and returns only tag tokens.
    while ( my $token = $parser->get_tag ) {
        foreach my $tag ( keys %tag_for ) {
            if ( $token->is_start_tag( $tag_for{$tag}{name} ) ) {
                $tag_for{$tag}{count}++;
                last;
            }
        }
        foreach my $attribute ( keys %attribute_for ) {
            if ( $token->get_attr( $attribute_for{$attribute}{name} ) ) {
                $attribute_for{$attribute}{count}++;
                last;
            }
        }
    }

    foreach my $type ( keys %tag_for ) {
        printf "%10s %3d\n", $type, $tag_for{$type}{count};
    }
    print "\n";
    foreach my $type ( keys %attribute_for ) {
        printf "%10s %3d\n", $type, $attribute_for{$type}{count};
    }

    __DATA__
    <head></head>
    <body>
    <h1>title</h1>
    <p>One P tag</p>
    <ul>
      <li>item</li>
    </ul>
    <h2>Little title 1</h2>
    <h2>Little title 2</h2>
    <h3>Little title 3</h3>
    <a href="foo.html">asdf</a>
    </body>

And the output:

        list_o   0
       list_no   1
         title   1
     ltl_title   3
     paragraph   1

          link   1

Cheers,
Ovid

--
If this message is a response to a question on a mailing list, please send
follow up questions to the list.

Web Programming with Perl -- http://users.easystreet.com/ovid/cgi_course/
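A footnote on the output listed in the reply: Perl does not guarantee the order in which keys comes back from a hash, so the rows may print in a different order on another run or another perl. If a stable report matters, the keys can be sorted first. This is only a sketch: it reuses the shape of %tag_for from the script, hard-codes the counts from the sample run instead of parsing anything, and the @ordered array is a name introduced here for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Same shape as %tag_for in the script above. The counts are hard-coded
# from the sample run (an assumption standing in for a real parse).
my %tag_for = (
    paragraph => { name => 'p',         count => 1 },
    list_o    => { name => 'ol',        count => 0 },
    list_no   => { name => 'ul',        count => 1 },
    title     => { name => 'h1',        count => 1 },
    ltl_title => { name => qr/h[2345]/, count => 3 },
);

# Sort by count, highest first, and break ties alphabetically by label.
my @ordered = sort {
    $tag_for{$b}{count} <=> $tag_for{$a}{count} or $a cmp $b
} keys %tag_for;

printf "%10s %3d\n", $_, $tag_for{$_}{count} for @ordered;
```

Sorting on the label alone (sort keys %tag_for) would work just as well when the counts themselves do not need to drive the order.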