Hi Raito,

On Sat, 26 Nov 2011 23:31:06 -0600
Raito Garcia <saintar...@gmail.com> wrote:

> hi
> 
> Well today i have another dude, I have a HTML file like this content:
> 
> 
> </td></tr>
> <tr><td colspan="2"><hr></td></tr>
> <tr><td>
> <span class="host_info">Remote host information</span><br><table
> align="center" border="0" width="60%">
> <tbody><tr>
> <td align="left">Operating System : </td>
> <td align="right">Windows 7 Enterprise</td>
> </tr>
> <tr>
> <td align="left">NetBIOS name : </td>
> <td align="right">GUSR712DPO16125</td>
> </tr>
> <tr><td align="left">DNS name : </td></tr>
> </tbody></table>
> 
> 

You shouldn't parse HTML with regexes. See:

* <perlbot> rindolf: Don't parse or modify html with regular expressions!
  See one of HTML::Parser's subclasses: HTML::TokeParser,
  HTML::TokeParser::Simple, HTML::TreeBuilder(::Xpath)?, HTML::TableExtract
  etc. If your response begins "that's overkill. i only want to..." you are
  wrong. http://en.wikipedia.org/wiki/Chomsky_hierarchy and
  http://xrl.us/bf4jh6 for why not to use regex on HTML

* http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454

* http://www.codinghorror.com/blog/2009/11/parsing-html-the-cthulhu-way.html

* http://perl-begin.org/uses/text-parsing/
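
For example, here is a minimal sketch of how the NetBIOS name could be
extracted with HTML::TokeParser (one of the HTML::Parser subclasses, which you
need to install from CPAN). The function name host_info_value is made up for
illustration, and the code assumes that the value sits in the <td> cell right
after the cell holding the label, as in your sample:

```perl
use strict;
use warnings;

use HTML::TokeParser;

# Hypothetical helper: returns the text of the <td> cell that follows the
# <td> cell whose text is "$label :", or undef if no such cell is found.
sub host_info_value
{
    my ($html, $label) = @_;

    # A reference to a scalar makes the parser read from that string.
    my $parser = HTML::TokeParser->new(\$html)
        or die "Cannot create the HTML parser";

    while ($parser->get_tag('td'))
    {
        # Text up to the next tag, with whitespace collapsed and trimmed.
        my $text = $parser->get_trimmed_text();
        $text =~ s/\s*:\z//;

        next if $text ne $label;

        # Assumption: the value is in the next <td> of the document.
        return if !$parser->get_tag('td');
        return $parser->get_trimmed_text();
    }

    return;
}
```

It can be called as host_info_value($html, 'NetBIOS name') after reading the
whole file into $html. Note that for a label without a value cell (like "DNS
name" in your sample) this sketch would pick up whatever <td> comes next, so
real code should also check the row boundaries.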

Use a parser. Now to answer your question in its context (and to comment on
your code).

> My code is this:
> 
> #!/usr/bin/perl
> 

Always add "use strict;" and "use warnings;" at the top of the code.

> #Variables
> my $line="";
> my $aux=0;

Don't predeclare your variables at the top - declare them where you first use
them.

> 
> #REGEX
> 
> open(H,"Report.html") || die "No se puede abrir el archivo:$!";

1. Use three-args open.

2. Use lexical file handles.

See:

http://perl-begin.org/tutorials/bad-elements/
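
As a rough sketch, the opening of the file could look like this with a
three-argument open and a lexical file handle. The function name read_lines and
the variable names are only illustrative:

```perl
use strict;
use warnings;

# Illustrative helper: reads a file and returns its lines without the
# trailing newlines.
sub read_lines
{
    my ($filename) = @_;

    # Three-argument open: the mode ('<' for reading) is separate from the
    # file name, and $fh is a lexical handle instead of the global H.
    open(my $fh, '<', $filename)
        or die "Cannot open '$filename': $!";

    my @lines = <$fh>;
    close($fh);

    chomp(@lines);

    return @lines;
}
```

Note the use of "or" instead of "||" after open: it has lower precedence, so it
also does the right thing if you omit the parentheses around open's arguments.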

>  while($line=<H>){
>         chomp($line);
>         if ($line =~ /<td align\=\"left\">NetBIOS\ name\ :\ <\/td>/){

You don't have to escape =, " and whitespace. If you have "/"s in the string,
you can use a different delimiter:

        if ($line =~ m{...})

Also, it seems you're looking for a fixed substring. For that you can use \Q and
\E inside the regex, or the index function:
http://perldoc.perl.org/functions/index.html
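
To illustrate, here is a small sketch of both ways to look for a literal
substring. The function names are invented for this example:

```perl
use strict;
use warnings;

# Regex variant: m{...} avoids having to escape "/", and \Q...\E quotes
# every regex metacharacter in $needle, so it is matched literally.
sub contains_with_regex
{
    my ($line, $needle) = @_;

    return ($line =~ m{\Q$needle\E}) ? 1 : 0;
}

# index variant: index returns the position of $needle inside $line,
# or -1 if it is not found. No regex is involved.
sub contains_with_index
{
    my ($line, $needle) = @_;

    return (index($line, $needle) >= 0) ? 1 : 0;
}
```

With the needle '<td align="left">NetBIOS name : </td>' both functions return 1
for the matching line of your file.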

>                 print "Name:\t";
>                 print $',"\n";

This won't work because $' does not contain the next line - it only contains
the rest of the current line after the match. If you want to print the next
line, you can either read it from the file handle (called $html_fh here,
assuming you have switched to a lexical file handle):

        my $next_line = <$html_fh>;
        chomp($next_line);
        print $next_line, "\n";

Or alternatively set a global-to-the-loop boolean flag that will be checked and
reset.
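
A minimal sketch of the flag approach follows. It works on a list of lines
that were already read, and the function name lines_after is only an
illustration:

```perl
use strict;
use warnings;

# Illustrative helper: returns every line that directly follows a line
# containing $needle.
sub lines_after
{
    my ($needle, @lines) = @_;

    my @found;

    # The flag lives outside the loop body so that it survives from one
    # iteration to the next.
    my $is_value_line = 0;

    foreach my $line (@lines)
    {
        if ($is_value_line)
        {
            push @found, $line;
            # Reset the flag so only one line is collected per match.
            $is_value_line = 0;
            next;
        }

        if (index($line, $needle) >= 0)
        {
            $is_value_line = 1;
        }
    }

    return @found;
}
```

The returned line still contains the surrounding <td> markup, which is one more
reason to prefer a real HTML parser.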

Regards,

        Shlomi Fish

-- 
-----------------------------------------------------------------
Shlomi Fish       http://www.shlomifish.org/
"Star Trek: We, the Living Dead" - http://shlom.in/st-wtld

Chuck Norris is his own boss. If you hire him, he’ll tell your boss what to
do.

Please reply to list if it's a mailing list post - http://shlom.in/reply .
