> Using the excellent example in an earlier post from david:
> RE: Removing HTML Tags
>
> I came up with this slightly modified version based on that
> post and some CPAN documentation, and it works.
> It just brought up a few more questions.
> Basically I'm just trying to grab the body contents without
> comments or script stuff.
>
> So far this module is really cool and handy!!
>
> #!/usr/bin/perl
>
> use HTML::Parser;
>
> my $text = <<HTML;
>
> <html><head>
> <title> HI Title </title>
> heaD STUFF
> </head>
> <body bodytag=attributes>
> hI HERE'S CONTENT i WANT
> <!-- i WANT TO STRIP COMMENTS OUT -->
> <SCRIPT>
>
> i DON'T WANT THIS SCRIPT EITHER
>
> </SCRIPT>
>
> </BODY>
> </HTMl>
>
> HTML
>
> my $html = HTML::Parser->new(
> api_version => 3,
> text_h => [sub{ print shift;}, 'dtext'],
> start_h => [sub{ print shift;}, 'text'],
> end_h => [sub{ print shift;}, 'text']);
OK, I see why it's printing: each handler is told to, right here!
Instead of print shift; I now do $temp .= shift; in each handler, and $temp holds that data.
One down, two to go!
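
A minimal sketch of that change, assuming the same parser setup as in the quoted script. The shared $collect callback is a name introduced here for illustration; $temp is the variable mentioned above. The blank-line cleanup at the end is one possible way to tidy the result, not the only one.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::Parser;

my $text = <<'HTML';
<html><head>
<title> HI Title </title>
heaD STUFF
</head>
<body bodytag=attributes>
hI HERE'S CONTENT i WANT
<!-- i WANT TO STRIP COMMENTS OUT -->
<SCRIPT>
i DON'T WANT THIS SCRIPT EITHER
</SCRIPT>
</BODY>
</HTMl>
HTML

my $temp = '';                          # collects what the handlers used to print
my $collect = sub { $temp .= shift };   # hypothetical helper shared by all three handlers

my $html = HTML::Parser->new(
    api_version => 3,
    text_h      => [ $collect, 'dtext' ],
    start_h     => [ $collect, 'text' ],
    end_h       => [ $collect, 'text' ],
);

$html->ignore_elements(qw(head script));   # drop these elements and their contents
$html->ignore_tags(qw(html body));         # drop only the tags, keep what is inside

$html->parse($text);
$html->eof;

$temp =~ s/^\s*\n//mg;   # remove empty and whitespace-only lines
print $temp;
```

Nothing is printed until the final print, so $temp can be edited further before output.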
>
> #Q) Before I kill the head section or body tags below how do
> I grab these parts of it?
> # 1 - my $title = ???? IE the text between title tags
> # 2 - get body tag attributes my $body_attributes = ????
> IE in this example it'd be 'bodytag=attributes'
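
For questions 1 and 2, one possible approach is a separate scanning pass that runs before anything is ignored, since handlers never fire for elements dropped by ignore_elements. The sketch below is an assumption-laden illustration, not the method from the earlier post: the names $scan, $in_title, %body_attr and $body_attributes are made up here, and it relies on the 'tagname' and 'attr' argspecs of the HTML::Parser version 3 API.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::Parser;

my $text = <<'HTML';
<html><head>
<title> HI Title </title>
</head>
<body bodytag=attributes>
hI HERE'S CONTENT i WANT
</BODY>
</HTMl>
HTML

my $title    = '';
my $in_title = 0;     # true while the parser is between <title> and </title>
my %body_attr;        # attributes of the <body> tag, keyed by lowercased name

my $scan = HTML::Parser->new(
    api_version => 3,
    start_h => [ sub {
        my ($tag, $attr) = @_;
        $in_title  = 1      if $tag eq 'title';
        %body_attr = %$attr if $tag eq 'body';
    }, 'tagname, attr' ],
    text_h => [ sub {
        my ($chunk) = @_;
        $title .= $chunk if $in_title;   # text may arrive in several chunks
    }, 'dtext' ],
    end_h => [ sub {
        my ($tag) = @_;
        $in_title = 0 if $tag eq 'title';
    }, 'tagname' ],
);

$scan->parse($text);
$scan->eof;

$title =~ s/^\s+|\s+$//g;   # trim surrounding whitespace

# Rebuild a name=value string from the parsed attributes
my $body_attributes = join ' ',
    map { "$_=$body_attr{$_}" } sort keys %body_attr;

print "title: $title\n";
print "body attributes: $body_attributes\n";
```

Keeping this scan in its own parser object means the ignore_elements and ignore_tags calls on $html below can stay exactly as they are.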
>
> $html->ignore_elements(qw(head script));
> $html->ignore_tags(qw(html body));
>
> $html->parse($text);
> $html->eof;
>
> ####
>
> It automatically prints the modified version of $text without
> any print statement.
> Q) Why is that?
> Q) How can I save the new version of $text to a new variable
> instead of automatically printing it to the screen?
> ( so I can remove empty lines and have my way with it )
> Q) I wanted any comments removed too but I didn't do anything
> special to it and they are gone anyway, are comments removed
> automatically then?
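
On the comment question: with api_version => 3 the parser only calls handlers that have been registered, and the script registers none for comments, so they never reach the output. They are not stripped by a special rule; they are simply never handled. A small sketch of the opposite case, assuming you wanted to capture comments instead (the names $out, @comments and $p are made up for this example):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::Parser;

my $out = '';     # text outside of comments
my @comments;     # raw comment markup, one entry per comment

my $p = HTML::Parser->new(
    api_version => 3,
    text_h      => [ sub { $out .= shift },           'dtext' ],
    comment_h   => [ sub { push @comments, shift },   'text'  ],
);

$p->parse("before <!-- a note --> after\n");
$p->eof;

print "text: $out";
print "comment: $_\n" for @comments;
```

Leaving out the comment_h line gives the behaviour seen in the original script: the comment is parsed but silently dropped.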
>
> OUTPUT ::
> (dmuey@q42(~):21)$ ./html.pl
>
>
>
> hI HERE'S CONTENT i WANT
>
>
>
>
>
>
> (dmuey@q42(~):22)$
>
>
> Thanks
>
> Dan
>
> --
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>
>