> Hello,
>
> I am using the HTML::Strip module to strip the HTML tags off of source
> files, which I need to process. But it seems that anything after a <PRE> tag
> is ignored.
>
> For example, in the file
>
> http://www.legis.state.ia.us/GA/76GA/Session.2/SJournal/Day/0228.html
>
> the vast majority of the text is ignored when I use the following code:
>
> open (INPUT, $file) or die $!;
> $raw = <INPUT>;

Have you fudged the input record separator ($/) so that the file is read in
slurp mode? Otherwise the line above will only read a single line of the file.

Do you have

use strict;
use warnings;

at the top of your script?

> my $p2 = HTML::Strip->new();
> $clean = $p2 -> parse ($raw);
>
> How do I save the text within the <PRE> tags?
>

Let's start there....

http://danconia.org
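
For reference, here is a minimal sketch of the slurp-mode read asked about above. The helper name slurp_file is hypothetical, not something from the original script, and the sketch only shows how to get the whole file into one scalar. It does not by itself establish that the <PRE> problem comes from the read.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): read an entire file
# into a single scalar instead of just its first line.
sub slurp_file {
    my ($path) = @_;

    # Lexical filehandle and three-argument open.
    open(my $fh, '<', $path) or die "Cannot open $path: $!";

    # Undefine the input record separator for this block only, so that
    # <$fh> returns the whole file rather than one line.
    local $/;
    my $content = <$fh>;

    close($fh);
    return $content;
}
```

With this in place, something like my $raw = slurp_file($file); followed by the existing HTML::Strip->new() and parse($raw) calls would hand the parser the full document rather than only its first line.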