Wiggins,

Thanks for writing back.

> -----Original Message-----
> From: Wiggins d Anconia [mailto:[EMAIL PROTECTED] 
> Sent: Thursday, May 06, 2004 2:48 PM
> To: [EMAIL PROTECTED]; [EMAIL PROTECTED]
> Subject: Re: FW: HTML Strip and <PRE> tag?
> 
> 
> > Hello,
> > 
> > I am using the HTML::Strip module to strip the HTML tags off of source
> > files, which I need to process. But it seems that anything after a <PRE>
> > tag is ignored.
> > 
> > For example, in the file
> > 
> > http://www.legis.state.ia.us/GA/76GA/Session.2/SJournal/Day/0228.html
> > 
> > the vast majority of the text is ignored when I use the following 
> > code:
> > 
> >     open (INPUT, $file) or die $!;
> >     $raw = <INPUT>;
> 
> Have you fudged the record separator so that the file is read 
> in slurp mode?  The above will only read a single line of the 
> file otherwise. Are you doing,
> 
> use strict;
> use warnings;
> 
> At the top of your script?


Yes, I use both of those pragmas. I haven't changed the record separator, but
I do know I get much more than the first line of the file I mentioned. In
fact, I get all of the lines before the <PRE> tag, and all of the lines after
the </PRE> tag (which is 95% of the file). Plus, this code has worked fine
for the many files I've processed before, so I don't think the record
separator is at fault. Very curious.
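
To rule the record separator in or out, here is a minimal sketch of the check
I could run. It only uses core Perl. The helper names slurp_file and
first_record are made up for illustration and are not part of my script or of
HTML::Strip. The first helper reads the whole file by localizing $/ to undef,
and the second reads a single record the way `$raw = <INPUT>;` does with the
default separator.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original script): read an entire file
# into one string by locally undefining the input record separator.
sub slurp_file {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Cannot open $path: $!";
    local $/;                 # undef: <$fh> now returns the whole file
    my $content = <$fh>;
    close($fh);
    return defined $content ? $content : '';
}

# Hypothetical helper: read only the first record, which is what a scalar
# <FILEHANDLE> read does when $/ still has its default value of "\n".
sub first_record {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Cannot open $path: $!";
    my $record = <$fh>;
    close($fh);
    return defined $record ? $record : '';
}
```

If length(first_record($file)) comes out equal to length(slurp_file($file)),
then the whole file really is arriving in one read and the separator is not
the problem. If it is shorter, the single read is stopping at the first
newline.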


> 
> >     my $p2 = HTML::Strip->new();
> >     $clean = $p2 -> parse ($raw);
> > 
> > How do I save the text within the <PRE> tags?
> > 
> 
> Let's start there....
> 
> http://danconia.org

Thanks!


--
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
<http://learn.perl.org/> <http://learn.perl.org/first-response>

