Wiggins,

Thanks for writing back.

> -----Original Message-----
> From: Wiggins d Anconia [mailto:[EMAIL PROTECTED]
> Sent: Thursday, May 06, 2004 2:48 PM
> To: [EMAIL PROTECTED]; [EMAIL PROTECTED]
> Subject: Re: FW: HTML Strip and <PRE> tag?
>
> > Hello,
> >
> > I am using the HTML::Strip module to strip the HTML tags off of source
> > files, which I need to process. But it seems that anything after a <PRE> tag
> > is ignored.
> >
> > For example, in the file
> >
> > http://www.legis.state.ia.us/GA/76GA/Session.2/SJournal/Day/0228.html
> >
> > the vast majority of the text is ignored when I use the following
> > code:
> >
> > open (INPUT, $file) or die $!;
> > $raw = <INPUT>;
>
> Have you fudged the record separator so that the file is read
> in slurp mode? The above will only read a single line of the
> file otherwise. Are you doing,
>
> use strict;
> use warnings;
>
> At the top of your script?

Yes, I use both of those. I haven't changed the record separator, but I do know I get much more than the first line of the file I mentioned. In fact, I get all of the lines before the <PRE> tag, and all of the lines after the </PRE> tag (which is 95% of the file). Plus, this code has worked fine for the many files I've processed before, so I don't think the record separator is at fault. Very curious.

> > my $p2 = HTML::Strip->new();
> > $clean = $p2 -> parse ($raw);
> >
> > How do I save the text within the <PRE> tags?
> >
>
> Let's start there....
> http://danconia.org

Thanks!
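
For reference, here is a minimal sketch of the record-separator point raised in the quoted reply. It uses only core Perl and an in-memory string in place of the real file. The sample text and the helper names `read_first_record` and `read_whole_file` are made up for illustration and are not from the original script. It only shows how `<INPUT>` in scalar context differs from slurp mode; it does not attempt to explain the <PRE> behaviour itself.

```perl
use strict;
use warnings;

# Hypothetical sample standing in for the HTML file; not the real page.
my $html = "<html>\n<body>\n<pre>\nline inside pre\n</pre>\n</body>\n</html>\n";

# Read one record with the default separator ("\n").
# In scalar context, <$in> returns only the first line.
sub read_first_record {
    my ($text) = @_;
    open(my $in, '<', \$text) or die "Cannot open in-memory file: $!";
    my $record = <$in>;
    close($in);
    return $record;
}

# Read with the record separator undefined (slurp mode).
# The same scalar-context read now returns the whole file.
sub read_whole_file {
    my ($text) = @_;
    open(my $in, '<', \$text) or die "Cannot open in-memory file: $!";
    local $/;    # undef until the sub returns, then restored
    my $content = <$in>;
    close($in);
    return $content;
}

my $first = read_first_record($html);
my $all   = read_whole_file($html);

printf "default separator: %d characters\n", length $first;
printf "slurp mode:        %d characters\n", length $all;
```

Comparing the two lengths for the actual input file would show whether `$raw` really holds the whole document before it is passed to `parse`.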