RE: FW: HTML Strip and tag?

2004-05-06 Thread Boris Shor
Wiggins, Thanks for writing back. > -Original Message- > From: Wiggins d Anconia [mailto:[EMAIL PROTECTED] > Sent: Thursday, May 06, 2004 2:48 PM > To: [EMAIL PROTECTED]; [EMAIL PROTECTED] > Subject: Re: FW: HTML Strip and tag? > > > > Hello, > > > > I am using the HTML::Strip module

FW: HTML Strip and tag?

2004-05-06 Thread Boris Shor
Hello, I am using the HTML::Strip module to strip the HTML tags off of source files, which I need to process. But it seems that anything after a tag is ignored. For example, in the file http://www.legis.state.ia.us/GA/76GA/Session.2/SJournal/Day/0228.html the vast majority of the text is igno

RE: Regular expression question: non-greedy matches

2004-04-21 Thread Boris Shor
Joseph, Thanks for writing and the advice. Here's another crack at the question. > -Original Message- > From: R. Joseph Newton [mailto:[EMAIL PROTECTED] > Sent: Monday, April 05, 2004 5:39 PM > To: [EMAIL PROTECTED] > Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]; 'Stuart > V. Jordan' > Subje

FW: Regular expression question: non-greedy matches

2004-04-05 Thread Boris Shor
Thanks for writing. I get no warnings when I use (ActiveState Perl on Windows): use Strict; use Warnings; $test = "Yea 123xrandomYea 456xdumdumNay 789xpop"; while ($test =~ /Yea (.*?)x.*?(Nay (.*?)x)?/g) { print "$1\n"; print "$2\n"; } What I am looking for are pairs: $1

RE: Regular expression question: non-greedy matches

2004-04-05 Thread Boris Shor
this, I get no matches on the 'nays' or $2. -Original Message- From: Randy W. Sims [mailto:[EMAIL PROTECTED] Sent: Sunday, April 04, 2004 9:30 PM To: [EMAIL PROTECTED] Cc: [EMAIL PROTECTED] Subject: Re: Regular expression question: non-greedy matches Boris Shor wrote: > H

Regular expression question: non-greedy matches

2004-04-04 Thread Boris Shor
Hello, Perl beginner here. I am having difficulty with a regular expression that uses non-greedy matches. Here is a sample code snippet: $test = "Yea 123xrandomYea 456xdumdumNay 789xpop"; while ($test =~ /Yea (.*?)x.*?(?:Nay (.*?)x)?/g) { print "$1\n"; print "$2\n"; } The
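
A possible reading of why `$2` stays empty in the snippet above: the lazy `.*?` is allowed to match nothing, and the optional `(?:Nay (.*?)x)?` group is allowed to match nothing as well, so the engine succeeds as soon as it has consumed `Yea 123x` and never looks further for a `Nay`. The sketch below sidesteps this by matching each `Yea` or `Nay` token on its own. The `@votes` array and the `label=value` output format are assumptions made for illustration and do not come from the original post.

```perl
use strict;
use warnings;

my $test = "Yea 123xrandomYea 456xdumdumNay 789xpop";

# Collect each "Yea"/"Nay" label with the text up to the next "x".
# Matching one token per iteration avoids the optional group that
# the lazy quantifier lets the engine skip.
my @votes;
while ($test =~ /(Yea|Nay) (.*?)x/g) {
    push @votes, "$1=$2";
}

print join(",", @votes), "\n";   # prints Yea=123,Yea=456,Nay=789
```

If the goal is to pair each `Yea` with a following `Nay`, the collected tokens can be grouped afterwards in ordinary Perl code rather than inside a single pattern.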

Regular expression with lookbehinds question

2004-02-26 Thread Boris Shor
Hello everyone, I'm trying to implement the following regular expression with a lookbehind: $e1 = ','; $aye =~ s/(?

TokeParser and get_trimmed_text question

2004-01-29 Thread Boris Shor
Hello, New Perl programmer here. I am using HTML::TokeParser to parse HTML files. It is really very useful. In particular, I use the get_trimmed_text() function quite a bit to extract tag-free text from HTML files. I usually use the function in this fashion: $x = $p -> get_trimmed_text('/strong'

Glob and space in directory name

2003-11-26 Thread Boris Shor
Why does the following work (eg, give me an array filled with matching file names): @filelist = glob("w:/stleg/Colorado/House_98/*.htm"); And when I rename the directory to "House 98" (space instead of underscore), the following does not: @filelist = glob("w:/stleg/Colorado/House 98/*.htm"); Th
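
For context on the behaviour described above: Perl's built-in `glob` splits its argument on whitespace and treats each piece as a separate pattern, so a path containing `House 98` is read as two patterns. Two common workarounds are to wrap the pattern in an inner pair of double quotes, or to call `bsd_glob` from the core `File::Glob` module, which does not split on whitespace. The sketch below is a minimal illustration under assumptions: it builds a temporary directory as a stand-in for the poster's `w:/stleg/Colorado` tree, and the file names `a.htm` and `b.htm` are hypothetical.

```perl
use strict;
use warnings;
use File::Glob qw(bsd_glob);
use File::Temp qw(tempdir);

# Hypothetical stand-in for the "House 98" directory from the post.
my $root = tempdir(CLEANUP => 1);
my $dir  = "$root/House 98";
mkdir $dir or die "mkdir failed: $!";
for my $name (qw(a.htm b.htm)) {
    open my $fh, '>', "$dir/$name" or die "open failed: $!";
    close $fh;
}

# Built-in glob splits on the space, so the .htm files are not found.
my @split = glob("$dir/*.htm");

# Workaround 1: inner double quotes keep the pattern in one piece.
my @quoted = glob(qq{"$dir/*.htm"});

# Workaround 2: bsd_glob takes exactly one pattern and never splits it.
my @bsd = bsd_glob("$dir/*.htm");

printf "quoted: %d files, bsd_glob: %d files\n", scalar @quoted, scalar @bsd;
```

Renaming the directory to avoid the space, as in the `House_98` variant, also works, but the quoting or `bsd_glob` route leaves the directory layout untouched.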

TokeParser help

2003-11-19 Thread Boris Shor
Hello, I am a Perl newcomer, and I'm trying to use the TokeParser module to extract text from an HTML file. Here's the Perl code: use HTML::TokeParser; my $p = HTML::TokeParser->new("test.htm"); while ($p -> get_tag('b')) { print $p -> get_text(),"\n"; } This works only on bold tags