Thanks very much Tim.  I just did a quick test on my real file and it 
worked perfectly. 

I definitely still have a lot to learn about both Perl and regexes, so I 
really appreciate the explanation as well.  Though your script is very 
compact, I learned a lot from it, such as how you initialized the array. 
I have a couple of scripts where I get warnings about either improper or 
uninitialized arrays, or something to that effect.  I tried to fix those, 
but was unsuccessful.  Those scripts produce the output I want, but the 
warnings are bothersome.  I'll take another look at those scripts to see if 
initializing using "my @arrayname = ( );" will help. 
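
A minimal sketch of what may be going on, in case it helps when revisiting 
those scripts.  As far as I know, "my @arrayname;" and "my @arrayname = ( );" 
both produce an empty array, and the "Use of uninitialized value" warning 
usually comes from an undefined element or scalar rather than from the array 
declaration itself.  The variable names below (@plain, @explicit, @sparse, 
@caught) are made up for illustration and are not from the scripts mentioned 
above.

```perl
use strict;
use warnings;

my @plain;            # declared only
my @explicit = ();    # declared and explicitly emptied

# Collect any warnings raised inside the block instead of printing them.
my @caught;
{
    local $SIG{__WARN__} = sub { push @caught, $_[0] };

    my $count  = @plain + @explicit;   # both arrays are empty: no warning
    my @sparse = ('a', undef, 'c');
    my $text   = "value: $sparse[1]";  # undef element: triggers a warning
}

print "warnings caught: ", scalar(@caught), "\n";
print $caught[0] if @caught;
```

If only the undef element produces a warning here, the fix in the real 
scripts is probably a defined() check on the value being used rather than a 
change to the array declaration.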

Also, the "push" structure for adding elements to the array was very 
helpful.  I have a way to do it, and while my way works and is somewhat 
creative, it is actually embarrassingly bad and inefficient coding.  So, I 
learned from that too.

It's funny how all this stuff is in the Perl books that I've been reading, 
but when I need to solve a problem, the exact right way to do it doesn't 
come to me.  I can spend hours trying to do some pretty simple stuff.  I 
can usually come up with a solution, but I know that it's usually neither 
efficient nor really close to the right way to do it.  But the good news 
is, if I think about where my Perl skills are today compared to a month 
ago, I'm making progress!

Anyway, sorry for being so long-winded.  The bottom line is that I 
really appreciate your help. 
 



"Tim Johnson" <[EMAIL PROTECTED]> 
01/23/2004 01:32 AM

To
"Tim Johnson" <[EMAIL PROTECTED]>, <[EMAIL PROTECTED]>, 
<[EMAIL PROTECTED]>
cc

Subject
RE: Need help with a regex






Ooh.  That's embarrassing.  I didn't pay close enough attention to the OP. 
Some of the inside matches contain spaces.  My regex should have been:
 
/^\S+\s+(.+)\s+/
 
which would match:

*                the beginning of the line (^)
*                followed by one or more non-whitespace characters (\S+)
*                followed by one or more whitespace characters (\s+)
*                followed by one or more of any characters including whitespace (.+)
*                followed by one or more whitespace characters (\s+)

Because Perl's quantifiers are greedy (they match as many characters as 
possible), the .+ will match everything between the two outside spaces.
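
One caveat worth checking before relying on this: a line read with 
<INFILE> and not chomped still ends in a newline, and the trailing \s+ can 
match that newline, in which case the greedy (.+) captures through "Fruit" 
as well.  A minimal sketch of one way around that, assuming the last field 
is always a single word (an assumption drawn from the sample data, not 
something the original post states): make the capture non-greedy and anchor 
the final field to the end of the line.  The sample lines are held in an 
array here instead of being read from myfile.txt.

```perl
use strict;
use warnings;

# Sample lines copied from the original post, newline included,
# as they would arrive from <INFILE> without chomp.
my @lines = (
    "apples          San Antonio      Fruit\n",
    "oranges Sacramento             Fruit\n",
    "pineapples     Honolulu         Fruit\n",
    "lemons    Corona del Rey       Fruit\n",
);

my @cities;
for my $line (@lines) {
    # (.+?) is non-greedy; \s+\S+\s*$ pins the final one-word field
    # (an assumption about the data) to the end of the line.
    if ($line =~ /^\S+\s+(.+?)\s+\S+\s*$/) {
        push @cities, $1;
    }
}

print "[$_]\n" for @cities;
```

With the four sample lines this prints [San Antonio], [Sacramento], 
[Honolulu] and [Corona del Rey], one per line.  If the last field can 
contain spaces too, this pattern would not be enough.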

                 -----Original Message----- 
                 From: Tim Johnson 
                 Sent: Thu 1/22/2004 9:31 PM 
                 To: [EMAIL PROTECTED]; [EMAIL PROTECTED] 
                 Cc: 
                 Subject: RE: Need help with a regex
 
 

                 Try this on for size:
 
                 #####################
                 use strict;
                 use warnings;
                 my @cities = ();
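                 # three-argument open with a lexical filehandle; $! gives the reason on failure
                 open(my $in, '<', "myfile.txt")
                     or die "Couldn't open myfile.txt for reading: $!\n";
                 while(<$in>){
                      # push only when the match succeeds, so a stale $1 is never reused
                      push @cities,$1 if /^\S+\s+(\S+)/;
                 }
                 close($in);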
                 #do something to @cities
 
                 #####################
 
                 which basically means to match:
 
                 *       the start of the line (^)
                 *       followed by one or more non-whitespace characters (\S+)
                 *       followed by one or more whitespace characters (\s+)
                 *       followed by one or more non-whitespace characters (\S+)
 
                 The parentheses around the last non-whitespace match assign it to $1.
 
                 Note:  Check out "perldoc perlre" for the man pages.  It 
might be worth looking over real quick before you dig into the book.
 
                 Or, for the quick and easy way without a regex, how about:
 
                 #############################
 
                 use strict;
                 use warnings;
                 my @cities;
                 open(my $in, '<', "myfile.txt")
                     or die "Could not open myfile.txt for reading: $!\n";
                 while(<$in>){
                    # split on runs of whitespace and keep the second field
                    push @cities,(split /\s+/,$_)[1];
                 }
                 close($in);
 
                 #############################
 
                 which does a split on the line, takes the second element 
                 of the resulting list, and pushes it onto @cities.
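
Note that taking element [1] returns only the first word of a multi-word 
city ("San" for "San Antonio").  A minimal sketch of a split-based variant 
that keeps the whole city, assuming the first and last fields are each a 
single word (the last-field part is an assumption based on the sample 
data): split on whitespace, drop the first and last fields, and rejoin 
the rest.  The sample lines are held in an array here instead of being 
read from myfile.txt.

```perl
use strict;
use warnings;

# Sample lines copied from the original post.
my @lines = (
    "apples          San Antonio      Fruit\n",
    "oranges Sacramento             Fruit\n",
    "pineapples     Honolulu         Fruit\n",
    "lemons    Corona del Rey       Fruit\n",
);

my @cities;
for my $line (@lines) {
    # split ' ' splits on runs of whitespace and ignores leading whitespace
    my @fields = split ' ', $line;
    next if @fields < 3;    # need at least fruit, city, and a last field

    # everything between the first and last field is the city name
    push @cities, join ' ', @fields[1 .. $#fields - 1];
}

print "[$_]\n" for @cities;
```

One side effect of this approach is that any run of spaces inside a city 
name is collapsed to a single space when the words are rejoined.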
 
                         -----Original Message-----
                         From: [EMAIL PROTECTED] 
[mailto:[EMAIL PROTECTED]
                         Sent: Thu 1/22/2004 9:01 PM
                         To: [EMAIL PROTECTED]
                         Cc:
                         Subject: Need help with a regex
 
 
 
                         This newbie needs help with a regex.  Here's what the data
                         from a text file looks like.  There's no delimiter and the
                         fields aren't evenly spaced apart.
 
                         apples          San Antonio      Fruit
                         oranges Sacramento             Fruit
                         pineapples     Honolulu         Fruit
                         lemons    Corona del Rey       Fruit
 
                         Basically, I want to put the city names into an array.
                         The first field, the fruit name, is always one word with
                         no spaces.
 
 
 
 

