Re: Need help with a regex

Jeff 'japhy' Pinyan Fri, 23 Jan 2004 08:37:17 -0800

On Jan 23, [EMAIL PROTECTED] said:

>This newbie needs help with a regex.  Here's what the data from a text
>file looks like. There's no delimiter and the fields aren't evenly spaced
>apart.
>
>apples          San Antonio      Fruit
>oranges Sacramento             Fruit
>pineapples     Honolulu         Fruit
>lemons    Corona del Rey       Fruit
>
>Basically, I want to put the city names into an array.  The first field,
>the fruit name, is always one word with no spaces.
>
>Anyone know how to do that ?


Well, there are many ways.  You could split the string on whitespace,
remove the first and last elements, and join the others with spaces:

  for (@data) {
    my @fields = split;
    shift @fields;
    pop @fields;
    push @cities, "@fields";  # "@array" = join(" ", @array)
  }
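For anyone who wants to try the split approach end to end, here is a minimal sketch run against the sample lines from the question. The helper name cities_by_split and the guard that skips lines with fewer than three fields are my own assumptions, not part of the snippet above.

```perl
use strict;
use warnings;

# Sample lines copied from the original question.
my @data = (
    "apples          San Antonio      Fruit",
    "oranges Sacramento             Fruit",
    "pineapples     Honolulu         Fruit",
    "lemons    Corona del Rey       Fruit",
);

# cities_by_split is a hypothetical helper name used only for illustration.
sub cities_by_split {
    my @cities;
    for my $line (@_) {
        # split ' ' splits on runs of whitespace and ignores leading whitespace
        my @fields = split ' ', $line;
        next if @fields < 3;      # assumption: skip lines lacking fruit, city, category
        shift @fields;            # drop the fruit name
        pop @fields;              # drop the last field
        push @cities, "@fields";  # rejoin the city words with single spaces
    }
    return @cities;
}

print "$_\n" for cities_by_split(@data);
```

Note that rejoining with "@fields" collapses any run of whitespace inside a city name to a single space, which is usually what you want here.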

Or, you could use a regex that gets SPECIFICALLY what you want:

  for (@data) {
    push @cities, $1 if /^\S+\s+(\S+(?:\s+\S+)*)\s+\S+$/;
  }

That regex might need a bit of explanation:

  m{
    ^                 # the beginning of the string
    \S+               # the fruit name: one or more non-whitespace chars
    \s+               # one or more whitespace chars
    (                 # capture to $1:
      \S+               # first word of the city name
      (?: \s+ \S+ )*    # *ALL* remaining words
    )
    \s+               # one or more whitespace chars
    \S+               # the last field: one or more non-whitespace chars
    $                 # the end of the string
  }x;

Here is what this does on a string like "peach Georgia fruit": the first
\S+\s+ matches "peach ".  The capture group is greedy, so at first it grabs
"Georgia fruit".  However, the REST of the regex still has to match, and
there is nothing left for it, so the (?:\s+\S+)* backtracks: it gives up the
last chunk it matched, leaving only "Georgia" in $1.  Then the final \s+\S+
can match " fruit".
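To see the backtracking result for yourself, here is a small sketch that compiles the same pattern with qr//x and tries it on a few strings. The names $city_re and extract_city are hypothetical, chosen just for this example.

```perl
use strict;
use warnings;

# Same pattern as above, precompiled; /x lets us space it out.
my $city_re = qr{
    ^ \S+ \s+                 # first field and the gap after it
    ( \S+ (?: \s+ \S+ )* )    # city name: one word plus any further words
    \s+ \S+ $                 # gap and the last field
}x;

# extract_city is a hypothetical helper: returns the city, or undef on no match.
sub extract_city {
    my ($line) = @_;
    return $line =~ $city_re ? $1 : undef;
}

for my $line ("peach Georgia fruit",
              "lemons    Corona del Rey       Fruit",
              "peach fruit") {
    my $city = extract_city($line);
    print "'$line' => ", (defined $city ? "'$city'" : "(no match)"), "\n";
}
```

The first string gives "Georgia" and the second gives "Corona del Rey". The third has only two fields, so the regex cannot match at all and the line would simply be skipped by the push ... if form.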

-- 
Jeff "japhy" Pinyan      [EMAIL PROTECTED]      http://www.pobox.com/~japhy/
RPI Acacia brother #734   http://www.perlmonks.org/   http://www.cpan.org/
<stu> what does y/// stand for?  <tenderpuss> why, yansliterate of course.
[  I'm looking for programming work.  If you like my work, let me know.  ]

