On 04/05/2013 14:26, Florian Huber wrote:

I'm parsing a logfile and don't quite understand the behaviour of m//.

From a previous regex match I have already captured $+{'GFP'}:

use strict;
use warnings;

(...)

# I simply have my whole logfile in $text - I know there are better solutions.
$text =~ m/ (?<GFP>FILTERS .*? WRT)/x;
print $+{'GFP'}, "\n";

prints this:
FILTERS GFP,GFP,100% ACTSHUT 17 AVG 4,1.000000 WRT

Now I want to go on parsing $+{'GFP'}. To be more precise, I want to
capture "AVG 4":

If I do:

$+{'GFP'} =~ m/(?<AVG>AVG\s\d)/;
print "\$+{'GFP'} is $+{'GFP'}.\n";
print "\$+{'AVG'} is $+{'AVG'}.\n";

I get a warning that I'm using an uninitialised value and:

$+{'GFP'} is .
$+{'AVG'} is AVG 4.

So first question: Apparently some return value of $+{'GFP'} =~
m/PATTERN/; messes with this hash value. From what I can remember,
the return value indicates whether the match was successful or
not. Then why don't I get some value like 0 or 1 in $+{'GFP'} but just
an uninitialised value?

If the match is not successful, $+{'GFP'} will stay untouched:

$+{'GFP'} =~ m/(?<AVG>AVG\s\d nothere)/;
print "\$+{'GFP'} is $+{'GFP'}.\n";
print "\$+{'AVG'} is $+{'AVG'}.\n";

will print:
$+{'GFP'} is FILTERS GFP,GFP,100% ACTSHUT 17 AVG 4,1.000000 WRT.
$+{'AVG'} is .

Not surprisingly, $+{'AVG'} is uninitialised here.

It didn't quite make sense to me but I figured that the problem might be
that m// in list context returns a list of the capture variables created
in the match. So I tried:

$+{'GFP'} =~ scalar m/(?<AVG>AVG\s\d)/;
print "\$+{'GFP'} is $+{'GFP'}.\n";
print "\$+{'AVG'} is $+{'AVG'}.\n";

prints this:
$+{'GFP'} is FILTERS GFP,GFP,100% ACTSHUT 17 AVG 4,1.000000 WRT.
$+{'AVG'} is .

So now the match suddenly fails?!?

Hello Florian

First, a couple of points:

- Don't use named captures for simple regexes like this. They make the
code harder to understand, and are really only useful in complex
patterns with multiple captures.

- The built-in variables that relate to regular expressions are modified
by every successful pattern match. It is safer to save values that you
may want to use later in a separate variable. In particular, your regex
m/(?<AVG>AVG\s\d)/ matches and, because it has no capture named `GFP`,
the corresponding element of %+ is now undef. However, $+{AVG} is now
set, as you did have a capture with that name.
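
As a minimal sketch of that point, the following uses the sample line from
the output quoted above. The names $line and $gfp are only illustrative. It
shows %+ being replaced by the second successful match, and how copying the
capture into a lexical first keeps the value available:

```perl
use strict;
use warnings;

# Sample data copied from the output shown earlier in this thread.
my $line = 'FILTERS GFP,GFP,100% ACTSHUT 17 AVG 4,1.000000 WRT';

$line =~ m/(?<GFP>FILTERS.*?WRT)/ or die "no FILTERS block found";

# Copy the capture before running another match.
my $gfp = $+{GFP};

$gfp =~ m/(?<AVG>AVG\s\d)/ or die "no AVG field found";

# %+ now describes only the most recent successful match.
print "GFP still in %+? ", ( defined $+{GFP} ? "yes" : "no" ), "\n";
print "saved copy: $gfp\n";
print "AVG: $+{AVG}\n";
```

The saved copy in $gfp is unaffected by the second match, whereas
$+{GFP} is no longer defined once that match succeeds.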

The pattern match

   $+{'GFP'} =~ m/(?<AVG>AVG\s\d)/;

is in void context (i.e. the result is being discarded). That is mostly
equivalent to scalar context as far as operator behaviour is concerned.
And you have made things worse by writing

    $+{'GFP'} =~ scalar m/(?<AVG>AVG\s\d)/;

which is equivalent to

    $+{'GFP'} =~ ($_ =~ /(?<AVG>AVG\s\d)/);

so it applies the pattern to the $_ variable, then uses the result of that
match as another regex and applies that to $+{GFP}.
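
To illustrate the difference, here is a small sketch with made-up values
($gfp holds the sample line from above, and 'AVG 9' is an arbitrary string
placed in $_). It shows that a match with nothing bound to it tests $_, and
what a properly bound match returns in list and in scalar context:

```perl
use strict;
use warnings;

my $gfp = 'FILTERS GFP,GFP,100% ACTSHUT 17 AVG 4,1.000000 WRT';

# A match with no =~ in front of it is applied to $_, not to $gfp.
$_ = 'AVG 9';
my $matched_topic = scalar m/(AVG\s\d)/;    # tests $_, sets $1 to "AVG 9"
print "matched against \$_: $1\n" if $matched_topic;

# Bind the match to the intended string with =~ instead.
# In list context a match returns its captures.
my ($avg) = $gfp =~ m/(AVG\s\d)/;
print "matched against \$gfp: $avg\n";

# In scalar context it returns true or false.
my $ok = $gfp =~ m/(AVG\s\d)/;
print "scalar result is ", ( $ok ? "true" : "false" ), "\n";
```

So there is no need for scalar here: the context comes from how the
result of the whole `$string =~ m/.../` expression is used.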

It would help to be able to see the format of your input data. If you
know that reading the entire file in is a bad idea then you shouldn't be
doing it.

This short piece of code does what you need, but I am sure there is a
better way.

    $text =~ m/(FILTERS.*?WRT)/;
    my $gfp = $1;

    $gfp =~ m/(AVG\s+\d)/;
    my $avg = $1;
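
One possible refinement, offered only as a sketch: check that each match
succeeded before using its capture, since $1 keeps its old value when a
match fails. The helper name extract_avg and the surrounding header and
trailer lines in $text are invented for the example; only the FILTERS line
comes from your output.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the "AVG n" field of the first
# FILTERS ... WRT block in $text, or undef if either match fails.
sub extract_avg {
    my ($text) = @_;

    # A list-context match returns its captures, or an empty list on failure.
    my ($gfp) = $text =~ m/(FILTERS.*?WRT)/
        or return;
    my ($avg) = $gfp =~ m/(AVG\s+\d)/
        or return;

    return $avg;
}

# Stand-in for the real logfile contents.
my $text = "some header\n"
         . "FILTERS GFP,GFP,100% ACTSHUT 17 AVG 4,1.000000 WRT\n"
         . "some trailer\n";

my $avg = extract_avg($text);
print defined $avg ? "Found: $avg\n" : "No AVG field found\n";
```

Capturing with `my ($var) = $string =~ m/(...)/` also avoids relying on
$1 at all, so a later match cannot overwrite the value.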

HTH,

Rob
