On 04/05/2013 14:26, Florian Huber wrote:
I'm parsing a logfile and don't quite understand the behaviour of m//. From a previous regex match I have already captured $+{'GFP'}:

    use strict;
    use warnings;
    (...)
    $text =~ m/ (?<GFP>FILTERS .*? WRT)/x; # I simply have my whole logfile in $text - I know there are better solutions.
    print $+{'GFP'}, "\n";

prints this:

    FILTERS GFP,GFP,100% ACTSHUT 17 AVG 4,1.000000 WRT

Now I want to go on parsing $+{'GFP'}. To be more precise, I want to capture "AVG 4". If I do:

    $+{'GFP'} =~ m/(?<AVG>AVG\s\d)/;
    print "\$+{'GFP'} is $+{'GFP'}.\n";
    print "\$+{'AVG'} is $+{'AVG'}.\n";

I get a warning that I'm using an uninitialised value and:

    $+{'GFP'} is .
    $+{'AVG'} is AVG 4.

So first question: Apparently some return value of

    $+{'GFP'} =~ m/PATTERN/;

messes with this hash value. So from what I can remember, the return value will indicate if the substitution was successful or not. Then why don't I get some value like 0 or 1 in $+{'GFP'} but just an uninitialised value?

If the match is not successful, $+{'GFP'} will stay untouched:

    $+{'GFP'} =~ m/(?<AVG>AVG\s\d nothere)/;
    print "\$+{'GFP'} is $+{'GFP'}.\n";
    print "\$+{'AVG'} is $+{'AVG'}.\n";

will print:

    $+{'GFP'} is FILTERS GFP,GFP,100% ACTSHUT 17 AVG 4,1.000000 WRT.
    $+{'AVG'} is .

Not surprisingly, $+{'AVG'} is uninitialised here.

It didn't quite make sense to me but I figured that the problem might be that m// in list context returns a list of the capture variables created in the match. So I tried:

    $+{'GFP'} =~ scalar m/(?<AVG>AVG\s\d)/;
    print "\$+{'GFP'} is $+{'GFP'}.\n";
    print "\$+{'AVG'} is $+{'AVG'}.\n";

prints this:

    $+{'GFP'} is FILTERS GFP,GFP,100% ACTSHUT 17 AVG 4,1.000000 WRT.
    $+{'AVG'} is .

So now the match suddenly fails?!?
Hello Florian

First a couple of points

- Don't use named captures for simple regexes like this. They make the code harder to understand, and are really only useful when using complex patterns with multiple captures

- The built-in variables that relate to regular expressions are modified by every successful pattern match. It is safer to save values that you may want to use later in a separate variable.

In particular, your regex

    m/(?<AVG>AVG\s\d)/

matches and, because there is no capture named `GFP`, the corresponding element of %+ now reads as undef. However $+{AVG} is now set, as you did have a capture with that name.

The pattern match

    $+{'GFP'} =~ m/(?<AVG>AVG\s\d)/;

is in void context (i.e. the result is being discarded). That is mostly equivalent to scalar context as far as operator behaviour is concerned. Note that =~ only binds the match to its target; the return value of the match is never stored in $+{'GFP'}, which is why you don't see a 0 or a 1 there.

And you have made things worse by writing

    $+{'GFP'} =~ scalar m/(?<AVG>AVG\s\d)/;

which is equivalent to

    $+{'GFP'} =~ ($_ =~ /(?<AVG>AVG\s\d)/);

so it applies the pattern to the $_ variable, then uses the result of that match as another regex and applies that to $+{GFP}.

It would help to be able to see the format of your input data. If you know that reading the entire file in is a bad idea then you shouldn't be doing it.

This short piece of code does what you need, but I am sure there is a better way.

    $text =~ m/(FILTERS.*?WRT)/;
    my $gfp = $1;

    $gfp =~ m/(AVG\s+\d)/;
    my $avg = $1;

HTH,

Rob
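
For reference, the advice above (copy each capture into its own variable before running another match) might look something like the sketch below. The sample record is copied from the original post; the 'xx'/'yy' padding around it, the variable names $had_gfp and $has_gfp, and the use of \d+ instead of \d are assumptions made purely for illustration, since the real log format was not shown.

```perl
use strict;
use warnings;

# Sample record taken from the original post. The leading and trailing
# padding is made up, since the real log format was not shown.
my $text = 'xx FILTERS GFP,GFP,100% ACTSHUT 17 AVG 4,1.000000 WRT yy';

my ( $gfp, $avg );

# Copy the capture into a lexical immediately, and only on success.
if ( $text =~ m/(FILTERS.*?WRT)/ ) {
    $gfp = $1;
}

# Match against the saved copy, so the first result cannot be clobbered.
# \d+ (an assumption) allows multi-digit values; the reply used \d.
if ( defined $gfp && $gfp =~ m/(AVG\s+\d+)/ ) {
    $avg = $1;
}

print "gfp: $gfp\n" if defined $gfp;
print "avg: $avg\n" if defined $avg;

# Illustration of the original surprise: a second successful match
# replaces the contents of the named-capture hash.
$text =~ m/(?<GFP>FILTERS.*?WRT)/;
my $had_gfp = defined $+{GFP} ? 1 : 0;    # 1: GFP was captured
'AVG 4' =~ m/(?<AVG>AVG\s\d)/;
my $has_gfp = defined $+{GFP} ? 1 : 0;    # 0: the new pattern has no GFP group
print "GFP visible before: $had_gfp, after: $has_gfp\n";
```

Testing each match before reading $1 avoids silently reusing a capture left over from an earlier successful match when the current one fails.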