John W. Krahn wrote:
Richard Lee wrote:
I took your advice and started reading 'Mastering Regular Expressions' by
Jeffrey E. F. Friedl.
Can you explain the code below further?
It is on page 205:
push(@fields, $+) while $text =~ m{
"([^\"\\]*(?:\\.[^\"\\]*)*)",? # standard quoted string (with possible comma)
| ([^,]+),? # or up to next comma (with possible comma)
| ,
}gx;
I don't fully understand how the first line matches a
standard quoted string.
I understand the " at the beginning and end and the optional ,? at the
very end.
Now, why [^\"\\]* ? It should read as anything except \ and ", so why
the \\ ?
And then it is followed by pretty much the same thing (in a
non-capturing group, ?:) preceded by \\. ?
How is that supposed to match a quoted string such as "hi, how are you",
push(@fields, $+) while $text =~ m{
" # match a " character
( # start capturing to $1
[^\"\\]* # match any character not " or \ zero or more times
(?: # start grouping
\\. # match \ followed by any character
[^\"\\]* # match any character not " or \ zero or more times
)* # end grouping, match group zero or more times
) # end capturing to $1
" # match a " character
,? # match a , zero or one times
| # OR
( # start capturing to $2
[^,]+ # match any character not , one or more times
) # end capturing to $2
,? # match a , zero or one times
| # OR
, # match a , character
}gx;
So with a string like '"hi, how are you"' '"' will match the '"' at
the beginning, '[^\"\\]*' will match 'hi, how are you' and '"' will
match the '"' at the end.
With a string like '"\"\t\""' '"' will match the '"' at the beginning,
'[^\"\\]*' will match between the '"' and the '\' (zero times),
'(?:\\.[^\"\\]*)*' will match '\"' and then '\t' and then '\"' and '"'
will match the '"' at the end.
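For anyone who wants to try the two cases above, here is a minimal sketch that wraps the pattern in a small subroutine. The name split_fields and the sample string are made up for illustration; they are not from the book or from this thread. It also shows why \ is excluded from the character class: since [^\"\\]* can never consume a backslash, every backslash has to be matched by \\. which swallows the character after it, so an escaped \" cannot be mistaken for the closing quote.

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption for this sketch): split one line of
# comma-separated text into fields with the pattern discussed above.
sub split_fields {
    my ($text) = @_;
    my @fields;
    push(@fields, $+) while $text =~ m{
        "([^\"\\]*(?:\\.[^\"\\]*)*)",?   # standard quoted string (with possible comma)
      | ([^,]+),?                        # or up to next comma (with possible comma)
      | ,
    }gx;
    return @fields;
}

# Single quotes keep the backslashes literal, so the third field
# really contains \" sequences.
my @fields = split_fields('"hi, how are you",plain,"say \"hi\""');
print "[$_]\n" for @fields;
# prints:
# [hi, how are you]
# [plain]
# [say \"hi\"]
```

Note that the captured text still contains the backslashes; the pattern only finds the field boundaries and does not unescape anything.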
Also, how does this work?
defined($1) ? $1 : $3;
Does this ternary read: if $1 is true, then test whether $1 is
defined? And if $1 is not true, then $3 is defined? (Or something else?)
That expression is in void context so I assume it is used as an rvalue
expression and not an lvalue expression? Actually it has to be an
rvalue expression because $1 and $3 are read only variables.
The variable $1 is tested for the value 'undef'. If it contains
anything other than 'undef' then $1 is returned else $3 is returned.
This does not determine whether $1 is true or false.
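The difference between "defined" and "true" can be sketched like this. The subroutine first_defined is a made-up name for illustration, and its arguments stand in for $1 and $3:

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption for this sketch): same shape as
# defined($1) ? $1 : $3, with ordinary variables in place of $1 and $3.
sub first_defined {
    my ($first, $fallback) = @_;
    return defined($first) ? $first : $fallback;
}

print first_defined('0',   'fallback'), "\n";   # prints 0 (defined, although false)
print first_defined('',    'fallback'), "\n";   # prints an empty line (defined, although false)
print first_defined(undef, 'fallback'), "\n";   # prints fallback

# A truth test behaves differently: '0' is false, so the fallback wins.
my $by_truth = '0' || 'fallback';
print "$by_truth\n";                            # prints fallback
```

So the defined() test keeps values such as '0' and the empty string, which a plain truth test would throw away.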
John
Thanks, John, for the detailed explanation!
I just don't understand why \ needs to be watched out for within " "
(I also didn't know that \ has to be escaped within [ ]).
But I have a better understanding now!