Mike:

On Thu, Jun 25, 2015 at 5:42 AM, Mike Martin <m...@redtux.org.uk> wrote:
> Hi
> I am currently getting issues with regexes that use the qr operator.
>
> The results with qr are different than without. This is a small sample
> program to illustrate it
>
> use strict;
>
> my $str='Database Administrator';
> my $pattern=
> '(?=^(?:(?!(?:datab|network|system)).)*$).*(?:Adm(?:in(?:i?strat(?:ors?|ion|ive)?)?)?|Clerk\b|Clerical|office)(?:.*?ass.*?|supp|temp|staff|officer?|).*?'
> ;
>
> my $pattern2=qr/$pattern/;
> $str=~/($pattern)/i;
> print 'no qr',"\t", $1,"\n";
> $str=~/($pattern2)/i;
> print 'with qr',"\t", $1,"\n";
>
> print 'Pattern without qr',"\t",$pattern,"\n";
> print 'Pattern without qr',"\t",$pattern2,"\n";
>
- - - >8 - - - *snip* - - - 8< - - -

> The only difference I can see is the addition of a non-capturing group
> around the expression (?^:)
>
> Anyone any idea what is happening here

If you check the documentation on (?adlupimsx-imsx), you will see that
it allows activating and deactivating modifiers from within the regex.
For example, (?i) is equivalent to /i for the remainder of the
pattern. A caret after the ? has an implied meaning:

perldoc perlre (Strawberry Perl 5.20.2) said:
> Starting in Perl 5.14, a "^" (caret or circumflex accent)
> immediately after the "?" is a shorthand equivalent to "d-imsx".
> Flags (except "d") may follow the caret to override it. But a minus
> sign is not legal with it.

Directly beneath this is the explanation for (?:pattern),
(?adluimsx-imsx:pattern), and (?^aluimsx:pattern). These are for
"clustering": they apply the specified modifiers to the enclosed
pattern only. You can look at `perldoc perlre' for the specifics, but
the '-' means to deactivate the modifiers that follow it, and the i
means case-insensitive matching as usual. So (?^:...) is shorthand for
(?d-imsx:...), which explicitly turns /i off inside the group.

I believe this is the problem. Your qr// above doesn't carry the /i
modifier, so the compiled pattern is case-sensitive no matter what the
surrounding m// says. In your sample that changes the result: as far
as I can tell, 'datab' in the negative lookahead only matches
'Database' when /i is in effect, so the two versions disagree. You
should put the /i on the qr//, since that is what you meant, and that
is where you meant it. If you do include the /i modifier on the qr//
then the resulting pattern changes thusly:

    use strict;
    use warnings;

    my $pattern = 'foo';

    my $regex_with = qr/$pattern/i;
    my $regex_without = qr/$pattern/;

    print "with: ", $regex_with, "\n";
    print "without: ", $regex_without, "\n";

    __END__

Output:

    with: (?^i:foo)
    without: (?^:foo)

With the modifier on the qr// the i is added back to the pattern.
Without it, the i is left off. The modifiers on the m// won't have any
effect on the pattern enclosed in the qr// object.

An alternative solution is to include the modifier directly in the
regex by prefixing it with (?i). That is useful, for example, if the
patterns are dynamic (user-supplied).

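To see the three options side by side, here is a minimal sketch. The
sample pattern and the variable names are mine, chosen only for
illustration; the /i is kept on the m// in the first two cases to
mirror your program. The stringified forms in the comments assume
Perl 5.14 or later.

```perl
use strict;
use warnings;

my $str     = 'Database Administrator';
my $pattern = 'administrator';    # lowercase on purpose, so case matters

my $qr_plain  = qr/$pattern/;        # no /i: stringifies as (?^:administrator)
my $qr_i      = qr/$pattern/i;       # /i on qr//: stringifies as (?^i:administrator)
my $qr_inline = qr/(?i)$pattern/;    # (?i) written inside the pattern itself

# Each entry is [ label, outcome ]. The ternary forces boolean context.
my @results = (
    [ 'plain string, m//i',    $str =~ /($pattern)/i  ? 'match' : 'no match' ],
    [ 'qr// without /i, m//i', $str =~ /($qr_plain)/i ? 'match' : 'no match' ],
    [ 'qr//i',                 $str =~ /($qr_i)/      ? 'match' : 'no match' ],
    [ 'qr/(?i).../',           $str =~ /($qr_inline)/ ? 'match' : 'no match' ],
);

printf "%-24s %s\n", @$_ for @results;
```

Only the second case should fail to match: the /i on the m// does not
reach inside the (?^:...) group that the bare qr// produced.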
One can conclude that it works this way so that what you specified in
the qr// is preserved regardless of what m// says (if your qr// is
case-sensitive then so is the pattern embedded into m//, regardless of
what applies to m// as a whole).

Regards,


-- 
Brandon McCaig <bamcc...@gmail.com> <bamcc...@castopulence.org>
Castopulence Software <https://www.castopulence.org/>
Blog <http://www.bambams.ca/>
perl -E '$_=q{V zrna gur orfg jvgu jung V fnl. }.
q{Vg qbrfa'\''g nyjnlf fbhaq gung jnl.};
tr/A-Ma-mN-Zn-z/N-Zn-zA-Ma-m/;say'