Re: Regexp delimiters

2010-12-08 Thread Jonathan Pool
> c:\>perl -wE "say $^V,$^O;$_='123456789';s§3(456)7§$1§;say"
> v5.12.1MSWin32
> 1245689
My equivalent that works is:
perl -wE "use utf8;my \$_='123456789';s§3(456)7§§\$1§;say;"
1245689
If I stop treating this section-sign delimiter as a bracketing delimiter, it fails:
perl -wE "use utf8;m

Re: Regexp delimiters

2010-12-08 Thread Jonathan Pool
> Hm, what platform and perl version?
5.8.8 and 5.12.2 on RHEL, and 5.10.0 on OS X 10.6.
> c:\>perl -Mutf8 -wE
>"say $^V,$^O;$_='123456789';s§3(456)7§$1§;say"
> Malformed UTF-8 character (unexpected continuation byte 0xa7,
> with no preceding start byte) at -e line 1.
Not the same err

Re: Regexp delimiters

2010-12-08 Thread C.DeRykus
On Dec 7, 9:38 am, p...@utilika.org (Jonathan Pool) wrote:
> > Well, I have no idea why it does what it does, but I can tell you how to
> > make it work:
> > s¶3(456)7¶¶$1¶x;
> > s§3(456)7§§$1§x;
> Oops, sorry, yes there is:
c:\>perl -Mutf8 -wE "say $^V,$^O;$_='123456789';s§3(456)7§$1§;say"

Re: Regexp delimiters

2010-12-08 Thread C.DeRykus
On Dec 7, 9:38 am, p...@utilika.org (Jonathan Pool) wrote:
> > Well, I have no idea why it does what it does, but I can tell you how to
> > make it work:
> > s¶3(456)7¶¶$1¶x;
> > s§3(456)7§§$1§x;
Oops, yes there is:
c:\>perl -Mutf8 -wE "say $^V,$^O;$_='123456789'; s§3(456)7§$1§;say"
Malform

Re: Regexp delimiters

2010-12-08 Thread C.DeRykus
On Dec 7, 9:38 am, p...@utilika.org (Jonathan Pool) wrote:
> > Well, I have no idea why it does what it does, but I can tell you how to
> > make it work:
> > s¶3(456)7¶¶$1¶x;
> > s§3(456)7§§$1§x;
> Hm, what platform and perl version? No errors here:
c:\>perl -wE "say $^V,$^O;$_='1234567

Re: Regexp delimiters

2010-12-07 Thread Jonathan Pool
> Well, I have no idea why it does what it does, but I can tell you how to make
> it work:
> s¶3(456)7¶¶$1¶x;
> s§3(456)7§§$1§x;
Amazing. Thanks very much. This seems to contradict the documentation. The perlop man page clearly says that there are exactly 4 bracketing delimiters: "()", "[]", "{

Re: Regexp delimiters

2010-12-05 Thread Brian Fraser
That's probably because you are using what I sent, rather than what the OP did:
> C:\>perl -E "s§3(456)7§$1§;"
> Unrecognized character \x98 in column 16 at -e line 1.
>
> C:\>perl -Mutf8 -E "s§3(456)7§$1§;"
> Substitution replacement not terminated at -e line 1.
>
> C:\>perl -E "s§3(456)7§§$1§;

Re: Regexp delimiters

2010-12-05 Thread Shawn H Corey
On 10-12-05 07:38 PM, Brian Fraser wrote: You have to tell perl to use UTF-8. Add this line to the top of your script(s): use utf8; See `perldoc utf8` for more details. Hm, I don't mean to step on your toes or anything, but he is already using utf8. The problem is with some utf

Re: Regexp delimiters

2010-12-05 Thread Brian Fraser
> > You have to tell perl to use UTF-8. Add this line to the top of your
> script(s):
> use utf8;
>
> See `perldoc utf8` for more details.
Hm, I don't mean to step on your toes or anything, but he is already using utf8. The problem is with some utf8 characters being interpreted as a paired delimi

Re: Regexp delimiters

2010-12-05 Thread Shawn H Corey
On 10-12-05 05:58 PM, Brian Fraser wrote: Well, I have no idea why it does what it does, but I can tell you how to make it work: s¶3(456)7¶¶$1¶x; s§3(456)7§§$1§x; For whatever reason, Perl is treating those characters as an 'opening' delimiter[0], so that when you write s¶3(456)7¶$1¶;, you are te

Re: Regexp delimiters

2010-12-05 Thread Brian Fraser
Well, I have no idea why it does what it does, but I can tell you how to make it work:
s¶3(456)7¶¶$1¶x;
s§3(456)7§§$1§x;
For whatever reason, Perl is treating those characters as an 'opening' delimiter[0], so that when you write s¶3(456)7¶$1¶;, you are telling Perl that the regex part is delimited

Regexp delimiters

2010-12-05 Thread Jonathan Pool
The perlop document under "s/PATTERN/REPLACEMENT/msixpogce" says "Any non-whitespace delimiter may replace the slashes." I take this to mean that any non-whitespace character may be used instead of a slash. However, I am finding that some non-whitespace characters cause errors. For example, us