Re: Rakudo and non-ASCII character classes in rules

2010-05-25 Thread Patrick R. Michaud
On Tue, May 25, 2010 at 10:56:06AM +0200, Moritz Lenz wrote:
> 10:53 <@moritz_> rakudo: say 'c' ~~ /<[\x03c0]>/
> 10:53 <+p6eval> rakudo 10a321: OUTPUT«c␤»
> 10:55 <@moritz_> rakudo: say '0' ~~ /<[\x03c0]>/
> 10:55 <+p6eval> rakudo 10a321: OUTPUT«0␤»
Actually, I don't think Rakudo understands \x n

Re: Rakudo and non-ASCII character classes in rules

2010-05-25 Thread Mark J. Reed
Note that per spec, \x w/o brackets should eat as many characters as look like hex digits. If that means the code point is outside the range of valid Unicode, then it should throw an error.
On Tuesday, May 25, 2010, Moritz Lenz wrote:
>
>
> On 24.05.2010 08:40, Aaron Sherman wrote:
> > I came u
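
The greedy rule described in this message can be sketched in a few lines. This is only an illustration in Python, not Rakudo's implementation: the function name `parse_hex_escape`, its return shape, and the use of `ValueError` for the "throw an error" case are all assumptions made for the example.

```python
import string

# Highest valid Unicode code point.
MAX_CODE_POINT = 0x10FFFF
HEX_DIGITS = set(string.hexdigits)


def parse_hex_escape(text, start=0):
    r"""Parse an unbracketed \x escape that begins at text[start].

    Hypothetical helper: consumes every following character that looks
    like a hex digit, then returns (character, index after the escape).
    """
    if text[start:start + 2] != "\\x":
        raise ValueError("expected a \\x escape")

    digits_start = start + 2
    end = digits_start
    # Greedy scan: keep going while the next character is a hex digit.
    while end < len(text) and text[end] in HEX_DIGITS:
        end += 1

    if end == digits_start:
        raise ValueError("\\x must be followed by at least one hex digit")

    code_point = int(text[digits_start:end], 16)
    # Assumption: an out-of-range value is reported as ValueError here.
    if code_point > MAX_CODE_POINT:
        raise ValueError("code point U+%X is outside Unicode" % code_point)

    return chr(code_point), end
```

Under this rule `\x03c0` yields π, while `\x03c0abc` is read as the single, out-of-range value 0x3C0ABC and rejected, because the scan does not stop after four digits.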

Re: Rakudo and non-ASCII character classes in rules

2010-05-25 Thread Moritz Lenz
On 24.05.2010 08:40, Aaron Sherman wrote:
I came up with these tests which I thought should work:
ok("π" ~~ /<[π]>/, "π as a character class");
ok("π" ~~ /<[\x03c0]>/, "π as a character class (hex)");
ok("π" ~~ /<[\x0391 .. \x03c9]>/, "π in a character class range");
ok("π" ~~ /\w/, "π as a w

Re: Rakudo and non-ASCII character classes in rules

2010-05-25 Thread Aaron Sherman
On Mon, May 24, 2010 at 2:40 AM, Aaron Sherman wrote:
> Which I tried to translate as:
>
>        token ucschar {
>            <+[\xA0 .. \xD7FF] + [\xF900 .. \xFDCF] + [\xFDF0 .. \xFFEF] +
>            [\x1 .. \x1FFFD] + [\x2 .. \x2FFFD] +
>            [\x3 .. \x3FFFD] + [\x4 .. \

Rakudo and non-ASCII character classes in rules

2010-05-25 Thread Aaron Sherman
I came up with these tests which I thought should work:
ok("π" ~~ /<[π]>/, "π as a character class");
ok("π" ~~ /<[\x03c0]>/, "π as a character class (hex)");
ok("π" ~~ /<[\x0391 .. \x03c9]>/, "π in a character class range");
ok("π" ~~ /\w/, "π as a word character");
Of those, only the first one a