On Friday 07 November 2008 14:56:34 Patrick R. Michaud (via RT) wrote:

> There's a bug somewhere in the escape opcode
> (r32442, no libicu present).  Here's the test case:
>
>   $ cat y.pir
>   .sub main
>       $S0 = unicode:"x/\u0445\u0440\u0435\u043d\u044c_09-10.txt"
>       say $S0
>       $S1 = escape $S0
>       say $S1
>   .end
>
>   $ ./parrot y.pir
>   x/хрень_09-10.txt
>   x/\u0445\u0440\u0435\u043d\u044c9-10.txt
>
> We start by constructing a unicode string (originally from RT #58820)
> and displaying it, then we escape the string and display that.
> The escaped version should be the same as what appears in the
> quotes in the unicode:"..." literal, but as you can see above
> the "_0" characters present in the original string are lost in
> the escaped version.  A hex dump shows that they are being turned
> into NUL bytes somehow:
>
>   $ ./parrot y.pir | xxd
>   0000000: 782f d185 d180 d0b5 d0bd d18c 5f30 392d  x/.........._09-
>   0000010: 3130 2e74 7874 0a78 2f5c 7530 3434 355c  10.txt.x/\u0445\
>   0000020: 7530 3434 305c 7530 3433 355c 7530 3433  u0440\u0435\u043
>   0000030: 645c 7530 3434 6300 0039 2d31 302e 7478  d\u044c..9-10.tx
>   0000040: 740a                                     t.
>   $
>
> This bug appears to be very sensitive to the contents of this
> particular string -- adding, removing, or otherwise changing the
> string contents causes the bug to disappear.

Fixed in r32444, with your code turned into a test.  Thanks!  (This also
fixes RT #58820.)

-- c
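
For anyone reading this ticket later, here is a rough model of the round trip the report expects. It is a sketch in Python, not Parrot's implementation: the helper name escape_non_ascii is made up for illustration, and it only covers the part of the behaviour visible in the report (ASCII passes through, other characters become lowercase \uXXXX). The real opcode also handles things like quotes and control characters, which this ignores.

```python
def escape_non_ascii(text: str) -> str:
    """Hypothetical stand-in for the behaviour the report expects from
    the escape opcode. This is NOT Parrot's code, only a simplified model."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:
            # ASCII is copied unchanged (simplification: no handling of
            # backslashes, quotes, or control characters).
            out.append(ch)
        elif cp <= 0xFFFF:
            # Characters in the Basic Multilingual Plane become \uXXXX.
            out.append(f"\\u{cp:04x}")
        else:
            # Assumption for this sketch only: wider code points use \UXXXXXXXX.
            out.append(f"\\U{cp:08x}")
    return "".join(out)


# The string from the test case in the report.
original = "x/\u0445\u0440\u0435\u043d\u044c_09-10.txt"
escaped = escape_non_ascii(original)

print(original)
print(escaped)
```

The check that matters is that the "_0" following the last escaped character survives and that no NUL bytes appear in the result, which is what the hex dump in the report showed going wrong.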