On Fri, 09 Jan 2015 18:24:46 -0500
random...@fastmail.us wrote:

> Even if octal values could be more than three digits, I have no idea
> what you think 50102 is. Its decimal value is 20546. Its hex value is
> 0x5042. I have no idea what it has to do with character U+00F6 whose
> UTF-8 representation is 0xC3 0xB6..... I just realized what you're
> doing, 0xC3B6 has the _decimal_ value 50102, I have no idea why you
> would think _that_ is a representation people would want to use. If
> you're so pro-unicode, make it accept \u00F6 - that's a valid extension.
> But reusing the syntax POSIX uses for three-digit octal literals, for
> arbitrarily long decimal literals that aren't even unicode code points,
> makes no sense at all. In what universe is that intuitive?
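
For anyone following along, the numbers in the quoted paragraph can be checked with a few lines of Python. This is only an illustrative sketch: it is not part of tr, and the variable names are made up here.

```python
# Sanity check of the numbers discussed above; illustrative only.
ch = "\u00f6"                          # U+00F6, 'ö'
utf8 = ch.encode("utf-8")              # the two UTF-8 bytes 0xC3 0xB6
as_int = int.from_bytes(utf8, "big")   # those bytes read as one big-endian integer
octal_reading = int("50102", 8)        # "50102" parsed as an octal literal

print(utf8.hex())          # c3b6
print(as_int)              # 50102
print(octal_reading)       # 20546
print(hex(octal_reading))  # 0x5042
```

So 50102 is the decimal reading of the UTF-8 byte sequence, while the same digits read as octal give 20546 (0x5042), which is unrelated to 'ö'.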

C3B6 is the UTF-8 encoding of 'ö', and it makes sense to allow specifying
it as \50102 (in the pure UTF-8 sense of course, nothing to do with
collating).

> Collating elements = POSIX forbids them = You don't want them anyway.
> Multibyte characters = POSIX allows/requires them = You like them too.
> What is the problem?
> I don't know what you want to do that you think POSIX doesn't allow.

Well, I probably misunderstood the matter. Sometimes this stuff goes
over my head. ;)
At the end of the day, you want software to work as expected:

GNU tr:
$ echo ελληνική | tr [α-ω] [Α-Ω]
®®®®®®®®®

our tr:
$ echo ελληνικη | ./tr [α-ω] [Α-Ω]
ΕΛΛΗΝΙΚΗ

Cheers

FRIGN

-- 
FRIGN <d...@frign.de>