On Mon, Aug 04, 2003 at 17:18:58 +0300, Ruslan Ermilov wrote:
> : The characters or collating elements in the
> : range shall be placed in the array in ascending
> : collation sequence. If the second endpoint
> : precedes the starting endpoint in the collation
> : sequence, it is unspecified whether the range

Did you read the first part, about the collation sequence? We just
implement that, i.e. the collation sequence is used for all locales,
including non-POSIX locales, where this is allowed because the behaviour
is unspecified.

> : of collating elements is empty, or this construct
> : is treated as invalid. In locales other than
>                           ^^^^^^^^^^^^^^^^^^^^^
> : the POSIX locale, this construct has unspecified
>   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> : behavior.
>   ^^^^^^^^
>
> This is identical to a similar issue with awk(1), and the latest
> snapshot of the One True AWK reverts to NOT using strcoll(3) to
> handle character ranges in RE, because different locales and even
> the same locales on different operating systems (FreeBSD, Linux,
> and Solaris were compared) have different ideas about the collating
> order. On Linux, the German locale's collating sequence will be
> ``A a ... B b'', while on FreeBSD, it's ``A B ... a b''.

This is a bug in AWK, since strcoll() is required in regexps, but we are
not discussing AWK here.

Even if the behaviour is unspecified, that means:

1) We can't use c-c in non-POSIX locales!
2) All occurrences of c-c must either be replaced or be used in the
   C locale only!

In other words, you win nothing by insisting on the historical behaviour,
because its usage is ILLEGAL in any case (i.e. outside of LANG=C).

> So I'd rather prefer if we revert to the old behavior in tr(1).

No way. The ranges should be similar to what we have for regexps.
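
To make the difference concrete, here is a minimal sketch of the two ways to decide whether a character falls inside a range c-c. It is not the actual tr(1) or regex code; both helper names are made up for illustration, and it only handles single-byte characters.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: range membership by raw byte value,
 * i.e. the historical behaviour. Locale is ignored. */
bool in_range_bytes(unsigned char lo, unsigned char hi, unsigned char c)
{
    return lo <= c && c <= hi;
}

/* Hypothetical helper: range membership by the collation order of the
 * current LC_COLLATE locale, compared with strcoll(3) on
 * one-character strings. */
bool in_range_coll(char lo, char hi, char c)
{
    char l[2] = { lo, '\0' };
    char h[2] = { hi, '\0' };
    char s[2] = { c, '\0' };

    return strcoll(l, s) <= 0 && strcoll(s, h) <= 0;
}
```

In the C locale the two helpers agree, so 'a' is outside A-Z either way. After setlocale(LC_COLLATE, "") in a locale that collates as ``A a ... B b'', in_range_coll('A', 'Z', 'a') would be expected to return true while in_range_bytes() still returns false, which is exactly why c-c is only portable under LANG=C.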