Re: buildworld broken after installworld

Andrey Chernov Mon, 04 Aug 2003 07:42:10 -0700

On Mon, Aug 04, 2003 at 17:18:58 +0300, Ruslan Ermilov wrote:

> :         The characters or collating elements in the
> :         range shall be placed in the array in ascending
> :         collation sequence. If the second endpoint
> :         precedes the starting endpoint in the collation
> :         sequence, it is unspecified whether the range


Did you read the first part, about the collation sequence? We implement
exactly that, i.e. the collation sequence for all locales, including
non-POSIX locales, where this is allowed because the behaviour is unspecified.

> :         of collating elements is empty, or this construct
> :         is treated as invalid. In locales other than
>                                  ^^^^^^^^^^^^^^^^^^^^^
> :         the POSIX locale, this construct has unspecified
>           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> :         behavior.
>           ^^^^^^^^
> 
> This is identical to a similar issue with awk(1), and the latest
> snapshot of the One True AWK reverts to NOT using strcoll(3) to
> handle character ranges in RE, because different locales and even
> the same locales on different operating systems (FreeBSD, Linux,
> and Solaris were compared) have different ideas about the collating
> order.  On Linux, the German locale's collating sequence will be
> ``A a ... B b'', while on FreeBSD, it's ``A B ... a b''.

This is a bug in AWK, since strcoll() is required in regexps, but we are not
discussing AWK here. Even if it is unspecified behaviour, that means:

1) We can't use c-c in non-POSIX locales!
2) All occurrences of c-c must be either replaced or used in the C locale only!

In other words, you gain nothing by insisting on the historical behaviour,
because relying on it is ILLEGAL in any case (i.e. outside of LANG=C).

> So I'd rather prefer if we revert to the old behavior in tr(1).

No way. The ranges should be consistent with what we have for regexps.
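
To make concrete what a collation-based c-c range means, here is a minimal
sketch, assuming single-byte characters. in_range_coll() is a hypothetical
helper written for illustration only; it is not the actual tr(1) or regexp
code.

```c
#include <string.h>

/*
 * Hypothetical helper (not taken from tr(1)): returns 1 if c falls inside
 * the range lo-hi according to the current LC_COLLATE order, else 0.
 * Assumes single-byte characters; strcoll() compares strings, so each
 * character is wrapped in a one-character string first.
 */
int
in_range_coll(char lo, char c, char hi)
{
	char l[2] = { lo, '\0' };
	char m[2] = { c, '\0' };
	char h[2] = { hi, '\0' };

	/* c is in the range when lo <= c <= hi in collation order. */
	return (strcoll(l, m) <= 0 && strcoll(m, h) <= 0);
}
```

In the C locale strcoll() compares like strcmp(), so in_range_coll('a', 'B', 'z')
is 0. After setlocale(LC_COLLATE, "") the answer depends on the locale's
collating order, which is exactly why c-c is only portable under LANG=C.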
