Pádraig Brady <[EMAIL PROTECTED]> writes:

> uniq can be efficient and assume LANG=C always as
> it need only care if adjacent items match or not.

I'm afraid it's not that simple.  In some locales it's possible that
two strings A and B can compare equal even though their bytes differ.
The C notation for this is (strcoll (A,B) == 0 && strcmp (A,B) != 0).
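
To make that condition concrete, here is a minimal sketch of the two tests involved. The helper names collates_equal_but_differs and lines_match are made up for illustration and are not taken from the coreutils source; the sketch also assumes the caller has already selected a locale with setlocale.

```c
#include <assert.h>
#include <locale.h>
#include <string.h>

/* Hypothetical helper (not from coreutils): nonzero when A and B
   collate as equal under the current LC_COLLATE setting even though
   their bytes differ.  */
int
collates_equal_but_differs (char const *a, char const *b)
{
  assert (a && b);
  return strcoll (a, b) == 0 && strcmp (a, b) != 0;
}

/* Hypothetical helper: the adjacent-line test that a locale-aware
   uniq would need, as opposed to a plain strcmp.  */
int
lines_match (char const *a, char const *b)
{
  assert (a && b);
  return strcoll (a, b) == 0;
}
```

In the C locale strcoll orders strings the same way strcmp does, so the first helper never returns nonzero there. Whether it ever returns nonzero in another locale depends on that locale's collation rules, which is the reason a byte comparison alone is not enough.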

This point was addressed in IEEE Std 1003.1-2001/Cor 1-2002, item
XCU/TC1/D6/40, and it's why the current POSIX spec says that the
behavior of uniq depends on LC_COLLATE.


_______________________________________________
Bug-coreutils mailing list
Bug-coreutils@gnu.org
http://lists.gnu.org/mailman/listinfo/bug-coreutils
