On 19.02.25 18:14, Bernhard Voelker wrote:

On 2/18/25 7:45 PM, Rupert Gallagher via GNU coreutils Bug Reports wrote:

By comparison, human (-h) and numeric (-n) sort cause data loss:

Not really.  That's the difference between
a)
  "I have a list containing numbers; I merely care about numbers and want to get a unique, sorted list of them."
  ('sort -h -u')

and
b)
  "I have a list containing numbers; I want to have it sorted by numbers, and then throw away duplicates."
  ('sort -h | uniq')

The point is: in case a), the numerical value of each non-number entry is zero, so all such entries compare as equal and only one of them survives -u.


I have no issue with the way 'sort -u' currently works, but the man page does not make it clear that 'sort -h -u' and 'sort -h | uniq' behave differently.
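
To make the difference concrete, here is a minimal sketch, assuming GNU sort with -h support. The sample lines and the helper name sample_sizes are made up for illustration, and LC_ALL=C is set only so that the ordering of the non-numeric lines is predictable.

```shell
# Hypothetical sample: two non-numeric lines plus a duplicated number.
sample_sizes() { printf 'apple\nbanana\n2K\n2K\n1M\n'; }

# Case a): -u judges equality with the -h comparison alone, so the two
# non-numeric lines (both valued as zero) count as duplicates of each other.
with_u=$(sample_sizes | LC_ALL=C sort -h -u)

# Case b): uniq compares whole lines, so only the repeated 2K is dropped.
with_uniq=$(sample_sizes | LC_ALL=C sort -h | uniq)

printf 'sort -h -u:\n%s\n\n' "$with_u"
printf 'sort -h | uniq:\n%s\n' "$with_uniq"
```

With this input, 'sort -h | uniq' should print four lines (apple, banana, 2K, 1M), while 'sort -h -u' should print only three, because apple and banana are equal as far as -h is concerned.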

Specifically, the explanation for -u

-u, --unique
             with -c, check for strict ordering; without -c, output only the first of an equal run

does not explain what 'equal' or 'run' means. Maybe add something like "where equality is assessed based only on the keys and rules used to sort the output".
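
The same thing happens with explicit keys, which is why I would mention keys in the wording. A small sketch, again assuming GNU sort; the input and the helper name sample_keyed are invented for this example:

```shell
# Hypothetical input: two lines share the numeric key 1 but differ afterwards.
sample_keyed() { printf '1 foo\n1 bar\n2 baz\n'; }

# -u looks only at the key given by -k1,1n, so '1 foo' and '1 bar'
# form one "equal run" and only one of them is output.
keyed=$(sample_keyed | LC_ALL=C sort -k1,1n -u)

# Without -u, sort falls back to comparing whole lines for ties,
# and uniq then finds no identical adjacent lines to drop.
whole=$(sample_keyed | LC_ALL=C sort -k1,1n | uniq)

printf 'sort -k1,1n -u:\n%s\n\n' "$keyed"
printf 'sort -k1,1n | uniq:\n%s\n' "$whole"
```

Here the -u variant should yield two lines and the uniq variant three, even though no two input lines are identical.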


Rainer



