On 2/18/25 7:45 PM, Rupert Gallagher via GNU coreutils Bug Reports wrote:
> By comparison, human (-h) and numeric (-n) sort cause data loss:

Not really. That's the difference between

  a) "I have a list containing numbers; I merely care about numbers
     and want to get a unique, sorted list of them."
     ('sort -h -u')

and

  b) "I have a list containing numbers; I want to have it sorted by
     numbers, and then throw away duplicates."
     ('sort -h | uniq')

The point is: in case a), the numerical value of each non-number entry is zero, so all such entries compare equal to each other and to a literal 0, and -u keeps only one of them.

Consider the following:

  $ printf "%s\n" 0 1 X-1 Ab2 3 ma | LC_ALL=C sort -nu
  0
  1
  3

Here, the entries 0, "X-1", "Ab2" and "ma" all have the numerical value 0. That's why the first zero is output.

Now let's remove the literal/numerical 0 from the input:

  $ printf "%s\n" 1 X-1 Ab2 3 ma | LC_ALL=C sort -nu
  X-1
  1
  3

Now, the first entry which numerically represents 0 is "X-1".

Now let's even put the 0 back into the input, but at the end:

  $ printf "%s\n" 1 X-1 Ab2 3 ma 0 | LC_ALL=C sort -nu
  X-1
  1
  3

Still, sort(1) outputs the first entry which has a numerical value of zero: "X-1".

Have a nice day,
Berny
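
For comparison, the two cases a) and b) can be run side by side on the same input as the first example. This is only a minimal sketch, assuming GNU sort in the C locale; the helper function name `input` is made up for illustration and is not part of coreutils.

```shell
# Hypothetical helper: emits the same list as the first example above.
input() { printf '%s\n' 0 1 X-1 Ab2 3 ma; }

# a) Unique by numeric key: every non-number counts as 0, so only one
#    line per numeric value survives.
input | LC_ALL=C sort -nu

# b) Sort numerically first, then let uniq drop only adjacent lines
#    that are identical as text. Nothing is dropped for this input.
input | LC_ALL=C sort -n | uniq
```

With GNU sort, case b) should print all six entries (0, Ab2, X-1, ma, 1, 3, one per line): the four entries with numerical value 0 come first, ordered among themselves by sort's last-resort byte comparison, and uniq removes nothing because no two lines are textually identical.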