On 9/28/23 04:22, Pádraig Brady wrote:
> -n, --numeric-sort  compare according to string numerical value.
>                     leading blanks, negative sign, decimal point,
>                     and thousands separators are supported.
Although a valiant effort, this is likely to cause other trouble, as it
uses multiple terms (blanks, decimal point, thousands separator) without
explanation, and it omits the role of the locale. I suggest instead that
we simply say "see the manual", and tighten up the manual to explain
these and, while we're at it, other things (e.g., -0 vs 0).
I gave that a shot by installing the attached.
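
For readers who want to see the rule spelled out, here is a rough Python model of the -n key as the revised manual text describes it for the C locale. It is only a sketch, not the code in src/sort.c: the names numeric_key and _NUMBER are made up for illustration, thousands separators are left out because the C locale has none, and Python's stable sort corresponds to adding -s (no last-resort byte comparison).

```python
import re
from decimal import Decimal

# Hypothetical model of the C-locale rule: blanks are space and tab,
# there is no thousands separator, and '.' is the decimal point.
_NUMBER = re.compile(r"[ \t]*(-?)([0-9]*)(?:\.([0-9]*))?")

def numeric_key(line: str) -> Decimal:
    """Return the leading number of LINE; an empty number counts as 0."""
    sign, whole, frac = _NUMBER.match(line).groups()
    # Building the Decimal from a string is exact, so comparing keys
    # involves no rounding, whatever the number of digits.
    return Decimal(sign + (whole or "0") + "." + (frac or "0"))

lines = ["10", "-0", "0", "007", "+5", "1e3", " -2.5", ""]
print(sorted(lines, key=numeric_key))
```

In this model "-0", "0", "+5" and the empty line all compare equal to zero, "1e3" compares as 1, and "007" as 7, which is why strings with a leading "+" or an exponent need -g instead.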
PS to Jorge: Changing behavior as you suggested would likely cause
trouble, as many programs depend on the current behavior, which is
standardized by POSIX here:
https://pubs.opengroup.org/onlinepubs/9699919799/utilities/sort.html#tag_20_119_04
From a2434d3e58e8ead6c4c92fd989da32fe648e1545 Mon Sep 17 00:00:00 2001
From: Paul Eggert <egg...@cs.ucla.edu>
Date: Thu, 28 Sep 2023 18:02:25 -0700
Subject: [PATCH] sort: improve --help
Problem reported by Jorge Stolfi (bug#66253).
* src/sort.c (usage): Suggest looking at the manual for -n details.
---
doc/coreutils.texi | 12 ++++++++----
src/sort.c | 3 ++-
2 files changed, 10 insertions(+), 5 deletions(-)
diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index ee3b1ce11..be4b610be 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -4678,18 +4678,22 @@ can change this.
@opindex --numeric-sort
@opindex --sort
@cindex numeric sort
+@vindex LC_CTYPE
@vindex LC_NUMERIC
Sort numerically. The number begins each line and consists
of optional blanks, an optional @samp{-} sign, and zero or more
digits possibly separated by thousands separators, optionally followed
by a decimal-point character and zero or more digits. An empty
-number is treated as @samp{0}. The @env{LC_NUMERIC}
-locale specifies the decimal-point character and thousands separator.
-By default a blank is a space or a tab, but the @env{LC_CTYPE} locale
-can change this.
+number is treated as @samp{0}. Signs on zeros and leading zeros do
+not affect ordering.
Comparison is exact; there is no rounding error.
+The @env{LC_CTYPE} locale specifies which characters are blanks and
+the @env{LC_NUMERIC} locale specifies the thousands separator and
+decimal-point character. In the C locale, spaces and tabs are blanks,
+there is no thousands separator, and @samp{.} is the decimal point.
+
Neither a leading @samp{+} nor exponential notation is recognized.
To compare such strings numerically, use the
@option{--general-numeric-sort} (@option{-g}) option.
diff --git a/src/sort.c b/src/sort.c
index abee57d7a..5c86b8332 100644
--- a/src/sort.c
+++ b/src/sort.c
@@ -444,7 +444,8 @@ Ordering options:\n\
-h, --human-numeric-sort compare human readable numbers (e.g., 2K 1G)\n\
"), stdout);
fputs (_("\
- -n, --numeric-sort compare according to string numerical value\n\
+ -n, --numeric-sort compare according to string numerical value;\n\
+ see manual for which strings are supported\n\
-R, --random-sort shuffle, but group identical keys. See shuf(1)\n\
--random-source=FILE get random bytes from FILE\n\
-r, --reverse reverse the result of comparisons\n\
--
2.41.0