On Sat, Jan 02, 2016 at 09:10:27PM -0500, Thomas Dickey wrote:
> On Sat, Jan 02, 2016 at 06:46:52PM +0000, Sandro Tosi wrote:
> > Adding Thomas, the upstream author of diffstat
> >
> > Hey Thomas,
> > what's your take on this report?
> >
> > On Sun, Aug 2, 2015 at 6:39 PM, Josh Triplett <[email protected]> wrote:
> > > On Sat, Aug 01, 2015 at 06:33:39PM -0700, Josh Triplett wrote:
> > >> If the diff contains files with very large diffs, diffstat's automatic
> > >> scaling can cause files with small diffs to display zero '-' or '+'
> > >> characters.  This hides key information from the diffstat, namely
> > >> the direction of the diff.  diffstat should always display at least one
> > >> '-' for a file with lines removed, and at least one '+' for a file with
> > >> lines added, regardless of scaling.
>
> It would introduce more confusion than the existing situation, by
> adding noise to the (of course) rounded display.

I understand that the histogram necessarily quantizes to an integer
number of characters; however, rounding small numbers down to zero
characters hides useful information, namely whether the file had deleted
lines, added lines, or both.  By contrast, rounding any non-zero number
up to at least 1 character makes the histogram marginally
disproportional; however, in any circumstance where that might happen,
some other line must go all the way across the screen, so the
disproportionality seems extremely marginal.  It seems rather unlikely
to me that people would rely on the exactness of the rounding, rather
than just the rough proportions.

By contrast, "git diff --stat" in the same example I gave produces at
least one symbol:

 /dev/null => 2/big-file    | 100000 ++++++++++++++++++++++++++++++++++++++++++++
 1/only-in-1 => /dev/null   |      1 -
 /dev/null => 2/only-in-2   |      1 +
 3 files changed, 100001 insertions(+), 1 deletion(-)

That output seems more helpful to me: it still conveys a sense of
relative scale, while also noting removal and insertion.

For that matter, if I add a single line to 1/big-file, I get the
following output *from diffstat*:

$ diff -Naur 1 2 | diffstat
 big-file  |100001 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 only-in-1 |    1
 only-in-2 |    1
 3 files changed, 100001 insertions(+), 2 deletions(-)

Notice in this case the '-' at the end.  If diffstat strictly followed
proportionality and rounding in its histogram, that single deleted line
would round to zero characters, as it does for only-in-1.  So why does
diffstat round the single-line change up to a single character in this
case, but down to zero characters in the case of the other two files?

- Josh Triplett
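
For illustration only, here is a minimal sketch of the rounding rule
requested above.  It is not taken from diffstat or git: the function name
scale_bar, the keep_nonzero switch, the bar width of 44 characters, and
the use of truncating integer division are all assumptions made for the
example.

```python
def scale_bar(added, removed, largest, width=44, keep_nonzero=True):
    """Build a '+'/'-' histogram bar for one file.

    'largest' is the biggest per-file change count in the whole diff;
    it is the value that gets mapped to the full 'width'.
    All names and defaults here are hypothetical.
    """
    def scaled(count):
        if count == 0:
            return 0               # nothing changed: never draw a symbol
        if largest <= width:
            return count           # no scaling needed, one char per line
        chars = count * width // largest   # proportional, truncated
        if keep_nonzero:
            chars = max(1, chars)  # never hide a non-zero change
        return chars

    return "+" * scaled(added) + "-" * scaled(removed)


if __name__ == "__main__":
    # Same shape as the example in this report: one huge file, two tiny ones.
    for name, added, removed in [("big-file", 100000, 0),
                                 ("only-in-1", 0, 1),
                                 ("only-in-2", 1, 0)]:
        print(f"{name:10}| {added + removed:6} "
              f"{scale_bar(added, removed, largest=100000)}")
```

With keep_nonzero=False the two small files get an empty bar, which is
the behaviour this report complains about; with the default they get a
single '-' or '+', at the cost of a bar that is off by at most one
character per symbol.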

