On Sat, Jan 02, 2016 at 09:10:27PM -0500, Thomas Dickey wrote:
> On Sat, Jan 02, 2016 at 06:46:52PM +0000, Sandro Tosi wrote:
> > Adding Thomas, the upstream author of diffstat
> > 
> > Hey Thomas,
> > what's your take on this report?
> > 
> > On Sun, Aug 2, 2015 at 6:39 PM, Josh Triplett <[email protected]> wrote:
> > > On Sat, Aug 01, 2015 at 06:33:39PM -0700, Josh Triplett wrote:
> > >> If the diff contains files with very large diffs, diffstat's automatic
> > >> scaling can cause files with small diffs to display zero '-' or '+'
> > >> characters.  This hides key information from the diffstat, namely
> > >> the direction of the diff.  diffstat should always display at least one
> > >> '-' for a file with lines removed, and at least one '+' for a file with
> > >> lines added, regardless of scaling.
> 
> It would introduce more confusion than the existing situation, by
> adding noise to the (of course) rounded display.

I understand that the histogram necessarily quantizes to an integer
number of characters; however, rounding small numbers down to zero
characters hides useful information, namely whether the file had deleted
lines, added lines, or both.  By contrast, rounding any non-zero number
up to at least 1 character makes the histogram marginally
disproportional; however, in any circumstance where that might happen,
some other line must go all the way across the screen, so the
disproportionality seems extremely marginal.  It seems rather unlikely
to me that people would rely on the exactness of the rounding, rather
than just the rough proportions.

For comparison, "git diff --stat" on the same example I gave produces at
least one symbol for every changed file:

 /dev/null => 2/big-file  | 100000 ++++++++++++++++++++++++++++++++++++++++++++
 1/only-in-1 => /dev/null |      1 -
 /dev/null => 2/only-in-2 |      1 +
 3 files changed, 100001 insertions(+), 1 deletion(-)

That output seems more helpful to me: it still conveys a sense of
relative scale, while also noting removal and insertion.

For that matter, if I add a single line to 1/big-file, I get the
following output *from diffstat*:

$ diff -Naur 1 2 | diffstat
 big-file  |100001 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 only-in-1 |    1 
 only-in-2 |    1 
 3 files changed, 100001 insertions(+), 2 deletions(-)

Notice in this case the '-' at the end.  If diffstat strictly followed
proportionality and rounding in its histogram, that single deleted line
would round to zero characters, as it does for only-in-1.  So why does
diffstat round the single-line change up to a single character in this
case, but down to zero characters in the case of the other two files?

- Josh Triplett
