bug#34110: feature request: dual-column du output, showing "real" and "on-disk" sizes (and about that "apparent-size" concept)

René J.V. Bertin Thu, 17 Jan 2019 02:14:18 -0800

On Wednesday January 16 2019 16:06:50 Assaf Gordon wrote:

Hello,


Yes, I used the exact same directory in all comparisons. It's a nodejs cache 
(or whatever) directory as you may have guessed; I picked it because it's a 
good example of the sort of directory found these days which can create 
considerable overhead. Small enough it'd tend to get dismissed as insignificant, 
but containing a large number of files (almost 8000 in my case), most of them 
tiny.

>I hope this helps to clarify "apparent-size".

Yes and no :) I understand what "apparent-size" does (and have dug through the 
code looking for ideas on how to do similar things in one of my own apps).

My whole point is that there might be a better name. I know one should 
distinguish everyday language and technical terms but if the latter start to 
appear (pun intended) like the former (and lack a shorthand) then they'd best 
be chosen such that they don't require thinking about their interpretation.

Paul's comment about not being able to know what happens underneath only makes 
this argument stronger IMHO. On the one hand, du can only report how big an item 
would appear to be on disk (based on what stat() reports). In addition, how 
would it handle knowledge about the number of disks that a given file is 
written to? On the other hand, the actual content size is a given that 
shouldn't change and that is not subject to any existential questions. (Though 
as my examples show, this isn't necessarily true when du'ing directories, and 
esp. so for HFS+ with compression.)
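
To make the two observables concrete, here is a minimal Python sketch of what 
I mean. It is not du's code, and the names sizes() and tree_sizes() are made 
up for illustration. It assumes a POSIX system where st_blocks is counted in 
512-byte units, and unlike du it ignores hard links and the space taken by the 
directories themselves.

```python
import os

def sizes(path):
    """Return (content_size, on_disk_size) in bytes for a single file."""
    st = os.lstat(path)  # lstat: do not follow symlinks
    # st_size is the byte length of the content ("apparent size").
    # st_blocks is the number of allocated 512-byte blocks (assumed POSIX),
    # i.e. the filesystem's own estimate of the space used on disk.
    return st.st_size, st.st_blocks * 512

def tree_sizes(root):
    """Sum both sizes over every regular entry below root (hypothetical helper)."""
    content = disk = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            c, d = sizes(os.path.join(dirpath, name))
            content += c
            disk += d
    return content, disk
```

Calling tree_sizes() on a directory holding thousands of tiny files should 
show the kind of gap between the two totals that I described, though the exact 
numbers depend on the filesystem.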

I realise that you cannot really call the content size observable "real size" 
when reporting from a disk-usage viewpoint, but "content size" (--content-size, 
-C) should be clear enough? "Estimated on-disk size" would be good enough as a 
header for the other observable (an estimate can be 100% accurate after all).

Cheers,
R.
