Bug#436338: df: df rounds sizes incorrectly

Paul E Condon Thu, 23 Dec 2010 11:51:29 -0800

On 20101223_002101, Bob Proulx wrote:
> Paul E Condon wrote:
> > I rather like the idea of having a 5% safety allowance, and having %use
> > report 100% when there is still 25GB available on a 500GB disk. That is
> > explained somewhere and is easy to understand and appreciate. But rm is
> > SLOW on these big disks. I've been watching the progress of rm more
> > often than I'd like,
> 
> I have been noticing that the ext4 w/ fsync fiasco is making
> everything very much slower while saying that it is trying to make
> things faster.  The irony is tragic.  I don't know if that is what you
> are suffering from but it is potentially possible.
> 
> > and I noticed that my mental extrapolations of when the process
> > would be done weren't giving the correct answer, and it was because
> > of this bug, so I report.
> 
> Sometimes it is a bug.  Sometimes it is not.  Thank you for the report
> just the same.
> 
> In any case I apologize for not spending the time to completely
> understand your report before sending my reply.  So often people don't
> take minfree into consideration and so I pointed to the FAQ on the
> topic.  While the other numbers are just reported from statfs the use
> percentage is calculated.  Sorry for getting ahead of myself.
> 
> I agree with your analysis that used / (used + available) in your case
> of 310726424 / ( 310726424 + 144929512 ) = 0.68193 as you reported
> which is not equal to the 69% that the tool emitted.
> 
> I looked at the code and if I am following the correct code path then
> it is basically doing the following:
> 
>   used = f_blocks - f_bfree
>   available_to_root = f_bfree
>   available = f_bavail
>   nonroot_total = used + available
>   u100 = used * 100
>   pct = u100 / nonroot_total + (u100 % nonroot_total != 0)
> 
> Knowing the values returned from the statfs system call would fill in
> the values for f_blocks, f_bfree, and f_bavail and should allow us to
> know how this calculation is processed.
> 
> Again, my apologies for not fully understanding the nature of your bug
> report at that time.
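
The arithmetic in the quoted pseudocode can be checked with the numbers from the report. The sketch below is only an illustration in Python, not df's actual source; the function name use_percent is hypothetical. It assumes the quoted code path is the one df takes. Under that assumption, the "+ (remainder != 0)" term rounds any fractional percentage up, which would turn 68.193% into 69%.

```python
def use_percent(used: int, available: int) -> int:
    """Integer use percentage, rounded up, following the quoted pseudocode.

    Hypothetical helper for illustration; not taken from the df source.
    """
    nonroot_total = used + available
    if nonroot_total == 0:
        raise ValueError("used + available must be non-zero")
    u100 = used * 100
    # Integer division truncates; adding 1 when a remainder exists
    # makes the result a ceiling rather than a floor.
    return u100 // nonroot_total + (u100 % nonroot_total != 0)


# Values taken from the bug report quoted above.
used = 310726424
available = 144929512

print(use_percent(used, available))                  # ceiling of the ratio
print(round(100 * used / (used + available), 3))     # unrounded ratio, in percent
```

With these inputs the first line prints 69 and the second prints 68.193, so the three numbers on the df line would be consistent with a round-up rule rather than with round-to-nearest.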


No problem about delayed understanding. And the bug really is minor. I've
been puzzling about it for a LOOOONG time while waiting for rm to complete.
Finally did enough careful observation to convince myself that it was real.

Actually, I don't use ext4. I am still using ext3. A few months ago I thought
I had a problem with ext3, but the symptoms disappeared while I was trying to
document it, about the time I threw away a bad disk. My guess is that
that disk was corrupting something that made other disks also appear to
be bad. But I didn't dig it out of the trash to pursue that theory.

> 
> > > To see the values that the kernel is returning to df's statfs call
> > > please run the following command and report the contents of the file.
> > > 
> > >   $ strace -v -e trace=statfs -o /tmp/df.strace.out df /dev/sde1
> > 
> > I don't have strace installed on the computer where this is happening.
> > I attempted to install it but the computer crashed while running aptitude.
> 
> Instead of running aptitude (which because of your words makes me
> think it ran out of memory and got the oom killer involved) you could
> copy the strace deb over and then install it directly with dpkg -i
> which would use much less memory and very likely succeed where
> aptitude failed.  You could even help aptitude along with
> 
>   aptitude download strace
> 
> and then dpkg -i strace*.deb at that point.  Just ideas for you.
> 
> Alternatively it would be relatively easy to put together a very small
> C program that printed the results of the statfs call directly.  Or
> perhaps print it from perl's syscall interface.  Please let me know if
> you have too much trouble getting strace installed and I will suggest
> something.
> 
> Bob
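
For readers without strace, the raw counts can also be printed from a short script. The sketch below uses Python's os.statvfs instead of the C program or perl syscall suggested above, so it reports the statvfs view of the filesystem rather than the exact statfs call df makes. The function name statvfs_counts and the default path "/" are assumptions for illustration, and "used" is computed here as total blocks minus free blocks.

```python
import os


def statvfs_counts(path: str) -> dict:
    """Return the block counts reported by statvfs(3) for the given path.

    Hypothetical helper for illustration only.
    """
    st = os.statvfs(path)
    return {
        "f_frsize": st.f_frsize,             # fundamental block size in bytes
        "f_blocks": st.f_blocks,             # total blocks on the filesystem
        "f_bfree": st.f_bfree,               # free blocks, including the root reserve
        "f_bavail": st.f_bavail,             # free blocks available to non-root users
        "used": st.f_blocks - st.f_bfree,    # assumption: used = total - free
    }


# "/" is a placeholder; substitute the mount point of the disk in question.
for name, value in statvfs_counts("/").items():
    print(f"{name:10s} {value}")
```

Feeding the printed "used" and "f_bavail" values into the percentage formula should show whether the reported %use follows from them.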

Before I got this email I had already done a clean install of Squeeze which
seems to have gotten the box working again. It was strange. Investigation
done before the reinstall indicated that the system clock had stopped two
days earlier. Things are working much better now.

So back to the minor bug: My thinking about strace is that it might be
overkill. At some point people operating in user space (me, in
particular, but perhaps you, also) need to trust data returned by the
kernel to a system call. Here we have two kinds of data: disk size, U,
and A are real data. But %use is the result of a trivial calculation
that uses some of these real data as input, and where the result has no
effect on the proper functioning of the kernel. I suspect that the
trivial calculation in the kernel has a silly bug. That it is done
wrong could easily go unnoticed by kernel developers. Such a
calculation should not be done by the kernel. It belongs close to the
formatting code that introduces the '%' character into the output
stream, IMHO.  So, I propose that you ignore the %use number given by
the kernel, and replace it with a calculated value that is consistent
with the other numbers on the line as the line is being formatted. The
problem is more cosmetic than real. Three numbers in a row that
purport to be related by simple calculation, but are not, are ugly.

There is already a situation where the data returned by the kernel is
ignored by the coreutils code: the kernel in recent years has started
returning nine orders of magnitude of sub-second precision that had not
been in the last-modified time before. At least five or six of those
orders of magnitude are utterly spurious and all of them are lost when the
file is written to disk, so the coreutils code ignores them all.

Also, /tmp/df.strace.out is empty after running your suggested diagnostic.
So is the above discussion an instance of sour grapes reasoning?
-- 
Paul E Condon           
pecon...@mesanetworks.net


