Sorry for the resent mail, but it turned out that I accidentally typed "r" instead of "R", and I believe this may be of interest to more than just you, Paul.
Paul Eggert <egg...@cs.ucla.edu> wrote:

> On 01/22/2018 09:47 AM, Joerg Schilling wrote:
> > we are talking about files that do not change while something like TAR
> > is reading them.
>
> It's reasonable for a file system to reorganize itself while 'tar' is
> reading files. Even if a file's contents do not change, its space
> utilization might change. When I use "du" or "df" to find out about
> space utilization I want the numbers now, not what they were last week,
> and this is true regardless of whether I have modified the files since
> last week.

First: "df" output is not related to the stat() data; it is based only on the statvfs() data.

Then, "du" output on Linux seems to have a tradition of being incorrect. I remember that reiserfs returned st_blocks based on a strange "fragment handling" that ignored the fact that st_blocks counts in multiples of DEV_BSIZE (512 bytes) rather than in multiples of the logical block size of reiserfs.

In general, there has been a major change in filesystems since WOFS introduced copy-on-write (COW) 30 years ago: before (when data blocks were always overwritten in place), no basic element of a filesystem was allowed to be larger than the sector size, because otherwise a system or power crash could leave the filesystem in an unrepairable state. With COW, some of the structures are now allowed to be larger than the sector size, and since this includes the "inode" equivalent (called "gnode" on WOFS or "dnode" on ZFS), this structure may be larger than the disk's sector size, as it may be written to the background medium before the switch to the next stable filesystem state happens - given that the related filesystem is organized in a way that this switch is not done by just writing the "inode" equivalent. On WOFS, with its inverted structure, a file goes to its next state by just writing the gnode to the next free gnode location.
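The stat()/statvfs() distinction above can be sketched in a few lines. This is my own minimal illustration, not code from any of the tools discussed; the function names are mine, and the 512-byte st_blocks unit is the traditional DEV_BSIZE on most systems:

```python
import os
import tempfile

DEV_BSIZE = 512  # st_blocks traditionally counts in this unit, not the fs block size

def allocated_bytes(path):
    """Space actually allocated to a file, derived from stat()."""
    st = os.stat(path)
    return st.st_blocks * DEV_BSIZE

def fs_free_bytes(path):
    """df-style free space; this comes from statvfs(), not from stat()."""
    vfs = os.statvfs(path)
    return vfs.f_bavail * vfs.f_frsize

if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as f:
        f.write(os.urandom(100_000))   # incompressible payload
        os.fsync(f.fileno())           # force block allocation
        print("size:", os.stat(f.name).st_size)
        print("allocated:", allocated_bytes(f.name))
        print("fs free:", fs_free_bytes(f.name))
```

A filesystem that reported st_blocks in its own logical block size (the reiserfs mistake above) would make allocated_bytes() come out wrong by a constant factor.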
So WOFS does not allow the gnode to be larger than the sector size, unless there is an extension that allows detecting partially written gnodes as invalid. On ZFS, with a "classical" filesystem structure, the file's next state is reached by writing the dnode, the directory it is in, ... up to the uberblock. So care only needs to be taken with the way the next uberblock location is validated. On ZFS, a dnode definitely could be larger than the sector size, and in theory larger parts of the file's data could be held in the metadata.

If btrfs does not do it this way, returning st_blocks == 0 for a file with DEV_BSIZE or more of data would be wrong. Your claim that reorganizing the filesystem could result in different stat() data being returned applies only in case the file content is moved from logically being file content to logically being file metadata. So in theory, a stat() call could first return st_blocks == 1 and later (when the filesystem knows that the new/whole data of the file fits into the metadata) return st_blocks == 0. It seems, however, that btrfs behaves just the other way round.

BTW: you mentioned that POSIX does not grant many things that people might believe to be required... This in particular is the case for directories. POSIX does not:

- require the directory link count to be its hard link count plus the number of sub-directories. This was an artefact of a design mistake in the 1970s.
- require a directory to be readable, since there is readdir()
- require a directory to return a stat.st_size that depends on its "content".
- require a directory to return "." or ".." with readdir().

WOFS follows only the minimal POSIX requirements for directories. A directory is a special file with size 0 and a link count of 1, unless there is an inode-related link (the equivalent of a hard link) to another directory. The entries "." and ".." are understood by the filesystem's path handling routines, but readdir() never returns these entries.
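The st_blocks == 0 case above is observable from user space. Here is a small heuristic of my own (not from any tool in this thread) for spotting files whose data may live entirely in the filesystem metadata:

```python
import os
import tempfile

def data_possibly_in_metadata(path):
    """Heuristic: a non-empty file reporting zero allocated blocks.

    On a filesystem that stores small files inside the metadata (the
    btrfs/ZFS case discussed above), such a file can show st_size > 0
    together with st_blocks == 0.  A fully sparse file looks the same,
    so this only says "possibly inlined", not "definitely inlined".
    """
    st = os.stat(path)
    return st.st_size > 0 and st.st_blocks == 0

if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as f:
        f.write(os.urandom(8192))  # well above typical inline-data limits
        os.fsync(f.fileno())
        print(data_possibly_in_metadata(f.name))
```

An archiver that trusts st_blocks == 0 to mean "no data" would silently skip such inlined files, which is exactly the hazard for tar-like programs.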
ZFS emulates the historical directory link count from the 1970s but returns stat.st_size as the number of entries readable by readdir(). This usually lets the historic BSD implementation of scandir() fail, as that implementation allocates memory based on the assumption that the minimal stat.st_size of a directory is "number of entries" * "minimal struct dirent size", as on UFS.

Does gtar deal correctly with these constraints?

Jörg

--
EMail:jo...@schily.net (home) Jörg Schilling D-13353 Berlin
joerg.schill...@fokus.fraunhofer.de (work) Blog: http://schily.blogspot.com/
URL: http://cdrecord.org/private/ http://sf.net/projects/schilytools/files/
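The scandir() failure mode above can be illustrated by comparing the historic size-based estimate with the real entry count. This is a sketch in the spirit of the old BSD code, not the actual C implementation; the 24-byte divisor is an assumption about the historic minimal UFS record size:

```python
import os
import tempfile

# Roughly the historic divisor the old BSD scandir() used; treat the
# exact constant as an assumption here.
HISTORIC_MIN_DIRENT = 24

def estimated_entries(dirpath):
    """Array-size guess in the spirit of the historic scandir()."""
    return os.stat(dirpath).st_size // HISTORIC_MIN_DIRENT

def actual_entries(dirpath):
    return len(os.listdir(dirpath))

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        for i in range(50):
            open(os.path.join(d, "f%02d" % i), "w").close()
        print("estimate:", estimated_entries(d), "actual:", actual_entries(d))
```

On a filesystem where st_size is the entry count itself (as described for ZFS above), the estimate collapses to nearly zero while the directory may hold thousands of entries, so a size-based allocation undershoots badly.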