On Sat, Nov 10, 2007 at 09:11:27PM +1300, Richard Toohey wrote:

> On 10/11/2007, at 10:05 AM, Daniel Ouellet wrote:
>
> >Otto Moerbeek wrote:
> >>stat -s gives the raw info in one go. Some shell script hacking
> >>should make it easy to detect sparse files.
> >
> >Thanks Otto for the suggestion. That might help until it can be
> >address for good. It would help speed up some of it. (;>
> >
>
> This looked interesting (curiosity killed the cat?), so I started
> looking at sparse files (not heard of them before.)
>
> Is this a sparse file?

yes.

>
> # dd if=/dev/zero of=sparsefile bs=1024 seek=10240 count=0
> 0+0 records in
> 0+0 records out
> 0 bytes transferred in 0.000 secs (0 bytes/sec)
> # ls -lh
> [--cut--]
> -rw-r--r--  1 root  wheel  10.0M Nov 11 08:43 sparsefile
> # du -hsc sparsefile
> 32.0K   sparsefile
> 32.0K   total
> # du sparsefile
> 64      sparsefile
> # stat -s sparsefile
> st_dev=7 st_ino=51969 st_mode=0100644 st_nlink=1 st_uid=0 st_gid=0
> st_rdev=0 st_size=10485760 st_atime=1194723829 st_mtime=1194723829
> st_ctime=1194723829 st_blksize=16384 st_blocks=64 st_flags=0
>
> So because blocks allocated = 64, and block size is (usually) 512
> bytes => file is 32K (but ls and others will report 10MB size.)
>
> So if you scanned whatever director(y|ies) you are interested in,
>
> If st_size > (st_blocks * 512) Then
>     *** this may be a sparse file?
>
> (BUT - blocksize of 16384 is reported so I must be missing something?)

yeah, look at stat(2):

    int64_t    st_blocks;   /* blocks allocated for file */
    u_int32_t  st_blksize;  /* optimal file sys I/O ops blocksize */

actually st_blocks's unit is disk sectors, to be precise.

I don't read perl, so I cannot comment on the script below.

	-Otto

>
> A stab at it in Perl (lifted from Perl Cookbook):
>
> use strict;
> use warnings;
> use File::Find;
> sub process_file {
>     my $f=$File::Find::name;
>     (my $dev,my $ino,my $mode,my $nlink,my $uid,my $gid,my $rdev,
>      my $size,my $atime,my $mtime,my $ctime,my $blksize,my $blocks)
>         =stat($f);
>     if ($blocks * 512 < $size) {
>         print "\t$f => SZ: $size BLSZ: $blksize BLKS: $blocks\n";
>         print "\t" . -s $f;
>         print "\n";
>     }
> }
> find(\&process_file,("/home/sparse-files"));
>
> The output is:
>
> # perl check.pl
>     /home/sparse-files/sparsefile => SZ: 10485760 BLSZ: 16384 BLKS: 64
>     10485760
>
> Thanks.
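
For the archives, the st_size versus st_blocks comparison discussed above can also be done with the "shell script hacking" route, since stat -s prints its fields as shell variable assignments. What follows is only a minimal sketch, not a tested tool: the function names may_be_sparse and scan_sparse are made up for illustration, it assumes a BSD stat(1) with the -s flag, and it assumes st_blocks counts 512-byte sectors as described above. It is a heuristic, so treat a hit as "may be sparse", nothing more.

```shell
#!/bin/sh
# Sketch only: flag files whose allocated sectors cover less than st_size.
# Assumptions: BSD stat(1) with -s, and st_blocks in 512-byte units.

# may_be_sparse STAT_OUTPUT   (hypothetical helper name)
# STAT_OUTPUT is the text printed by `stat -s file`.
# Exit status 0 means st_blocks * 512 < st_size.
may_be_sparse() {
    (
        # The subshell keeps the st_* variables out of the caller's scope.
        eval "$1"
        [ $((st_blocks * 512)) -lt "${st_size:-0}" ]
    )
}

# scan_sparse FILE...   (hypothetical helper name)
# Prints the name of each argument that looks sparse.
scan_sparse() {
    for f in "$@"; do
        if may_be_sparse "$(stat -s "$f")"; then
            echo "$f"
        fi
    done
}
```

Something like `scan_sparse /home/sparse-files/*` would then list the candidates. The comparison lives in its own function so it can be checked against a saved line of stat -s output without touching the file system.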