On Sat, Nov 10, 2007 at 09:11:27PM +1300, Richard Toohey wrote:
> On 10/11/2007, at 10:05 AM, Daniel Ouellet wrote:
> 
> >Otto Moerbeek wrote:
> >>stat -s gives the raw info in one go. Some shell script hacking
> >>should make it easy to detect sparse files.
> >
> >Thanks Otto for the suggestion. That might help until it can be
> >addressed for good. It would help speed up some of it. (;>
> >
> 
> This looked interesting (curiosity killed the cat?), so I started
> looking at sparse files (I had not heard of them before).
> 
> Is this a sparse file?

yes.

> 
> # dd if=/dev/zero of=sparsefile bs=1024 seek=10240 count=0
> 0+0 records in
> 0+0 records out
> 0 bytes transferred in 0.000 secs (0 bytes/sec)
> # ls -lh
> [--cut--]
> -rw-r--r--  1 root  wheel  10.0M Nov 11 08:43 sparsefile
> # du -hsc sparsefile
> 32.0K   sparsefile
> 32.0K   total
> # du sparsefile
> 64      sparsefile
> # stat -s sparsefile
> st_dev=7 st_ino=51969 st_mode=0100644 st_nlink=1 st_uid=0 st_gid=0  
> st_rdev=0 st_size=10485760 st_atime=1194723829 st_mtime=1194723829  
> st_ctime=1194723829 st_blksize=16384 st_blocks=64 st_flags=0
> 
> So because blocks allocated = 64, and block size is (usually) 512
> bytes => file is 32K (but ls and others will report a size of 10MB).
> 
> So if you scanned whatever director(y|ies) you are interested in,
> 
>       If st_size > (st_blocks * 512) Then
>               *** this may be a sparse file?
> 
> (BUT - blocksize of 16384 is reported so I must be missing something?)

yeah, look at stat(2):

 int64_t    st_blocks;  /* blocks allocated for file */
 u_int32_t  st_blksize; /* optimal file sys I/O ops blocksize */

Actually, to be precise, st_blocks is counted in disk sectors (512 bytes
each), not in units of st_blksize.
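
With the numbers above: 64 blocks * 512 bytes = 32768 bytes, which is
the 32.0K that du -h reports, against an st_size of 10485760.

The shell hacking could look roughly like the sketch below. It is only
a sketch: it assumes st_blocks is in 512-byte units and that stat -s
prints shell assignments as in the output quoted above. The function
names is_sparse and check_file are made up for illustration, and file
names containing newlines are not handled.

```shell
#!/bin/sh
# is_sparse SIZE BLOCKS
# Succeeds when fewer bytes are allocated than the file size claims.
# Assumption: BLOCKS (st_blocks) is counted in 512-byte units.
is_sparse() {
    [ "$(( $2 * 512 ))" -lt "$1" ]
}

# check_file PATH
# Reads st_size and st_blocks through "stat -s" (BSD stat) and prints
# a line for files that may be sparse.
check_file() {
    info=$(stat -s "$1") || return
    eval "$info"
    if is_sparse "$st_size" "$st_blocks"; then
        echo "$1: st_size=$st_size allocated=$(( st_blocks * 512 ))"
    fi
}

# Scan every directory given on the command line, if any.
for dir in "$@"; do
    find "$dir" -type f | while IFS= read -r f; do
        check_file "$f"
    done
done
```

Saved as, say, sparse.sh, it would be run as "sh sparse.sh
/home/sparse-files". A hit only means the file may be sparse, same as
the test in the quoted mail.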

I don't read perl, so I cannot comment on the script below.

        -Otto
> 
> A stab at it in Perl (lifted from Perl Cookbook):
> 
> use strict;
> use warnings;
> use File::Find;
> sub process_file {
>         my $f = $File::Find::name;
>         # stat() returns a 13-element list; only size and blocks matter here.
>         my ($dev, $ino, $mode, $nlink, $uid, $gid, $rdev, $size,
>             $atime, $mtime, $ctime, $blksize, $blocks) = stat($f);
>         return unless defined $blocks;  # stat failed
>         # st_blocks is in 512-byte units; fewer allocated bytes than
>         # st_size means the file may be sparse.
>         if ($blocks * 512 < $size) {
>                 print "\t$f => SZ: $size BLSZ: $blksize BLKS: $blocks\n";
>                 print "\t" . -s $f;
>                 print "\n";
>         }
> }
> find(\&process_file, ("/home/sparse-files"));
> 
> The output is:
> 
> # perl check.pl
>         /home/sparse-files/sparsefile => SZ: 10485760 BLSZ: 16384 BLKS: 64
>         10485760
> 
> Thanks.
