Uri Guttman wrote:
>>>>>> "SB" == Steve Bertrand <st...@ibctech.ca> writes:
>
>  SB> Perhaps there is a Perl way to do it, but otherwise, for 250GB of data,
>  SB> research dump/restore, and test it out (after making a backup).
>
>  SB> imho, you shouldn't use another layer of abstraction for managing such a
>  SB> large volume of data, unless you are attempting to create some sort of
>  SB> index for it.
>
> i want to back up steve here. no way perl will ever handle that much
> data in anything like the time a dedicated dump/rsync/etc could
> do. those are optimized and written in c just for that job. perl would
> be massively slower. it should be easy enough to just benchmark perl's
> File::Copy vs unix cp on a large file. multiply that by that many files
> and you will easily see the problem here.
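(For reference, the quick single-file check Uri suggests might be sketched roughly as follows; the test file, its size, and the run count are assumptions for illustration, not details from the thread.)

#!/usr/bin/perl

# Sketch of the single-file comparison suggested above: Perl's File::Copy
# vs. the system cp on one large file. File name, size, and run count are
# assumptions made up for this example.

use warnings;
use strict;

use File::Copy qw( copy );
use Benchmark qw( :all );

my $src = './bigfile';        # e.g. created beforehand with: mkfile 100m ./bigfile
my $dst = './bigfile.copy';

my $results = timethese( 20, {
        'perl-copy' => sub { copy( $src, $dst ) or die "copy failed: $!" },
        'unix-cp'   => sub { system( 'cp', $src, $dst ) == 0 or die "cp failed: $?" },
    } );

# cp runs in a child process, so its work shows up in the child CPU columns;
# compare the wallclock figures rather than the CPU-based percentages.
cmpthese $results;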
After I received Uri's post to the list, I briefly removed myself from what I was doing and wrote the following code (s/ode/rap). I wanted to test it for myself. I don't remember the last time I used backticks, but I did here, just to see what would happen. The results of the benchmark follow the __END__.

#!/usr/bin/perl

use warnings;
use strict;

use File::Copy::Recursive qw( dircopy );
use Benchmark qw( :all );

my $directory = './files';
my $backup    = './backup';

mkdir $backup if ! -e $backup;

generate_files() if ! -e $directory;

my $results = timethese( 10, {
        'rsync'   => sub { `rsync -arc $directory $backup` },
        'perl-cp' => sub { dircopy( $directory, $backup ) },
    } );

cmpthese $results;

sub generate_files {

    mkdir $directory if ! -e $directory;

    my $file_size = '1m';

    for my $ext ( 1..1000 ) {
        my $file_to_create = "file.${ext}";
        `mkfile $file_size $directory/$file_to_create`;
    }
}

__END__

amanda# ./bench.pl

Benchmark: timing 10 iterations of perl-cp, rsync...
   perl-cp: 418 wallclock secs ( 3.15 usr 34.51 sys +  0.00 cusr  0.01 csys = 37.66 CPU) @  0.27/s (n=10)
     rsync: 493 wallclock secs ( 0.00 usr  0.00 sys + 67.80 cusr 19.63 csys = 87.44 CPU)

             s/iter               perl-cp                 rsync
perl-cp        3.77                    --                 -100%
rsync      1.00e-16  3765625000000000000%                    --

From what I can tell, if I'm interpreting the results correctly, it appears as though rsync does a bit better. The data was (as noted in the code) 1000 1 MB files, all located within a single directory.

I ran this test ranging from count 1 through count 20, and the results were essentially the same.

In essence, unless testing rsync within Perl is causing mixed results, don't use Perl to back up or copy large amounts of data, period.

Steve
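One caveat on reading the cmpthese output above: the rsync case runs in a child process via backticks, so almost none of its work is charged to the parent's CPU times, which is likely why rsync shows an implausible 1.00e-16 s/iter and an astronomical percentage. A rough sketch that times each alternative by elapsed wall time instead (reusing the directory names from the script above; the time_wallclock helper is invented for this example) might look like this:

#!/usr/bin/perl

# Wallclock-only variant of the same comparison: a sketch, not a drop-in
# replacement for the script above. Each alternative is timed by elapsed
# wall time, so work done in child processes still counts.

use warnings;
use strict;

use File::Copy::Recursive qw( dircopy );
use Time::HiRes qw( gettimeofday tv_interval );

my $directory = './files';
my $backup    = './backup';

sub time_wallclock {
    my ( $label, $count, $code ) = @_;
    my $t0 = [ gettimeofday ];
    $code->() for 1 .. $count;
    printf "%-8s %.2f wallclock secs for %d runs\n",
        $label, tv_interval( $t0 ), $count;
}

time_wallclock( 'rsync',   10, sub { system 'rsync', '-arc', $directory, $backup } );
time_wallclock( 'perl-cp', 10, sub { dircopy( $directory, $backup ) } );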