John W. Krahn <[EMAIL PROTECTED]> writes:
(Since the topic is no longer readdir, I've started a new thread. -hp)
[...] I've snipped the many helpful tips that I did understand and took note of.
Now for the part I didn't get, but first a restatement of the goal for context.
(The usage section explains the general goal.)
>> Purpose: To copy files from a list on disk to the directory of
>> choice. Numbering them with nonClobber names as we go (even if some
>> numbered files are already in the directory).
>> Usage: \`$myscript TARGETDIRECTORY SOURCEFILE'
>> (DIRECTORY= name of target directory)
>> NOTE: SOURCEFILE is likely to have been compiled like this:
>> grep -rl 'REGEX' SOMEDIRECTORY >SOURCEFILE (where
>> SOMEDIRECTORY is a directory full of email or posts). We are looking
>> to skim off those with REGEX placing the filenames in SOURCEFILE, and
>> copy them to another directory using \`$myscript ', so as to leave
>> mail or news directory unharmed.
>>
>> EOM
>> }
>
> If you want to incorporate the grep into the perl program then this may
> work (UNTESTED):
It works with one change and one caveat. The caveat is that the path
names given in @ARGV must be absolute or the program fails. I think that
has something to do with File::Find's built-in behaviour of cd-ing into
the directories it searches.
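A minimal sketch of one way around the caveat, assuming it really is File::Find's chdir that trips up relative paths: convert the directory arguments to absolute paths with File::Spec->rel2abs before calling find. The helper name abs_dirs and the example directory names are hypothetical and not part of John's program.

```perl
#!/usr/bin/perl
use warnings;
use strict;
use File::Spec;

# Hypothetical helper (not in John's program): turn each directory
# argument into an absolute path, so it still names the same place
# after File::Find has chdir'ed somewhere else.
sub abs_dirs {
    return map { File::Spec->rel2abs($_) } @_;
}

# Made-up relative names standing in for $ARGV[1] and $ARGV[2].
my ( $SrcDir, $TrgDir ) = abs_dirs( 'mail/inbox', 'keep' );
print "$SrcDir\n$TrgDir\n";
```

File::Find also accepts a no_chdir option, as in find( { wanted => sub { ... }, no_chdir => 1 }, $SrcDir ), but then $_ holds the full path rather than the bare file name, so the /\A\d+\z/ test would have to be adjusted.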
John's program, with a few questions and annotations
(stripping the normal reply (>) indicators):
#!/usr/bin/perl
use warnings;
use strict;
use File::Find;
( my $myscript = $0 ) =~ s!\A.*/!!;
@ARGV == 3
    or die "usage: $myscript 'REGEX' SOMEDIRECTORY TARGETDIRECTORY\n";
my $regex = qr/$ARGV[0]/;
my ( $SrcDir, $TrgDir ) = @ARGV[ 1, 2 ];
## [HP I would have thought perl would still treat the entries in @ARGV
## as file names, since the array is not emptied by the above
## operations, but apparently that is not a problem.
## ]
my ( $OldCnt, $Num );
## [HP $Num needs to be set to zero, because otherwise the comparison
## in the loop below will eventually throw this warning:
## Use of uninitialized value in numeric lt (<) at ./Krahn.pl line 20
$Num = 0;
## ]
$OldCnt = 0;    # likewise, so the final report never prints an undefined value
{   opendir my $dh, $TrgDir or die "Cannot open '$TrgDir' $!";
    while ( my $file = readdir $dh ) {
        next unless -f "$TrgDir/$file" and $file =~ /\A\d+\z/;
        $Num = $file if $Num < $file;
        $OldCnt++;
    }
}
## [HP I don't really understand the format of enclosing the `opendir'
## and its processing in {}. I don't think I've noticed that before.
## But the clause is nice and tidy. Thanks.
## ]
my $CopyCnt = 0;
find sub {
    return unless -f and /\A\d+\z/;
    open my $fh, '<:raw', $_
        or die "Cannot open '$_' $!";
    my $size = -s $fh;
    $size == read $fh, my $data, $size
        or die "Cannot read '$_' $!";
    return unless $data =~ $regex;
    ## [HP I don't really understand how `return' operates here. Is it
    ## doing the same job as `next'?
    ## ]
    $Num++;
    open my $out, '>:raw', "$TrgDir/$Num"
        or die "Cannot open '$TrgDir/$Num' $!";
    print $out $data
        or die "Cannot print to '$TrgDir/$Num' $!";
    # Close explicitly so a failed write (e.g. a full disk) is reported.
    close $out
        or die "Cannot close '$TrgDir/$Num' $!";
    $CopyCnt++;
}, $SrcDir;
## [HP I don't really understand why some of the processing is done. I
## hadn't seen `:raw' used before, but apparently the :raw part is
## there to handle the possibility of different line endings in the
## files. Then the size is checked, apparently to ensure that the
## number of bytes `read' returns matches the size reported by -s.
## And then the data is `printed' to its new home instead of the file
## just being copied there. Why is that being done?
## ]
my $NewCnt = 0;
{   opendir my $dh, $TrgDir or die "Cannot open '$TrgDir' $!";
    while ( my $file = readdir $dh ) {
        next unless -f "$TrgDir/$file" and $file =~ /\A\d+\z/;
        $NewCnt++;
    }
}
## [HP Again, I don't really understand why this
## is enclosed in {} brackets. And again, it is nice and tidy,
## succinct.
## ]
## Report how many files were copied here
## Left as an exercise for the OP :-)
## [HP I can't complain about the difficulty of
## the task ... hehe
## ]
print "<$CopyCnt> files copied from:
$SrcDir
to
$TrgDir
<$NewCnt> files now in: $TrgDir\n";
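On two of the questions in the annotations above, here is a small experiment, offered as a sketch rather than a definitive answer. As far as I can tell, a lexical handle declared inside a bare { } block is closed automatically when the block ends, and `return' inside the sub passed to find only ends that one call, so find carries on with the next entry, much as `next' would in a loop. The scratch directory, the file names, and the variables $count and @seen are made up for the demonstration.

```perl
#!/usr/bin/perl
use warnings;
use strict;
use File::Find;
use File::Temp qw(tempdir);

# Scratch directory with three made-up files: two numeric names, one not.
my $dir = tempdir( CLEANUP => 1 );
for my $name (qw(1 2 notes)) {
    open my $fh, '>', "$dir/$name" or die "Cannot open '$dir/$name' $!";
    print $fh "x\n";
    close $fh or die "Cannot close '$dir/$name' $!";
}

# Bare block: $dh exists only inside the braces, and the directory
# handle is closed when the block ends, so no closedir is needed.
my $count = 0;
{
    opendir my $dh, $dir or die "Cannot open '$dir' $!";
    while ( defined( my $file = readdir $dh ) ) {
        $count++ if $file =~ /\A\d+\z/;
    }
}

# `return' leaves the callback for the current entry only; find then
# calls the sub again for the next entry, much like `next' in a loop.
my @seen;
find sub {
    return unless -f and /\A\d+\z/;
    push @seen, $_;
}, $dir;

print "readdir counted $count, find kept @{[ sort @seen ]}\n";
```

Both passes skip the file called notes and keep the two numerically named files, which suggests that `return' here does the same job for find's callback that `next' does in the readdir loop.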