On 5/6/05, macromedia wrote:
>
> Hello,
>
> I'm not sure about the logic or how I should approach what I want to
> achieve. I require a script that does the following.
>
> 1. Search a directory and all sub-directories for a certain file
> extension. (Ex. .txt)
>
> 2. Get the results of the above search and check "inside" each file for
> a string match. (Ex. "123").
>
> 3. List the matches found into an external file for further processing.
>
> What features should I be looking at to achieve this? Anyone know of
> some good examples that I could pick away at?
>

I don't know if it's good (haven't tested the code!) but here's one
example:

################# begin code
use strict;
use warnings;
use File::Find;
my $log_file = "log_file.txt";
open(OUT,">$log_file") or
  die "Couldn't open $log_file for writing: $!\n";
find(\&wanted, "dir_name");
close(OUT) or
  die "Couldn't close $log_file after writing: $!\n";
sub wanted {
  if (m/\.txt$/) {
    open(IN,$_) or   # File::Find has chdir'ed into this directory, so open by bare name
      die "Couldn't open $File::Find::name for reading: $!\n";
    while(defined(my $line=<IN>)) {
      if ($line =~ m/123/) {
        print OUT "$File::Find::name : $line";
      }
    }
    close(IN) or
      die "Couldn't close $File::Find::name after reading: $!\n";
  }
}
################# end code

Copy-paste this code (not including the "########" lines) into a text
editor that can show you line numbers, since I'll be mentioning them in
the following explanation:

* Lines 1-2 are a must in every Perl program you write. Don't you dare
  write another script without them, or someone from this list will
  have to whip you with a differential SCSI cable ;-)
  Read more about these pragmas here:
  http://perldoc.perl.org/strict.html
  http://perldoc.perl.org/warnings.html
  Basically they help you write better code which is (hopefully) less
  buggy. Get used to writing them at the top of every Perl program.

* Line 3 "use"s the File::Find module, allowing us to call the "find"
  function that it supplies (this is used later on, in line 7).
  Read here about "use":
  http://perldoc.perl.org/functions/use.html
  and here about the File::Find module:
  http://perldoc.perl.org/File/Find.html

* Line 4 defines a variable holding the name of the log file, the
  external one you mentioned in point (3). Feel free to change the
  file name :-)

* Lines 5-6 are actually one Perl command (notice there is only a ";"
  at the end of line 6) that is built of 3 basic parts:
  1. An "open" command. Read:
     http://perldoc.perl.org/functions/open.html
     for more info about this function.
  2. An "or" operator.
     If the left side of it (the "open" function) succeeds, it will
     return true, and the right side will not be evaluated at all.
     However, if the "open" fails, it will return a false value
     (undef), and the right side will be evaluated by the Perl
     interpreter.
  3. A "die" statement, because if the "open" fails, we want to punish
     the script by causing it to die a horrible death :-)
     You can read more about "die" here:
     http://perldoc.perl.org/functions/die.html

* Line 7 does the actual work. It calls "find", which in turn calls the
  "wanted" subroutine ("subroutine == function" in Perl, BTW) which we
  wrote to do the job. "find" will start at the root directory we
  provide (again, change this to your actual root directory, full path
  and all) and run the wanted sub for every file and directory at or
  below that directory.

* Lines 8-9 - once we finish doing the work, we want to close the
  output filehandle. It will close anyway when the script ends, but I
  like to keep things neat :-)
  Read more about "close" here:
  http://perldoc.perl.org/functions/close.html

* Line 10 - defines the wanted subroutine. Read about Perl subroutines
  here:
  http://perldoc.perl.org/perlsub.html
  Line 22 is where the closing brace that matches the opening one on
  line 10 is.

* Line 11 - uses the "if" flow control statement, the matching operator
  "m//" and a simple RE inside this operator.
  You can read about "if" at:
  http://perldoc.perl.org/perlsyn.html
  You can read about the "m//" operator at:
  http://perldoc.perl.org/perlop.html
  You can read about regular expressions here:
  http://perldoc.perl.org/perlre.html
  Basically, what happens is that File::Find passes the current file
  name (just the name, without the directory part) into the special
  variable "$_" (read http://perldoc.perl.org/perlvar.html) each time
  it calls "wanted". By default it has also changed into the directory
  containing that file, and the full path is available in
  $File::Find::name. We match against the file name using "m//" and
  look for the pattern "a literal period followed by 't' followed by
  'x' followed by 't' which is at the end of the string".

* Lines 12-13 - if the file is a ".txt" file, we want to open it for
  reading, look at its contents for the required string and print each
  line containing such a string to the log file. These 2 lines take
  care of the "open" part. The idea is the same as lines 5-6, except
  for the file name and the fact that we are opening it for reading,
  not writing (hence no ">" part). I'll let you figure out the other
  details for yourself.

* Line 14 - reads the file associated with the IN filehandle one line
  at a time, saving each line to the $line variable. For each such
  line, it will run the code delimited by the curly braces which start
  at line 14 and close at line 18. Read the following pages for
  details:
  http://perldoc.perl.org/perlsyn.html
  (about the "while" flow control keyword)
  http://perldoc.perl.org/functions/defined.html
  (about the "defined" function)
  http://perldoc.perl.org/perlop.html#I-O-Operators
  (about the "<FILEHANDLE>" construct)

* Line 15 - this time we want to match against the $line variable, not
  $_, so we use the binding operator "=~". Read:
  http://perldoc.perl.org/perlop.html
  for details. The RE itself is the simplest possible - a plain
  literal string.

* Line 16 - finally, if the current file was a ".txt" file and the
  current line matched the string "123", we "print" the string
  "$File::Find::name : $line" to the file associated with the OUT
  filehandle. For more about "print" read:
  http://perldoc.perl.org/functions/print.html

HTH,
--
Offer Kaye
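
P.S. Once the walkthrough above makes sense, here is one possible
variation to pick away at. Treat it as a sketch under a few
assumptions, not the one true way: the sub name "grep_tree" and its
three parameters are made up for illustration, it uses lexical
filehandles and the three-argument form of "open" instead of the
bareword OUT/IN handles, it opens "$_" because File::Find changes into
each directory by default, and it uses "index" so that the search
string is treated as plain text rather than as a regular expression.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper (not part of the script above): returns a list of
# "path : line" strings, one for every line containing $string in the
# files under $dir whose names end in $ext.
sub grep_tree {
    my ($dir, $ext, $string) = @_;
    my @hits;
    find(sub {
        # File::Find has chdir'ed into the current directory by default,
        # so $_ (the bare file name) is what we test and open here.
        return unless -f $_ && m/\Q$ext\E$/;
        open(my $in, '<', $_)
            or die "Couldn't open $File::Find::name for reading: $!\n";
        while (defined(my $line = <$in>)) {
            # index() does a literal substring search, no regex involved
            push @hits, "$File::Find::name : $line"
                if index($line, $string) >= 0;
        }
        close($in)
            or die "Couldn't close $File::Find::name after reading: $!\n";
    }, $dir);
    return @hits;
}

# Usage with the same placeholder names as the script above;
# skipped quietly if "dir_name" does not exist.
if (-d "dir_name") {
    my $log_file = "log_file.txt";
    open(my $out, '>', $log_file)
        or die "Couldn't open $log_file for writing: $!\n";
    print {$out} grep_tree("dir_name", ".txt", "123");
    close($out)
        or die "Couldn't close $log_file after writing: $!\n";
}
```

Collecting the matches in a list first keeps the searching separate
from the logging, at the cost of holding all matching lines in memory,
so for very large trees printing from inside the callback (as the
original script does) may be the better trade-off.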