Re: One last question

david Tue, 17 Sep 2002 12:36:37 -0700

Anthony Saffer wrote:

> Hello AGAIN,
> 
> I have one final question that I think will set me free from this coding
> haze I've been in all day. Please look at the code below. Here is the idea
> I am trying to implement:
> 
> I have a text file with a list of about 56,000 filenames. Only the
> filenames are in this file.  I have another 30,000 or so .cfm and .htm
> files. I want to use File::Find to cycle through EVERY file in EVERY
> directory line by line (about 2 million lines in all). Every time it comes
> across a reference to one of the 56,000 files I have in the list in the
> htm or cfm file it needs to replace it with a lowercase version of it. Not
> touching ANYTHING else.
> 
> I know it's going to take regular expressions. This is where I am totally
> lost. Could someone give me some hints. Please don't provide me with ready
> made code as I won't really learn that way. But an idea on what I need to
> do would be very helpful.
> 
> Thanks!
> Anthony
> 
> CODE BELOW:
> #!/usr/bin/perl -w
> use strict;
> use File::Find;
> 
> sub process_files{
>  open($FH, "< $_") or die("Error! Couldn't open $_ for reading!! Program
>  aborting.\n"); open($MATCH, "< /home/losttre/match.txt") or die("Error!
>  Couldn't open $MATCH for reading!\n"); open($TEMP, "./temp.dat") or die
>  ("Couldn't open temp file! Aborting\n");
>  
>  @MATCH = <MATCH>;
>  @fcontents = <FH>;
>  
>  foreach $lineitem (@MATCH){
>   foreach $lineitem2 (@fcontents){
>    if($lineitem == i/$lineitem2/){
>         #I ASSUME THIS IS WHERE MY MATCH WOULD HAPPEN AND I NEED TO
>         #REPLACE THE STRING
>     }
> }
> 
> NOTE: Yes, I am aware there are a lot of syntax and other problems with
> this code. I can probably correct those but I am totally lost on the
> matching.


searching a large array is slow: every lookup has to scan the list, while
a hash lookup takes roughly constant time. you should consider using a
hash instead. assume you have your 56,000 filenames in the 'master.txt'
file and you want to search the '/searchable' directory (and all its
subdirectories) for a match:

#!/usr/bin/perl -w
use strict;
use File::Find;

my %master;

#-- first load master.txt into a hash, one filename per line:
#--
open(my $master_fh, '<', 'master.txt') || die "can't open master.txt: $!";
while(<$master_fh>){
        chomp;
        $master{$_} = 1;
}
close($master_fh);

#-- now traverse the '/searchable' directory for a match
#--
find(\&process,'/searchable');

sub process{

        #-- assume the filenames in master.txt are relative (no directory)
        #-- if that's not the case, $File::Find::dir can help prefix $_
        #-- (return, not next: this is a sub, not a loop body)
        return unless(exists $master{$_});

        #-- found a match:
        #--
        #-- $_ is the matching filename
        #-- $File::Find::dir is the directory $_ resides in
        #-- $File::Find::name is the full path
        #--
        #-- do whatever you want to do such as doing a rename()
        #-- like you plan
}

__END__

this way, your script will spend most of its time traversing the
directories instead of finding matches within the directories, because
each check against %master is a single hash lookup rather than a scan of
56,000 entries.
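
the same hash idea can also be applied to the text inside each file, which
is closer to what you described. the following is only a rough sketch, not
a finished solution. it assumes the hash keys are stored in lowercase, that
a filename is made up of word characters, dots and hyphens, and it uses
made-up sample names (Index.HTM, Contact.cfm) and a hypothetical helper
called lowercase_refs() in place of your real data:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# hypothetical sample data; in the real script these keys would be read
# from master.txt and lowercased so the lookup ignores case.
my %master = map { lc($_) => 1 } qw(Index.HTM Contact.cfm);

# lowercase every filename-like token found in %master and leave all
# other text in the line untouched.
sub lowercase_refs {
    my ($line) = @_;
    # assumption: a filename consists of word characters, dots and hyphens
    $line =~ s/([\w.-]+)/exists $master{lc($1)} ? lc($1) : $1/ge;
    return $line;
}

print lowercase_refs(qq{<a href="Index.HTM">Home</a> <IMG SRC="Logo.GIF">\n});
```

each candidate token costs one hash lookup, so the work per line does not
grow with the size of the list. a reference followed directly by a period
or containing other characters would need a different pattern.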

david
