I want to know if doing something like what is in the code below would
be expensive or for some other reason a bad choice.

There is more code that either feeds the `find()' function or further
processes its results.  The code is neither finished nor tested; it is
only meant to show what I'm trying to do.

There are comments, probably unnecessary, to show intent.

I want to be able to pass in a regex to find specific directories and
a regex to find things in the text of numeric named files in those
directories.

The arguments will probably be passed in with the standard getopts, or
just a shift of two expected args.  That part is not what I'm asking about.

I'm more concerned with how the code would plow through a directory
hierarchy. 

This would be in a directory hierarchy that would contain many levels
such as a News hierarchy, where each segment of a newsgroup name is a
level that may contain many branches and possibly many thousands of single
messages in each level and branch.

For example, you might pass in the regex `linux\.' to search only the
newsgroups below linux in the hierarchy for the text_regex that might
be in the files there.

The idea being to allow you to focus a search without having to know
the exact name of the newsgroup[s]. You would at least be searching a
group with the string `linux.' in it.

So what I'm curious about is whether it would be good to bail out early
if $File::Find::dir does not match linux\.

Like:
  return if $File::Find::dir !~ /$dir_rgx/;

or, like I've done in the code below, just let the dir_rgx be a selector
and not worry about moving on to the next entry immediately.
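
A minimal sketch of the early-exit test, for illustration only.  The
negated match is written with `!~' because `!' binds more tightly than
`=~', so `! $dir =~ /.../' would negate $dir first and then match
against that result.  Inside the sub handed to find(), a `return' is
the usual way to leave early.  The helper name skip_dir and the sample
group names are made up for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: true when a directory should be skipped because
# it does not match the selector regex.
sub skip_dir {
    my ($dir, $dir_rgx) = @_;
    # `!~' negates the match itself, not the string being matched.
    return $dir !~ /$dir_rgx/;
}

my $dir_rgx = qr/linux\./;
for my $dir ('comp.os.linux.misc', 'comp.lang.perl.misc') {
    printf "%-22s %s\n", $dir, skip_dir($dir, $dir_rgx) ? 'skip' : 'search';
}
```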

I've thought about using `stat' to allow only directories into the
first directory based test as a further way to help focus things.  But
I'm not sure any of this will help speed things up.
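
One possible way to run the directory test once per directory rather
than once per file is the `preprocess' option of File::Find, which is
called with the list of names read from each directory before the
wanted sub sees them.  The sketch below is untested against a real
spool and rests on assumptions: the throwaway directory layout, the
`/'-separated selector regex, and the idea that message files are
numeric while group segments are not are all invented for the example.

```perl
use strict;
use warnings;
use File::Find;
use File::Path qw(make_path);
use File::Temp qw(tempdir);

# Build a tiny throwaway spool so the sketch runs on its own.
# All names below are made up for illustration.
my $top = tempdir(CLEANUP => 1);
for my $group ('comp/os/linux/misc', 'comp/lang/perl') {
    make_path("$top/$group");
    for my $n (1, 2) {
        open(my $out, '>', "$top/$group/$n") or die "Can't write: $!";
        print {$out} "Subject: test $n\nkernel panic in message $n\n";
        close($out);
    }
}

my $dir_rgx  = qr{linux(?:/|$)};   # assumed selector; paths use '/' here
my $text_rgx = qr/kernel/;
my @hits;

find(
    {
        # Called once per directory, before wanted() sees any entry.
        preprocess => sub {
            return @_ if $File::Find::dir =~ $dir_rgx;
            # Not a selected directory: keep only non-numeric names so
            # find() can still descend, but message files are dropped.
            return grep { !/^\d+$/ } @_;
        },
        wanted => sub {
            return unless /^\d+$/ && -f $_;
            open(my $fh, '<', $_) or die "Can't open $File::Find::name: $!";
            while (my $line = <$fh>) {
                push @hits, "$File::Find::name: $line" if $line =~ $text_rgx;
            }
            close($fh);
        },
    },
    $top
);

print sort @hits;
```

Whether this is actually faster than testing $File::Find::dir inside
the wanted sub would have to be measured on a real hierarchy.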

I'm not even sure File::Find is the best way to do this.

Most likely there is still another way of doing this that is faster
or better coded.

The search is bound to be a bit slow but here is a place where coding
for speed might really make a difference.

-------        ---------       ---=---       ---------      -------- 
  use strict;
  use warnings;
  use File::Find;
  
  
  [...]
  
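  find(
    sub {
      ## if we have a directory name that matches
      if($File::Find::dir =~ /$dir_rgx/){
        ## if that directory has files with all numeric names
        if(/^\d+$/){
          ## Open the files and search for a regex in the text
          ## ($_ is the bare name; find() has chdir'ed into the directory)
          open(my $fh, '<', $_)
                or die "Can't open $File::Find::name: $!";
          ## read into $line so find()'s own $_ is left alone
          while(my $line = <$fh>){
            if($line =~ /$text_rgx/){
              print $line;
            }
          }
          close($fh);
        }
      }
    },
    $top_dir  ## hypothetical name: top of the hierarchy, set in elided code
  );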

  [...]

