Re: matching values of one hash to another

Harry Putnam Thu, 29 Apr 2010 19:09:27 -0700

Jim Gibson <jimsgib...@gmail.com> writes:

> The general advice is not to worry about execution time until it becomes an
> issue. In other words, just get your program to do what you want it to do,
> then try to speed it up only if it is taking too long.


With that in mind... I'm back to spinning through the second hash for
every line of the first.

But it does print the information I wanted.  I have NOT yet done any
checking to see if there are problems with the matching.

This script seems to come at least close to listing every file in
hash2 whose name matches a filename in hash1.  But I still haven't
found a better way to find all the matches than looping through
hash2 for each entry in hash1.

I tried a few different formulations with `if (exists [...]' but I
didn't see how to get multiple matches.

I'm pretty sure it's because of my failure to understand John K's
technique for finding them.

Maybe some of you can see how to improve this code.  I only included a
few selections of sample output since it is quite extensive.

Just for the heck of it, I did make some effort to get an idea of
run time, using the unix `time' command:

time hpex3 ./dir1 ./dir2
real    0m5.386s
user    0m2.108s
sys     0m0.051s

That would be looping through a 3000+ entry hash (hash2)
647 times, once for each entry in hash1.

I'm using two medium `git' repos as test directories.
I knew in these repos there would be lots of odd matching to figure
out how to process.

-------        ---------       ---=---       ---------      -------- 

#!/usr/local/bin/perl

use strict;
use warnings;

use File::Find;

( my ( $r1, $r2 ) = @ARGV ) == 2
    or die "usage: $0 dir1 dir2\n";

## Haven't been able to follow how to do the recursions with 
## only one hash name as John K did with %data
## Don't think there is much chance of Uri liking the names
## so sticking with those posted... in the hope it will be
## less confusing.

my %r1h;
my %r2h;

## Build %r1h from every plain file (-f) found under $r1.
## Entries are of the form `$File::Find::name => $_'
## (full path => end name).

find sub {
    return unless -f;
    $r1h{ $File::Find::name } = $_ ;
    }, $r1;

## Build %r2h from every plain file (-f) found under $r2.
## Entries are of the form `$File::Find::name => $_'
## (full path => end name).

find sub {
    return unless -f;
    $r2h{ $File::Find::name } = $_; 
    }, $r2;


my $r1hfull;
my $r1hend;
my @matches;

## I wasn't able to follow how to find ALL the matches for an
## endname from %r1h ($r1hend) that exist in hash %r2h

while (($r1hfull,$r1hend) = each(%r1h)) {
    push @matches, $r1hfull;

    ## I see there is rampant duplication of work here... but
    ## haven't yet seen how to prevent it.
    ## If we find an endname in %r1h matching an endname in %r2h,
    ## print the fullname from %r1h, and all fullnames of the matches
    ## from %r2h.

    foreach my $key (keys %r2h) {
        if ($r2h{$key} eq $r1hend) {
#           print "$r2h{$key} MATCHES $r1hend\n";
            push @matches, $key;
        }
    }
    if (@matches > 1) {
        ## $matches[0] is the %r1h fullname, so $#matches is the
        ## number of %r2h matches.
        print "We have <$#matches> matches:\n";
        for (my $ii = 0; $ii <= $#matches; $ii++) {
            if ($ii == 0) {
                print "     $matches[$ii]\n";
            } else {
                print "         $matches[$ii]\n";
            }
        }
        print "-------       -------       ---=---       -------       -------\n";
    }
    @matches = ();
}

-------        ---------       ---=---       ---------      -------- 
Sample output:

[...]

We have <1> matches:
     ./dir1/texi/sasl.texi
         ./dir2/emacs/doc/misc/sasl.texi
-------       -------       ---=---       -------       -------
We have <3> matches:
     ./dir1/etc/images/save.xpm
         ./dir2/emacs/etc/images/low-color/save.xpm
         ./dir2/emacs/etc/images/save.xpm
         ./dir2/emacs/etc/images/mail/save.xpm
-------       -------       ---=---       -------       -------
,----
| Note this kind of stuff below (git files) will be filtered out of
| a more ambitious attempt at processing the top level directories
`----
We have <12> matches:
     ./dir1/.gitignore
         ./dir2/emacs/.gitignore
         ./dir2/emacs/nt/.gitignore
         ./dir2/emacs/lib-src/.gitignore
         ./dir2/emacs/info/.gitignore
         ./dir2/emacs/etc/.gitignore
         ./dir2/emacs/leim/.gitignore
         ./dir2/emacs/lisp/eshell/.gitignore
         ./dir2/emacs/src/.gitignore
         ./dir2/emacs/admin/unidata/.gitignore
         ./dir2/emacs/leim/quail/.gitignore
         ./dir2/emacs/lisp/emacs-lisp/.gitignore
         ./dir2/emacs/lisp/.gitignore
-------       -------       ---=---       -------       -------
[...]
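
On the `exists' question raised earlier: one common way to get every
match from a single lookup is to invert hash2 first, so that each end
name points at a list of full names.  What follows is only a rough
sketch of that idea, not John K's code.  The helper names
index_by_endname and find_matches, and the sample paths, are made up
for illustration; the two input hashes are assumed to have the same
`full name => end name' shape as %r1h and %r2h.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: invert a "full name => end name" hash into
# "end name => [full names]" so one lookup returns every match.
sub index_by_endname {
    my ($paths) = @_;
    my %index;
    while ( my ( $full, $end ) = each %$paths ) {
        push @{ $index{$end} }, $full;
    }
    return \%index;
}

# Hypothetical helper: for each full name in the first hash, collect
# the full names in the second hash that share its end name.
sub find_matches {
    my ( $h1, $h2 ) = @_;
    my $index = index_by_endname($h2);    # built once, not per entry
    my %found;
    for my $full ( keys %$h1 ) {
        my $end = $h1->{$full};
        next unless exists $index->{$end};
        $found{$full} = [ sort @{ $index->{$end} } ];
    }
    return \%found;
}

# Made-up sample data shaped like %r1h and %r2h.
my %r1h = (
    './dir1/etc/images/save.xpm' => 'save.xpm',
    './dir1/README'              => 'README',
);
my %r2h = (
    './dir2/emacs/etc/images/save.xpm'      => 'save.xpm',
    './dir2/emacs/etc/images/mail/save.xpm' => 'save.xpm',
    './dir2/emacs/src/Makefile'             => 'Makefile',
);

my $found = find_matches( \%r1h, \%r2h );
for my $full ( sort keys %$found ) {
    my @hits = @{ $found->{$full} };
    print "We have <", scalar @hits, "> matches:\n";
    print "     $full\n";
    print "         $_\n" for @hits;
}
```

With this shape hash2 is walked once to build the index, and each
entry of hash1 then costs one `exists' test instead of a full pass
over hash2.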



