Re: script takes long time to run when comparing digits within strings using foreach

Shlomi Fish Fri, 27 May 2011 02:39:28 -0700

Hi eventual,

On Friday 27 May 2011 11:18:01 eventual wrote:
> Hi,
> I have an array , @datas, and each element within @datas is a string that's
> made up of 6 digits with spaces in between like this “1 2 3 4 5 6”, so the
> array look like this @datas = ('1 2 3 4 5 6', '1 2 9 10 11 12', '1 2 3 4 5
> 8', '1 2 3 4 5 9' , '6 7 8 9 10 11'); Now I wish to compare each element
> of @datas with the rest of the elements in @datas in such a way that if 5
> of the digits match, to take note of the matching indices,  and so the
> script I wrote is appended below. However, the script below takes a long
> time to run if the datas at @datas are huge( eg 30,000 elements). I then
> wonder is there a way to rewrite the script so that the script can run
> faster. Thanks
>  
> ###### script below #######################
>  
> #!/usr/bin/perl
> use strict;
>  
> my @matched_location = ();
> my @datas = ('1 2 3 4 5 6', '1 2 9 10 11 12', '1 2 3 4 5 8', '1 2 3 4 5 9'
> , '6 7 8 9 10 11'); 
> my $iteration_counter = -1;
> foreach (@datas){
>    $iteration_counter++;
>    my $reference = $_;
>  
>    my $second_iteration_counter = -1;
>    my $string = '';
>    foreach (@datas){
>       $second_iteration_counter++;
>       my @individual_digits = split / /,$_;
>  
>       my $ctr = 0;
>       foreach(@individual_digits){
>           if($reference =~/^$_ | $_ | $_$/){
>               $ctr++;
>           }
>       }
>       if ($ctr >= 5){
>           $string = $string . "$second_iteration_counter ";
>       }
>    }
>    $matched_location[$iteration_counter] = $string;
> }
>  
> my $ctr = -1;
> foreach(@matched_location){
>     $ctr++;
>     print "Index $ctr of \@matched_location = $_\n";
> }
>


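First of all, you should add "use warnings;" to your code. Then you should get 
rid of the implicit $_ as the loop iterator, because it's easy to break: any 
code inside the loop that assigns to $_ will clobber it. Use a named lexical 
variable ("foreach my $item (@datas)") instead. For more information see: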

http://perl-begin.org/tutorials/bad-elements/
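
As a rough sketch of what that could look like, here is the counting part of 
your script with "use warnings;" and named loop variables. The algorithm is 
unchanged, so it is no faster. The variable names and the shortened @datas are 
only for illustration, and I wrote the regex with \Q...\E and a grouped 
alternation, which should match the same way as yours:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Shortened sample data, for illustration only.
my @datas = ('1 2 3 4 5 6', '1 2 9 10 11 12', '1 2 3 4 5 8');

my @lines;
# Iterate over indices directly instead of keeping manual counters.
foreach my $ref_idx (0 .. $#datas) {
    my $reference = $datas[$ref_idx];

    foreach my $other_idx (0 .. $#datas) {
        my @numbers = split ' ', $datas[$other_idx];

        my $count = 0;
        foreach my $number (@numbers) {
            # Match $number only as a whole space-delimited field.
            if ($reference =~ /(?:^| )\Q$number\E(?: |$)/) {
                $count++;
            }
        }
        push @lines, "$ref_idx vs $other_idx: $count in common";
    }
}

print "$_\n" foreach @lines;
```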

Other than that - you should use a better algorithm. One option would be to 
sort the integers and then use a diff/merge-like algorithm:

http://en.wikipedia.org/wiki/Merge_algorithm
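
A minimal sketch of the merge-like comparison, assuming both lists are already 
sorted numerically and contain no repeated numbers. The sub name 
count_common_sorted is my own invention:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Count the elements shared by two numerically sorted array refs by
# walking both lists in step, as in the merge step of a merge sort.
sub count_common_sorted {
    my ($first, $second) = @_;

    my ($i, $j, $common) = (0, 0, 0);
    while ($i < @$first && $j < @$second) {
        my $cmp = $first->[$i] <=> $second->[$j];
        if ($cmp < 0) {
            $i++;           # element only in the first list
        }
        elsif ($cmp > 0) {
            $j++;           # element only in the second list
        }
        else {
            $common++;      # element in both lists
            $i++;
            $j++;
        }
    }
    return $common;
}

my @first  = sort { $a <=> $b } split ' ', '1 2 3 4 5 6';
my @second = sort { $a <=> $b } split ' ', '1 2 3 4 5 8';
print count_common_sorted(\@first, \@second), "\n";    # prints 5
```

Each string only needs to be split and sorted once, before the comparisons 
start, rather than once per pair.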

A different way would be to use a hash to count the number of times each 
number occurred in the two sets, and then see how many of them got a value of 2 
(indicating they are in both sets).
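
Here is a sketch of the hash-counting idea applied to your sample data. It 
assumes that no number is repeated within a single string (otherwise a count 
of 2 could come from one set alone), and the names count_common and @sets are 
just ones I made up:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Count how many numbers two array refs have in common, assuming
# no number is repeated within a single set.
sub count_common {
    my ($first, $second) = @_;

    my %seen;
    foreach my $number (@$first, @$second) {
        $seen{$number}++;
    }

    # A count of 2 means the number appeared in both sets.
    return scalar grep { $_ == 2 } values %seen;
}

my @datas = ('1 2 3 4 5 6', '1 2 9 10 11 12', '1 2 3 4 5 8',
    '1 2 3 4 5 9', '6 7 8 9 10 11');

# Split every string once, up front, instead of once per comparison.
my @sets = map { [ split ' ', $_ ] } @datas;

my @matched_location;
foreach my $ref_idx (0 .. $#sets) {
    my @matches;
    foreach my $other_idx (0 .. $#sets) {
        if (count_common($sets[$ref_idx], $sets[$other_idx]) >= 5) {
            push @matches, $other_idx;
        }
    }
    $matched_location[$ref_idx] = join ' ', @matches;
}

foreach my $idx (0 .. $#matched_location) {
    print "Index $idx of \@matched_location = $matched_location[$idx]\n";
}
```

Like your script, this counts each element as matching itself. It still 
compares every pair, but each comparison is a handful of hash operations with 
no regex matching or repeated splitting.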

But at the moment, the script is very inefficient: for every pair of elements 
it splits the string again and runs a regex match for each number.

Regards,

        Shlomi Fish


-- 
-----------------------------------------------------------------
Shlomi Fish       http://www.shlomifish.org/
"Star Trek: We, the Living Dead" - http://shlom.in/st-wtld

I often wonder why I hang out with so many people who are so pedantic. And
then I remember - because they are so pedantic.
-- Israeli Perl Monger

Please reply to list if it's a mailing list post - http://shlom.in/reply .

