I have a problem that I thought would be perfect for Perl, except that I
seem to be using all my system resources to run it.  Of course this
probably means I'm doing it the wrong way...

 

The problem:

We have a backup server that is missing records from the production
server for a particular table.  We know that it should have sequential
records and that it is missing some records.  We want to get a sense of
the number of records missing.  So, we know the problem started around
the beginning of March at id 70,000,000 (rounded for convenience).
Currently we are at 79,000,000.  So, I dumped to a file all the ids
between 70,000,000 and 79,000,000 (commas inserted here for
readability).  I need to figure out what numbers are missing.  The way
that seemed easiest to me was to create two arrays.  One with every
number between 70 and 79 million, the other with every number in our
dump file.  Then compare them as illustrated in the Perl Cookbook using
a hash.

The simple script I came up with works fine with a test file of just 10
records.

But, when I try to scale that to 9 million records, it doesn't work.
This is probably because it is trying to do something like what db
people call a cartesian join (every record against every record).
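For what it's worth, a hash lookup makes a single pass over each list, so the strain is more plausibly memory: two 9-million-element arrays plus a hash of the same size. One lower-memory possibility is sketched below. It assumes the dump has one id per line and is already sorted in ascending numeric order (for example, run through sort -n first). The name count_missing_sorted and the in-memory sample are made up for illustration.

```perl
use strict;
use warnings;

# Count the ids missing from [$first, $last] by streaming a sorted dump.
# Nothing is stored: only the next expected id and a running total.
# ASSUMPTION: one id per line, ascending numeric order.
sub count_missing_sorted {
    my ($fh, $first, $last) = @_;
    my $expected = $first;
    my $missing  = 0;
    while (my $id = <$fh>) {
        chomp $id;
        next unless $id =~ /^\d+$/;    # skip blank or malformed lines
        next if $id < $expected;       # duplicate, or below the range
        last if $id > $last;           # past the end of the range
        $missing += $id - $expected;   # size of the gap before this id
        $expected = $id + 1;
    }
    # Ids missing between the last id seen and the end of the range.
    $missing += $last - $expected + 1 if $expected <= $last;
    return $missing;
}

# Demo on the same 10-record sample used in the script (6, 7, 8 absent).
my $sample = join("\n", 1 .. 5, 9, 10) . "\n";
open(my $in, '<', \$sample) or die "Couldn't open in-memory sample: $!\n";
print count_missing_sorted($in, 1, 10), "\n";    # prints 3
close($in);
```

For the real data, the same sub would be handed a filehandle on the dump file with 70_000_000 and 79_000_000 as the bounds.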

So, does anybody have a suggestion for a better way to do it in Perl?

 

I'll probably end up doing it in SQL if I can't come up with a Perl
solution.  (Create a second table like the first array with every number
between 70 and 79 million, and join the two tables.)  

 

Larry

[EMAIL PROTECTED]

 

script:

 

use strict;
use warnings;

my %seen;
my @missing;

# Every id we expect to find (sample range of 10).
my @ids = (1 .. 10);

# One id per line; create a file like the sample below.
open(my $fh, '<', 'ms_ids_test.txt') or die "Couldn't open data file: $!\n";
my @list;
while (<$fh>) {
    chomp;
    push(@list, $_);
}
close($fh);

# Mark every id present in the dump (hash slice, as in the Perl Cookbook).
@seen{@list} = ();

foreach my $item (@ids) {
    push(@missing, $item) unless exists $seen{$item};
}

print scalar(@missing), "\n";

  

 

# sample file (without the pounds)
#1
#2
#3
#4
#5
#9
#10
# note missing 6,7,8
# result is 3
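If the dump cannot be assumed sorted, another possibility is a bit vector built with Perl's vec, using one bit per id in the range instead of a hash entry. For 9 million ids that is roughly 1.1 MB. This is only a sketch: the name count_missing_bitvec and the in-memory sample are made up for illustration, and it assumes one id per line.

```perl
use strict;
use warnings;

# Count the ids missing from [$first, $last] using one bit per id.
# The input does not need to be sorted, and duplicates are harmless.
# ASSUMPTION: one id per line.
sub count_missing_bitvec {
    my ($fh, $first, $last) = @_;
    my $bits = '';
    vec($bits, $last - $first, 1) = 0;    # pre-size the vector, all bits 0
    while (my $id = <$fh>) {
        chomp $id;
        next unless $id =~ /^\d+$/;                  # skip malformed lines
        next if $id < $first || $id > $last;         # ignore out-of-range ids
        vec($bits, $id - $first, 1) = 1;             # mark this id as present
    }
    my $total   = $last - $first + 1;
    my $present = unpack('%32b*', $bits);            # number of set bits
    return $total - $present;
}

# Demo on the 10-record sample, deliberately out of order.
my $sample = join("\n", 10, 3, 1, 9, 2, 5, 4) . "\n";
open(my $in, '<', \$sample) or die "Couldn't open in-memory sample: $!\n";
print count_missing_bitvec($in, 1, 10), "\n";    # prints 3
close($in);
```

To list the missing ids rather than count them, one could loop over the range afterwards and report each offset where vec returns 0.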

 
