Re: Comparing two files of 8million lines/rows ...

joseph Wed, 16 Aug 2006 20:31:21 -0700

<[EMAIL PROTECTED]> wrote in message 
news:[EMAIL PROTECTED]
> Hi all,


Hello,

>
> I have two database tables, one is local and one is on a WAN. They are 
> supposed
> to be in-sync but they at the moment, they are not. There are 8million+ 
> plus
> rows on this table.
>
> I tried to do SELECT EMPNO FROM EMP WHERE EMPNO NOT IN ( SELECT EMPNO FROM
> [EMAIL PROTECTED] ), leave that running for hours and all thru the night and 
> guess
> what, am still waiting for the result to come out ...
>
> So what I decided is I want to extract the records into a flat file and 
> then
> write a Perl script to skim thru each line and check whether it exists on 
> the
> other file. While there will be 8million+ lines, the file will not be big
> beacuse am only extracting one-column from the table.
>
> Does anyone have an existing Perl code that does a similar thing like this
> already? It will be much appreciated if you can send it to me and then I 
> will
> just modify it accordingly.
>
> Example logic that I have is this:
>
> FILE1:
> MICKEY
> MINNIE
> DONALD
> GOOFY
> PLUTO
>
> FILE2:
> MICKEY
> MINNIE
> DONALD
> GOOFY
> PLUTO
> BUGS-BUNNY
>

This will work, based on you given data

use strict;
use warnings;

my @file2 = qw/mickey minnie donald goofy pluto /;
my @file1 = qw /mickey minnie donald goofy pluto bunny/;
my %hash = map { $_ => undef } @file2;

while(<@file1>) {
      unless(exists $hash{$_}) {
        print $_, "\n";
        }
      }

output:
bunny

caveats:
   This will only print out the element that were present on file1 and were 
not on file2,
and i'm also a beginner.

> So search FILE1 for all line entries of FILE2 then output whoever does not 
> exist
> into FILE3. So after running the script, I should have ...
>
> FILE3:
> BUGS-BUNNY
>
> What I currently have is that I read all of FILE2's entries into an array? 
> Read
> FILE1 one line at a time using m/// and if there is no m///, print that to
> FILE3.
>
> It seems to work fine for 1000 lines of entries, but am not particularly 
> sure
> how long will that take for 8million+ rows, not even sure if I can create 
> an
> array to contain 8million+ plus rows, if I can't, then am back to doing 
> this on
> the database instead. Another option am looking at is just to read FILE1 
> one
> line at a time and do a grep "$string_to_search" FILE2 but I do not know 
> how to
> do a grep-like syntax against a file on Perl especially if the search 
> string is
> a variable.
>
> Why I prefer using a script is so am not putting loads into the database 
> not to
> mention that I can put more logic into the script than on the SQL 
> statement.
>
> Any advise or other options will be very much appreciated .... Thanks in
> advance.
>
>
welcome, HTH.

/joseph 



-- 
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
<http://learn.perl.org/> <http://learn.perl.org/first-response>

Re: Comparing two files of 8million lines/rows ...

Reply via email to