Hi,
one approach is to sort the files first and work with the sorted files - then
you only need to read each of them once.
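For example (untested sketch), if both files have already been sorted in plain
byte order - say with "sort file1 > file1.sorted" and "sort file2 >
file2.sorted" under LC_ALL=C (the .sorted names are just for illustration) -
you can walk the two sorted files in parallel and print the lines that appear
in only one of them:

#!/usr/bin/perl -w
use strict;

# Assumes both inputs are sorted in byte order and contain
# no duplicate lines within themselves.
open(F1, "file1.sorted") or die "Can't open file1.sorted: $!";
open(F2, "file2.sorted") or die "Can't open file2.sorted: $!";

my $line1 = <F1>;
my $line2 = <F2>;
while (defined $line1 and defined $line2) {
    chomp $line1;
    chomp $line2;
    if    ($line1 lt $line2) { print "$line1\n"; $line1 = <F1>; }  # only in file 1
    elsif ($line1 gt $line2) { print "$line2\n"; $line2 = <F2>; }  # only in file 2
    else                     { $line1 = <F1>; $line2 = <F2>; }     # in both - skip
}
# Whatever is left over in either file is unique to it.
while (defined $line1) { chomp $line1; print "$line1\n"; $line1 = <F1>; }
while (defined $line2) { chomp $line2; print "$line2\n"; $line2 = <F2>; }
close F1;
close F2;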
The second approach is to load the smaller file into memory, creating a hash
with something like
while (<FILE1>) { chomp; $found1{$_}++; }
and then read the other file and compare it against the hash:
while (<FILE2>) {
    chomp;
    if ($found1{$_}) {
        $found_both{$_}++;        # seen in both files
    } else {
        print "$_\n";             # only in the second file
    }
}
foreach (keys %found1) {
    print "$_\n" unless $found_both{$_};   # only in the first file
}
but this will consume a lot of memory.
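Put together it might look like this (untested; I'm assuming the files are
literally named file1 and file2, and since file2 is the smaller one in your
case, that is the one that goes into the hash, so memory use is proportional
to the 100,000-row file rather than the 2-million-row one):

#!/usr/bin/perl -w
use strict;

my (%found2, %found_both);

# Load the smaller file (file2) into a hash.
open(FILE2, "file2") or die "Can't open file2: $!";
while (<FILE2>) { chomp; $found2{$_}++; }
close FILE2;

# Stream the big file; print lines that are not in file2.
open(FILE1, "file1") or die "Can't open file1: $!";
while (<FILE1>) {
    chomp;
    if ($found2{$_}) {
        $found_both{$_}++;
    } else {
        print "$_\n";              # only in file1
    }
}
close FILE1;

# Lines that were only in file2.
foreach (keys %found2) {
    print "$_\n" unless $found_both{$_};
}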
On Wednesday 06 June 2001 17:46, Steve Whittle wrote:
> Hi,
>
> I'm trying to write a script that removes duplicates between two files and
> writes the unique values to a new file. For example, say I have one file,
> file 1, with the following:
>
> red
> green
> blue
> black
> grey
>
> and another file 2:
>
> black
> red
>
> and I want to create a new file that contains:
>
> green
> blue
> grey
>
> I have written a script that takes each entry in file 1 and then reads
> through file 2 to see if it exists there; if not, it writes it to a new
> file. If there is a duplicate, nothing is written to the new file. The real
> file 1 I'm dealing with has more than 2 million rows and the real file 2
> has more than 100,000 rows so I don't think my method is very efficient.
> I've looked through the web and perl references and can't find an easier
> way. Am I missing something? Any ideas?
>
> Thanks,
>
> Steve Whittle
--
Ondrej Par
Internet Securities
Software Engineer
e-mail: [EMAIL PROTECTED]
Phone: +420 2 222 543 45 ext. 112