On Mon, 10 Feb 2003 17:53:10 -0800, Madhu Reddy wrote:
> I want to find duplicate records in a large file...
> it contains around 22 million records...
Like Wiggins, I think a database is the right way to solve this. On the other hand, 22 million records isn't that big on modern computers. The only problem is that, with Perl's per-variable overhead, a hash holding all the unique records would become too big to handle.

One way to solve it is to use the CPAN module DB_File (in fact, it's just a database in disguise - Berkeley DB), which keeps the hash on disk instead of in memory.

Another way is to use a module with a less memory-wasting hash structure, e.g.

    Tie::GHash
    Tie::SubstrHash

The remaining part should be much the same as described in perldoc -q duplicate.

Best Wishes,
Janek
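P.S. In case it helps, here is a minimal sketch of the DB_File approach. The file names (records.txt, seen.db) and the idea that the whole line is the record key are my assumptions; adjust them to the real data.

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Fcntl;
    use DB_File;

    # Tie %seen to an on-disk Berkeley DB hash, so the millions of
    # keys live on disk instead of in Perl's in-memory hash.
    my %seen;
    tie %seen, 'DB_File', 'seen.db', O_RDWR|O_CREAT, 0644, $DB_HASH
        or die "Cannot tie seen.db: $!";

    open my $in, '<', 'records.txt' or die "Cannot open records.txt: $!";
    while (my $line = <$in>) {
        chomp $line;
        # Assumption: the whole line is the key. If only one field
        # identifies a record, extract that field and use it instead.
        if (exists $seen{$line}) {
            print "duplicate: $line\n";
        }
        else {
            $seen{$line} = 1;
        }
    }
    close $in;

    untie %seen;
    unlink 'seen.db';    # remove the scratch database when finished

With Tie::SubstrHash the loop would look the same; only the tie line changes (and you have to fix the key and value sizes up front).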