CustCode+Ship2Code would be unique. Searching just on the name wouldn't help, because a lot of these companies are fly-by-night and are bought and sold routinely, so the names change frequently.
A relational database would be a lot easier; however, it would not expand my understanding of Perl :)

-----Original Message-----
From: Kipp, James [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, February 11, 2003 10:25 AM
To: '[EMAIL PROTECTED]'; 'Perl'
Subject: RE: Finding Duplicates.

> I have to find duplicate customers in our customer file (around 60,000
> customers). The file has been exported into a pipe delimited file.
>
> CustCode|Ship2Code|Name|Addr1|Addr2|City|State|ZipCode|Phone|Fax|Country
>
> The problem is the duplicates can be misspelled, meaning you can't just
> do an exact search. My thinking was a couple of passes: Phone Numbers,
> Addresses, then address digits & City.

So you are looking for duplicates on any field? Wouldn't you just be looking for a dup on the Name? Shouldn't the CustCode field be unique?

Has your company thought about importing these files into a relational database? It would be much faster and easier.
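
For what it's worth, the multi-pass idea from the quoted message (phone numbers, then addresses, then address digits plus city) could be sketched in Perl roughly as follows. This is only a sketch under assumptions: the field order comes from the header line in the quoted message, but the normalization rules, the names parse_line, find_duplicates and %KEY_FOR, and the three sample records are all made up for illustration. It only catches records whose normalized keys match exactly; real misspellings would need fuzzier matching on top.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Field order taken from the header line in the quoted message.
my @FIELDS = qw(CustCode Ship2Code Name Addr1 Addr2 City State ZipCode Phone Fax Country);

# Turn one pipe-delimited line into a hash reference keyed by field name.
sub parse_line {
    my ($line) = @_;
    chomp $line;
    my %rec;
    @rec{@FIELDS} = split /\|/, $line, -1;    # -1 keeps trailing empty fields
    $_ = defined $_ ? $_ : '' for values %rec; # short lines leave undef fields
    return \%rec;
}

# One key builder per pass. Each returns '' when the record has too little
# data to compare. The normalization rules are assumptions, not a standard.
my %KEY_FOR = (
    phone => sub {
        my ($r) = @_;
        (my $digits = $r->{Phone}) =~ s/\D//g;   # keep digits only
        return length($digits) >= 7 ? $digits : '';
    },
    address => sub {
        my ($r) = @_;
        my $addr = lc "$r->{Addr1} $r->{City}";
        $addr =~ s/[^a-z0-9]//g;                 # drop spaces and punctuation
        return $addr;
    },
    digits_city => sub {
        my ($r) = @_;
        (my $digits = $r->{Addr1}) =~ s/\D//g;
        (my $city = lc $r->{City}) =~ s/[^a-z]//g;
        return (length($digits) && length($city)) ? "$digits|$city" : '';
    },
);

# Group records by key and keep only the keys shared by two or more records.
# Each record is reported by its CustCode+Ship2Code identifier.
sub find_duplicates {
    my ($records, $key_sub) = @_;
    my %group;
    for my $r (@$records) {
        my $key = $key_sub->($r);
        next if $key eq '';
        push @{ $group{$key} }, "$r->{CustCode}+$r->{Ship2Code}";
    }
    my %dups;
    for my $key (keys %group) {
        $dups{$key} = $group{$key} if @{ $group{$key} } > 1;
    }
    return \%dups;
}

# Made-up sample records, purely for illustration.
my @sample = (
    'C001|S1|Acme Supply|12 Main St.||Springfield|IL|62701|(217) 555-0100||US',
    'C002|S1|ACME Supplies Inc|12 Main Street||Springfield|IL|62701|217-555-0100||US',
    'C003|S1|Bolt Bros|99 Oak Ave||Dayton|OH|45402|937-555-0199||US',
);
my @records = map { parse_line($_) } @sample;

for my $pass (qw(phone address digits_city)) {
    my $dups = find_duplicates(\@records, $KEY_FOR{$pass});
    for my $key (sort keys %$dups) {
        print "$pass pass: $key => @{ $dups->{$key} }\n";
    }
}
```

With the sample data, the phone pass and the digits-plus-city pass both pair C001+S1 with C002+S1, while the plain address pass misses them because "St." and "Street" normalize differently. That is the case the third pass is meant to catch. For the real file, the @sample list would be replaced by lines read from the export.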