CustCode+Ship2Code would be unique.

Searching just on the name wouldn't help, because a lot of these
companies are fly-by-night and are bought and sold routinely, so the
names change frequently.

A relational database would be a lot easier; however, it would not expand
my understanding of Perl :)
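
For what it's worth, the first pass (phone numbers) from the quoted
message below could be sketched in Perl roughly like this. It is only a
sketch under assumptions: the field order is taken from the quoted header
line, and the sub names (parse_record, phone_key, phone_duplicates) are
made up for illustration.

```perl
use strict;
use warnings;

# Field order assumed from the header line quoted in this thread.
my @FIELDS = qw(CustCode Ship2Code Name Addr1 Addr2 City
                State ZipCode Phone Fax Country);

# Split one pipe-delimited line into a hash ref keyed by field name.
sub parse_record {
    my ($line) = @_;
    chomp $line;
    my %rec;
    # The -1 limit keeps trailing empty fields instead of dropping them.
    @rec{@FIELDS} = split /\|/, $line, -1;
    return \%rec;
}

# Reduce a phone number to its digits so that differently punctuated
# versions of the same number compare equal.
sub phone_key {
    my ($phone) = @_;
    return '' unless defined $phone;
    (my $digits = $phone) =~ s/\D//g;
    return $digits;
}

# Group records by normalized phone number and return a hash ref holding
# only the numbers shared by more than one CustCode+Ship2Code.
sub phone_duplicates {
    my (@lines) = @_;
    my %by_phone;
    for my $line (@lines) {
        my $rec = parse_record($line);
        my $key = phone_key($rec->{Phone});
        next if $key eq '';    # a blank phone is not evidence of a duplicate
        push @{ $by_phone{$key} }, "$rec->{CustCode}+$rec->{Ship2Code}";
    }
    my %dups;
    for my $key (keys %by_phone) {
        $dups{$key} = $by_phone{$key} if @{ $by_phone{$key} } > 1;
    }
    return \%dups;
}
```

Calling phone_duplicates on the lines of the export returns a hash ref
keyed by digits-only phone number, each value listing the
CustCode+Ship2Code pairs that share it. The later passes (addresses, then
address digits & city) would need their own key subs, and misspellings
would still need some fuzzier comparison than this exact match on digits.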

-----Original Message-----
From: Kipp, James [mailto:[EMAIL PROTECTED]] 
Sent: Tuesday, February 11, 2003 10:25 AM
To: '[EMAIL PROTECTED]'; 'Perl'
Subject: RE: Finding Duplicates.


> I have to find duplicate customers in our customer file (around 60,000
> customers). The file has been exported into a pipe-delimited file.
> 
> CustCode|Ship2Code|Name|Addr1|Addr2|City|State|ZipCode|Phone|Fax|Country
 
> The problem is the duplicates can be misspelled, meaning you can't just
> do an exact search. My thinking was a couple of passes: phone numbers,
> addresses, then address digits & city.

So you are looking for duplicates on any field? Wouldn't you just be
looking for a dup on the Name? Shouldn't the CustCode field be unique?
Has your company thought about importing these files into a relational
database? It would be much faster and easier.

 

