subject:"RE\: Finding 'probable' duplicate records"

Re: Finding 'probable' duplicate records

2001-12-07 Thread iain truskett

* Carl Rogers ([EMAIL PROTECTED]) [08 Dec 2001 01:54]:
[...]
> I was hoping that I could find a way to say 'Compare two strings (the
> fields within the strings aren't important). If string B has 17 common
> characters out of 20 in string A, you might want to consider that a
> match'.

See the St
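The "17 common characters out of 20" idea quoted above amounts to a similarity threshold of roughly 0.85. A minimal sketch of that kind of check follows, written in Python rather than Perl and using the standard library's difflib. The function names and the 0.85 default are assumptions for illustration, and SequenceMatcher's ratio is only one reasonable stand-in for "common characters", not necessarily the measure Carl had in mind.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a score from 0.0 to 1.0; 1.0 means the strings are identical.

    SequenceMatcher.ratio() is 2 * M / T, where M is the number of
    matching characters and T is the combined length of both strings.
    """
    # Case is ignored here; that is an assumption, not part of the thread.
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def probable_duplicate(a: str, b: str, threshold: float = 0.85) -> bool:
    """Flag a pair as a probable duplicate when the score meets the threshold.

    The 0.85 default mirrors the "17 out of 20" figure (17 / 20 = 0.85).
    """
    return similarity(a, b) >= threshold
```

Comparing every record against every other is quadratic in the number of records, so for large files it may help to compare only records that already share something cheap to compute, such as the first letter of the last name.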

Re: Finding 'probable' duplicate records

2001-12-07 Thread Frank

Perhaps use Digest::MD5, or check out the Guttman Rosler transform:
http://raleigh.pm.org/sorting.html

--
Frank Booth - Consultant
Parasol Solutions Limited.
(www.parasolsolutions.com)
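For what the digest suggestion might look like in practice, here is a rough sketch in Python, using hashlib.md5 as the analogue of Perl's Digest::MD5. The normalize and group_by_digest names are made up for illustration. Note that a digest only groups records that are identical after normalization; it will not catch near matches such as a misspelled name.

```python
import hashlib
from collections import defaultdict


def normalize(record: str) -> str:
    """Lower-case the record and collapse runs of whitespace.

    Which differences to ignore is an assumption; adjust to the data.
    """
    return " ".join(record.lower().split())


def group_by_digest(records):
    """Return lists of records whose normalized text has the same MD5 digest."""
    groups = defaultdict(list)
    for record in records:
        digest = hashlib.md5(normalize(record).encode("utf-8")).hexdigest()
        groups[digest].append(record)
    # Only groups with more than one member are duplicates.
    return [group for group in groups.values() if len(group) > 1]
```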

RE: Finding 'probable' duplicate records

2001-12-07 Thread Carl Rogers

Hey John;

Thanks for the help

At 08:44 AM 12/7/2001 -0500, [EMAIL PROTECTED] wrote:
>Carl,
>
> I don't have a lot of Perl-specific advice, but if it's possible to
>dependably parse each line into the component fields (last name, first name,
>street address, etc.), you could apply some intel

RE: Finding 'probable' duplicate records

2001-12-07 Thread John . Brooking

Carl, I don't have a lot of Perl-specific advice, but if it's possible to dependably parse each line into the component fields (last name, first name, street address, etc.), you could apply some intelligent guesses using the various fields. If this is possible, here's what worked pretty well fo
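The rules John goes on to describe are cut off in this listing, so the following is not his method. It is only a sketch, in Python, of what comparing parsed fields could look like. The comma-separated layout, the field names, the first-initial rule, and the 0.8 street threshold are all assumptions for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical record layout: "last name, first name, street address".
FIELDS = ("last", "first", "street")


def parse(line: str) -> dict:
    """Split a comma-separated line into lower-cased, trimmed fields."""
    parts = [part.strip().lower() for part in line.split(",")]
    return dict(zip(FIELDS, parts))


def fields_match(a: str, b: str, street_threshold: float = 0.8) -> bool:
    """Guess whether two lines describe the same person.

    Assumed rules: last names must be equal, first names must share an
    initial, and street addresses must be similar enough.
    """
    ra, rb = parse(a), parse(b)
    same_last = ra.get("last") == rb.get("last")
    same_initial = ra.get("first", "")[:1] == rb.get("first", "")[:1]
    street_ratio = SequenceMatcher(
        None, ra.get("street", ""), rb.get("street", "")
    ).ratio()
    return same_last and same_initial and street_ratio >= street_threshold
```

Treating each field separately lets a strict rule apply where typos are unlikely to be tolerated (the last name) and a looser one where abbreviations are common (the street address).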