Re: Using Lucene to find duplicate/similar names

2008-04-16 Thread eks dev
Lucene helps a lot there, this is a nice inverted index lib :) - Original Message > From: Andy DePue <[EMAIL PROTECTED]> > To: java-user@lucene.apache.org > Sent: Wednesday, 16 April, 2008 7:10:42 PM > Subject: Re: Using Lucene to find duplicate/similar names

Re: Using Lucene to find duplicate/similar names

2008-04-16 Thread Andy DePue
Thanks for the pointer. I found the thread, and there is certainly some interesting information there. I'd like to stick to what Lucene has available today, mainly because I lack the time to implement anything more than that. I originally thought Levenshtein, but then realized that Lucene wo

Re: Using Lucene to find duplicate/similar names

2008-04-16 Thread Grant Ingersoll
I believe there were some posts on this about a year ago. Try searching the archives for duplicate names, as well as "record linkage" or any other synonyms you can think of. The short answer is that Lucene is a reasonable tool to attempt this with, but you may need some help. The lon

Using Lucene to find duplicate/similar names

2008-04-16 Thread Andy DePue
I'm new to Lucene, and would like to use it to find duplicate (or similar) names in a contact list. Is Lucene a good fit? We have a form where a user enters a company or person's name, and we want the system to warn them if there is already a company or person entered with the same or similar n
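
As a rough illustration of the similar-name check described in the question, here is a minimal sketch in plain Java that uses Levenshtein edit distance, which a later reply mentions as an initial idea. It does not use Lucene and is not the approach the thread settles on. The class name `NameSimilarity`, the method names, the sample contact names, and the `maxDistance` threshold are all hypothetical, chosen only for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper for illustration only; not part of Lucene.
public class NameSimilarity {

    // Levenshtein distance via dynamic programming, keeping only two rows.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // distance to the empty prefix of b
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insertOrDelete = Math.min(curr[j - 1] + 1, prev[j] + 1);
                curr[j] = Math.min(insertOrDelete, prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Returns existing names within maxDistance edits of the entered name,
    // comparing trimmed, lower-cased forms. This is a linear scan.
    static List<String> similarNames(String entered, List<String> existing, int maxDistance) {
        String needle = entered.trim().toLowerCase(Locale.ROOT);
        List<String> hits = new ArrayList<>();
        for (String name : existing) {
            String candidate = name.trim().toLowerCase(Locale.ROOT);
            if (levenshtein(needle, candidate) <= maxDistance) {
                hits.add(name);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Made-up contact names, assumed for the example.
        List<String> contacts = Arrays.asList("Acme Corporation", "Acme Corp", "Apex Consulting");
        System.out.println(similarNames("Acme Corporaton", contacts, 2));
    }
}
```

Because this compares the entered name against every stored name, it is only practical for small contact lists; on larger lists an index would be needed to narrow the candidates first.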