Hello, I've got many documents that are potentially duplicates (I'm merging several external systems). Any tips on how to find documents that are potentially duplicates, using a variable ranking threshold (e.g. a match score > 0.5)?
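To make the "> 0.5 match" idea concrete, here is a minimal sketch of the kind of score I have in mind. It is plain Java, not Lucene code: it assumes Jaccard similarity over word sets as the measure, and the class name, method names, and sample strings are all hypothetical.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration only: scores two texts between 0.0 and 1.0.
public class DuplicateScore {

    // Split text into a set of lowercase word tokens
    // (a crude stand-in for a real analyzer).
    static Set<String> tokens(String text) {
        Set<String> result = new HashSet<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!t.isEmpty()) {
                result.add(t);
            }
        }
        return result;
    }

    // Jaccard similarity: size of the intersection divided by size of the union.
    static double jaccard(String a, String b) {
        Set<String> ta = tokens(a);
        Set<String> tb = tokens(b);
        if (ta.isEmpty() && tb.isEmpty()) {
            return 1.0; // two empty texts are treated as identical
        }
        Set<String> intersection = new HashSet<>(ta);
        intersection.retainAll(tb);
        Set<String> union = new HashSet<>(ta);
        union.addAll(tb);
        return (double) intersection.size() / union.size();
    }

    // A pair counts as a potential duplicate when its score exceeds the threshold.
    static boolean isPotentialDuplicate(String a, String b, double threshold) {
        return jaccard(a, b) > threshold;
    }

    public static void main(String[] args) {
        // Made-up sample texts, just to exercise the methods.
        String a = "Acme Corp annual report 2005";
        String b = "ACME Corp. annual report 2005 (final)";
        System.out.println(jaccard(a, b));
        System.out.println(isPotentialDuplicate(a, b, 0.5));
    }
}
```

The threshold is a parameter so it can be tuned per source system. Scoring every pair this way is quadratic in the number of documents, which is exactly what I would like to avoid.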
I can use the similarity (MoreLikeThis) method from the Sandbox, but that always compares one document against the index. Is there a way to get back all the potential duplicate documents in the index without iterating over every document and comparing it with all the others?
Thanks,
Marco