Hello

I have many documents that may be duplicates of one another (the result of
merging several external systems). Any tips on how to find the potential
duplicates, ideally with an adjustable similarity threshold (e.g. a match
score > 0.5)?

I can use the similarity (MoreLikeThis) method from the Sandbox, but that always
compares one document against the index. Is there a way to return all the
potential duplicate documents in the index without iterating over every document
and comparing it with all the other documents in the index?
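
To make concrete the brute-force approach I would like to avoid, here is a
minimal sketch. It is plain Python, not Lucene code: the Jaccard overlap of
word sets is only an assumed stand-in for the MoreLikeThis score, and the
names (tokens, jaccard, find_duplicate_pairs) are hypothetical.

```python
from itertools import combinations


def tokens(text):
    """Split a document into a set of lowercase words (a crude stand-in for an analyzer)."""
    return set(text.lower().split())


def jaccard(a, b):
    """Jaccard overlap of two token sets; assumed here in place of a real MoreLikeThis score."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def find_duplicate_pairs(docs, threshold=0.5):
    """Compare every document with every other one and keep pairs scoring above the threshold.

    docs maps a document id to its text. The loop visits n*(n-1)/2 pairs,
    which is exactly the cost that becomes a problem on a large index.
    """
    token_sets = {doc_id: tokens(text) for doc_id, text in docs.items()}
    pairs = []
    for (id_a, set_a), (id_b, set_b) in combinations(token_sets.items(), 2):
        score = jaccard(set_a, set_b)
        if score > threshold:
            pairs.append((id_a, id_b, score))
    return pairs
```

With n documents this performs n*(n-1)/2 comparisons, so it grows
quadratically with the size of the index.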

Thanks
Marco

