
> At this point it would be interesting to see how this Processor would
> increase the indexing performance when you have many duplicates

- When it comes to indexing performance with duplicates, there isn't any 
difference from indexing a new document. The original document is marked as 
deleted and the new one replaces it. An update isn't a real operation: the 
delete-by-mark step is essentially free speed-wise, the re-add is as fast as 
any other indexing, and Solr will merge the segments as needed when it decides 
to do so. Your best bet is to manage this in your indexing code. Have an 
updated/created time field and, on each scheduled indexing run, only process 
the documents whose timestamps fall within that window. Against a database 
this takes maybe five minutes to write into your indexer, and I can promise it 
will be faster than trying to use a built-in Solr operation to figure it out 
for you. 
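
For illustration, here is a minimal sketch of that delta-indexing approach 
using SolrJ and JDBC. The table name (documents), the field names (id, title, 
body, updated_at), the connection URLs, and the batch size are all assumptions 
made up for the example, not anything from this thread; adapt them to your own 
schema. The idea is simply: select only the rows changed since the last run, 
re-add them, and let Solr's normal delete-and-replace handle the ones that 
already exist.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Timestamp;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrInputDocument;

public class DeltaIndexer {
    public static void main(String[] args) throws Exception {
        // Watermark from the previous run; persist this between runs
        // (file, DB row, etc.) instead of hard-coding it.
        Timestamp lastRun = Timestamp.from(Instant.parse("2022-08-04T00:00:00Z"));
        Timestamp newWatermark = lastRun;

        try (Connection db = DriverManager.getConnection(
                     "jdbc:postgresql://localhost/mydb", "user", "password");
             SolrClient solr = new HttpSolrClient.Builder(
                     "http://localhost:8983/solr/mycore").build();
             PreparedStatement ps = db.prepareStatement(
                     "SELECT id, title, body, updated_at FROM documents WHERE updated_at > ?")) {

            ps.setTimestamp(1, lastRun);
            List<SolrInputDocument> batch = new ArrayList<>();

            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    SolrInputDocument doc = new SolrInputDocument();
                    doc.addField("id", rs.getString("id"));   // uniqueKey: re-adding replaces the old copy
                    doc.addField("title", rs.getString("title"));
                    doc.addField("body", rs.getString("body"));
                    Timestamp ts = rs.getTimestamp("updated_at");
                    doc.addField("updated_at", ts);
                    batch.add(doc);

                    if (ts.after(newWatermark)) {
                        newWatermark = ts;
                    }
                    if (batch.size() >= 1000) {               // send in chunks to bound memory
                        solr.add(batch);
                        batch.clear();
                    }
                }
            }
            if (!batch.isEmpty()) {
                solr.add(batch);
            }
            solr.commit();
        }

        // Store newWatermark somewhere durable for the next scheduled run (omitted here);
        // that watermark is the only state this approach needs.
        System.out.println("New watermark: " + newWatermark);
    }
}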

If I’m wrong I would love to know, but indexing-code logic will always be 
faster than relying on a built-in server function for these sorts of things.  





> On Aug 4, 2022, at 6:41 PM, Vincenzo D'Amore <v.dam...@gmail.com> wrote:
> 
> 
> At this point it would be interesting to see how this Processor would
> increase the indexing performance when you have many duplicates
