Re: Dedup across shards

2024-09-24 Thread Markus Jelsma
Hello, we added deduplication to Solr's QueryComponent by overriding it in a few places. We use minhashes for fuzzy deduplication, but it will work with any kind of signature field. We did this:
* override createMainQuery() so we can add a parameter that controls which field is the signature field,

Dedup across shards

2024-09-24 Thread Dan Rosher
Hello everyone, we have 3 shards, with SKUs linked to merchants. We don't currently, but we could, co-locate the SKUs for a specific merchant on the same shard with document routing, and then dedup similar SKUs for the same merchant. But similar SKUs that should be deduped can appear for different merchant
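
To illustrate the problem discussed in this thread, here is a minimal sketch in Python (not Solr code) of deduplicating at merge time: each shard returns its own top documents, and the merge step keeps only the highest-scoring document per signature value. The function name `merge_and_dedup`, the field names `id`, `score` and `signature`, and the `rows` parameter are all assumptions made for the example. It matches signatures exactly, so a fuzzy minhash comparison would need a similarity check in place of the dictionary lookup.

```python
from typing import Dict, Hashable, Iterable, List, Tuple


def merge_and_dedup(
    shard_results: Iterable[List[dict]],
    signature_field: str = "signature",  # hypothetical field name
    rows: int = 10,
) -> List[dict]:
    """Merge per-shard result lists, keeping the best-scoring doc per signature."""
    best: Dict[Tuple[str, Hashable], dict] = {}
    for docs in shard_results:
        for doc in docs:
            sig = doc.get(signature_field)
            # Documents without a signature are never collapsed: key them by id.
            key = ("sig", sig) if sig is not None else ("id", doc["id"])
            current = best.get(key)
            if current is None or doc["score"] > current["score"]:
                best[key] = doc
    # Re-sort the survivors by score, as a merge step would, and cut to rows.
    merged = sorted(best.values(), key=lambda d: d["score"], reverse=True)
    return merged[:rows]


if __name__ == "__main__":
    shard_1 = [
        {"id": "a", "score": 3.0, "signature": "s1"},
        {"id": "b", "score": 1.0, "signature": "s2"},
    ]
    shard_2 = [
        {"id": "c", "score": 2.5, "signature": "s1"},  # same signature as "a"
        {"id": "d", "score": 2.0, "signature": "s3"},
    ]
    for doc in merge_and_dedup([shard_1, shard_2]):
        print(doc["id"], doc["score"])
```

Note one limitation of deduplicating only at merge time: a shard may return fewer unique documents than requested once duplicates from other shards are collapsed, so the per-shard row count usually has to be larger than the final `rows`.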