We are running a large sharded Lucene-based application. Our configuration supports near-real-time updates by incrementally updating documents (a delete followed by an add) on the shards. Every shard is replicated to several machines to improve performance. We replicate a shard by sending the same deletion and addition commands to all of its replicas, where they may be performed in a different order. (We delete a set of documents, say 1000 at a time, then add them one by one, semi-asynchronously.)

Lately we have noticed subtle differences in query scores across replicas of the same shard. Further investigation showed that the only noticeable difference between the replicas was the index directory structure:

1. Different replicas have different sets of segments: most segment files are the same, but some differ.
2. The number of deleted documents differs between two replicas of the same shard.

Is this known behavior of Java Lucene? How can we change it? We want different replicas to return exactly the same score for each query hit. (We would rather not optimize the index, as we believe this would harm performance.)

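As a rough illustration of how differing deletion counts could end up in the scores, here is a toy calculation. It assumes the classic TF-IDF similarity, where idf = 1 + ln(maxDoc / (docFreq + 1)), and assumes that both statistics still count deleted documents until the segments holding them are merged. The class name `IdfSketch` and all the numbers are made up for the example; this is not our actual code or data.

```java
public class IdfSketch {

    // Classic Lucene-style idf: 1 + ln(maxDoc / (docFreq + 1)).
    // Assumption: maxDoc and docFreq include deleted-but-unmerged documents.
    public static double idf(long docFreq, long maxDoc) {
        return 1.0 + Math.log((double) maxDoc / (double) (docFreq + 1));
    }

    public static void main(String[] args) {
        // Hypothetical shard: both replicas hold the same live documents.
        long liveDocs = 100_000;
        long liveMatches = 50; // live documents containing the query term

        // Hypothetical replica A: most deletions already merged away.
        long deletedA = 200;
        long deletedMatchesA = 1;

        // Hypothetical replica B: many deletions still sitting in segments.
        long deletedB = 5_000;
        long deletedMatchesB = 20;

        double idfA = idf(liveMatches + deletedMatchesA, liveDocs + deletedA);
        double idfB = idf(liveMatches + deletedMatchesB, liveDocs + deletedB);

        // Same live content, different index statistics, different idf.
        System.out.printf("replica A idf = %.4f%n", idfA);
        System.out.printf("replica B idf = %.4f%n", idfB);
    }
}
```

Under these assumptions, two replicas with identical live documents produce different idf values for the same term, which would be consistent with the small score differences described above.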
TIA, Yuval and Ophir