Re: Replication and Score Issue

2021-03-23 Thread Alessandro Benedetti
When calculating the DF (document frequency) component of a BM25 score, Apache Lucene BM25 similarity uses: org.apache.lucene.search.similarities.BM25Similarity#idfExplain(org.apache.lucene.search.CollectionStatistics, org.apache.lucene.search.TermStatistics) *Note that CollectionStatistics.docCou
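As a rough illustration of why the statistics passed to idfExplain matter, here is a minimal sketch of the IDF formula that BM25Similarity applies, log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5)). The function name bm25_idf and the replica numbers are hypothetical, chosen only for illustration. The sketch assumes that Lucene's index statistics still count deleted documents until segments are merged, so two replicas holding the same live documents can report different values.

```python
import math


def bm25_idf(doc_freq: int, doc_count: int) -> float:
    """IDF as computed by Lucene's BM25Similarity.

    doc_freq:  number of documents containing the term
    doc_count: number of documents that have a value for the field
    """
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


# Hypothetical statistics for two replicas of the same shard.
# Assumption: replica B has not yet merged away its deleted documents,
# so both of its counts are inflated relative to replica A.
replica_a = bm25_idf(doc_freq=100, doc_count=10_000)
replica_b = bm25_idf(doc_freq=120, doc_count=10_500)

print(f"replica A idf: {replica_a:.4f}")
print(f"replica B idf: {replica_b:.4f}")
```

Under these assumed numbers the two replicas produce different IDF values for the same term, which would be enough to change the score of an otherwise identical query.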

Re: Replication and Score Issue

2021-03-22 Thread Dominique Bejean
Hi, If your replicas are all NRT, they both index documents. Their commit and segment merge cycles are independent, so yes, seeing a different MaxDoc and DeletedDoc for each replica is normal. One would expect BM25 to ignore deleted docs, but I can't answer with certainty. Regards. Dominiq

Replication and Score Issue

2021-03-21 Thread Jae Joo
Solr 8.6.2. I have a collection with 48 shards, a 30-second soft commit, and a 2-minute hard commit (openSearcher=false). I found that two replicas have exactly the same Num Docs, but different Max Doc and Deleted Docs. When I run the same query many times, I see that the max score differs. Th