Hi Boris,
Query parsing and scoring/ranking are completely separate processes,
so I'd debug those problems separately.
To debug the fuzzy query, the Query.rewrite() method would be a good
first step: it lets you see all the expanded terms generated by the
fuzzy query.
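
As a rough illustration of what such an expansion contains, here is a
minimal sketch that filters a small vocabulary by edit distance. It does
not use Lucene: the class name, the helper methods, and the vocabulary are
all hypothetical, and it uses plain Levenshtein distance over a list,
whereas Lucene's fuzzy matching works against the index terms and by
default also counts a transposition as a single edit.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for inspecting a fuzzy expansion; not Lucene code.
class FuzzyExpansionSketch {

    // Plain Levenshtein distance (insert, delete, substitute) using two rows.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Keep every vocabulary term within maxEdits of the query term.
    static List<String> expand(String term, int maxEdits, List<String> vocabulary) {
        List<String> matches = new ArrayList<>();
        for (String candidate : vocabulary) {
            if (editDistance(term, candidate) <= maxEdits) {
                matches.add(candidate);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Assumed toy vocabulary; a real index would supply these terms.
        List<String> vocabulary =
                Arrays.asList("main", "mains", "maine", "mountain", "acer");
        System.out.println(expand("maink", 2, vocabulary));
    }
}
```

Comparing a list like this with the terms you expected to match shows
whether the problem is in the expansion or in the scoring.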
I'm not sure about what is your pr
Hi Boris,
"Acer campestre 'Rozi'" now receives a higher score with DFISimilarity
and BM25Similarity (with tuned 'b') instead of the standard BM25.
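
To show why tuning 'b' can change the ranking, here is a minimal sketch of
the textbook BM25 term-frequency component. It is not Lucene code: the
class and method names are hypothetical, the document lengths are made up,
and Lucene's own BM25Similarity differs in details (for example in how it
encodes document length), so the numbers only show the direction of the
effect. k1 = 1.2 and b = 0.75 are the usual defaults.

```java
// Hypothetical illustration of BM25 length normalization; not Lucene code.
class Bm25LengthNormSketch {

    // Textbook BM25 term-frequency component:
    //   tf * (k1 + 1) / (tf + k1 * (1 - b + b * docLen / avgDocLen))
    static double tfComponent(double tf, double docLen, double avgDocLen,
                              double k1, double b) {
        double norm = k1 * (1.0 - b + b * docLen / avgDocLen);
        return tf * (k1 + 1.0) / (tf + norm);
    }

    public static void main(String[] args) {
        double avgDocLen = 10.0;   // assumed average field length
        double shortLen = 3.0;     // assumed short field
        double longLen = 30.0;     // assumed long field
        double k1 = 1.2;

        // The same single term occurrence, scored in a short and a long field.
        for (double b : new double[] {0.75, 0.3, 0.0}) {
            double s = tfComponent(1.0, shortLen, avgDocLen, k1, b);
            double l = tfComponent(1.0, longLen, avgDocLen, k1, b);
            System.out.printf("b=%.2f short=%.3f long=%.3f%n", b, s, l);
        }
    }
}
```

With a lower 'b' the gap between the short and the long field shrinks, and
with b = 0 the field length no longer matters at all.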
It really was a scoring/normalization issue: while "Rozi" got a
higher score, "Acer" and "campestre" received lower values and the
combined result
Yes, you can use DFISimilarity with an index constructed with
BM25Similarity. No need to reindex.
On Fri, Jun 14, 2019 at 1:05 PM Frédéric Glorieux wrote:
>
> Hi,
>
> I'm working on literature texts (French).
>
> My users are interested in relevance tweaking to have the most suggested
> texts (fo
These are great suggestions; I was going to suggest the explain plan of
the query, too.
I really wonder why, in your case, the 'Rozi' entry does not get a higher score.
Is there any effect from the " ' " chars?
In my case I have sort of the reverse situation:
my query is maink~2 (mains was a special case where I st
Hi,
I'm working on literature texts (French).
My users are interested in relevance tweaking to have the most suggested
texts (for their taste) in top results.
Changing the similarity at query time is less expensive than reindexing everything.
I checked that BM25 needs to write “norms” to keep document lengt
Hi Namgyu and Tomoko,
your hint towards Explanation was very helpful and I was not aware of
this feature.
I have now experimented with different scoring functions and it seems
that DFISimilarity and BM25Similarity (with lower 'b') produce results
in the direction I prefer, though not perfect for
Hi Matthias,
What similarity class are you using?
Just a guess... but one possible reason is document (field) length
normalization. Generally speaking, shorter documents get higher
scores than longer documents. (I saw that classic TFIDF similarity
tends to give much higher scores to shorter