Sorry, I thought that you wanted to maintain the true value rather
than the approximated value. I am not entirely sure, but I think the
approximation arises from rounding and low-precision storage of
these values in the index. You might be able to reverse engineer it by
looking at "Norms," which
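(To illustrate the idea: Lucene stores per-document length normalization in a single byte in the norms, so precision is lost for longer documents. The sketch below is a simplified stand-in for that kind of lossy encoding, not Lucene's exact SmallFloat scheme; see org.apache.lucene.util.SmallFloat for the real one.)

```python
def lossy_encode(length: int, mantissa_bits: int = 4) -> int:
    """Keep only the top `mantissa_bits` bits of `length`.

    This mimics, in simplified form, how a one-byte norm loses
    precision: small lengths survive exactly, larger ones are
    rounded down to a coarser grid.  NOT Lucene's exact encoding.
    """
    shift = max(length.bit_length() - mantissa_bits, 0)
    return (length >> shift) << shift

print(lossy_encode(13))   # -> 13   (small lengths are exact)
print(lossy_encode(756))  # -> 704  (the scorer sees an approximation)
```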
Thank you for your answer, but could you please explain this idea in
more detail, as I cannot see how it would help solve my problem?
For example, I have the indexed Wikipedia article "Alan Smithee" with a
document length of 756, which is also used when calculating the average
document length. Bu
You could append an EOF token to every indexed text, and then iterate
over Terms to get the positions of those tokens?
On Tue, Jun 2, 2020 at 11:50 AM Moritz Staudinger wrote:
>
> Hello,
>
> I am not sure if I am at the right place here, but I got a question about
> the approximation my Lucene implementation does.
Hello,
I am not sure if I am at the right place here, but I got a question about
the approximation my Lucene implementation does.
I am trying to calculate the same scores that Lucene's BM25Similarity calculates,
but I found out that Lucene only approximates the length of documents for
scoring but us
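(For reference on where document length enters the score: the sketch below follows the standard BM25 term weight, which Lucene's BM25Similarity is based on. All numbers are hypothetical; it only shows how an approximated length shifts the result.)

```python
import math

def bm25_term_score(tf, df, doc_len, avg_len, num_docs, k1=1.2, b=0.75):
    # idf in the shape used by modern Lucene: log(1 + (N - df + 0.5) / (df + 0.5))
    idf = math.log(1 + (num_docs - df + 0.5) / (df + 0.5))
    # length normalization: doc_len is where the (possibly approximated)
    # document length enters the score
    norm = k1 * (1 - b + b * doc_len / avg_len)
    return idf * tf / (tf + norm)

# Hypothetical numbers: the true length (756) vs. an approximated one
# give slightly different scores for the same term statistics.
exact  = bm25_term_score(tf=3, df=100, doc_len=756, avg_len=500, num_docs=100_000)
approx = bm25_term_score(tf=3, df=100, doc_len=704, avg_len=500, num_docs=100_000)
```

A shorter stored length yields a smaller normalization term and therefore a slightly higher score, which is the kind of discrepancy described above.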