I am trying to fetch results similar to a Document in the index. The problem
is the myriad of irrelevant hits whose score is less than 1 percent. I
was thinking of writing this class in order to omit these results. I can't use
TopDoc because the number of *really* similar results can be known a
As to point <2>, the only way I was able to deal with this was by
using a TopDocs, which does have a max score. But in that case,
I don't believe you can limit the number of hits examined.
I've just got to ask... Why do you (jafarim) want to fiddle with the
threshold? How is this going to benefi
Be aware that
score thresholds don't work well in general since scores aren't really
comparable from one query to another.
What if I normalize the scores so that they fall between 0
and 1?
--jaf
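
For illustration, one simple way to map scores into the 0 to 1 range is to divide every score by the top score of the same query. The sketch below is plain Java and does not use Lucene's API; the class and method names (ScoreNormalizer, normalizeByMax) are hypothetical, and it assumes non-negative raw scores.

```java
public class ScoreNormalizer {
    // Hypothetical helper, not part of Lucene.
    // Divides each raw score by the largest score in the same result set,
    // so the best hit maps to 1.0 and every other hit to a value in [0, 1].
    public static float[] normalizeByMax(float[] rawScores) {
        float max = 0f;
        for (float s : rawScores) {
            max = Math.max(max, s);
        }
        float[] normalized = new float[rawScores.length];
        if (max <= 0f) {
            return normalized; // no positive score to scale by; leave zeros
        }
        for (int i = 0; i < rawScores.length; i++) {
            normalized[i] = rawScores[i] / max;
        }
        return normalized;
    }
}
```

Note that with this scheme the top hit of every query becomes 1.0 no matter how good a match it really is, so the normalized values remain relative to each query rather than comparable across queries.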
On 4/22/07, jafarim <[EMAIL PROTECTED]> wrote:
I am trying to implement a TopScoreHitCollector class: a kind of
TopDocCollector that collects only the documents whose score is higher
than a threshold. The threshold will be configurable in the constructor of
the class. There is seemingly a d
On 4/22/07, jafarim <[EMAIL PROTECTED]> wrote:
> Be aware that
> score thresholds don't work well in general since scores aren't really
> comparable from one query to another.
What if I normalize the scores so that they fall between 0
and 1?
Two issues with that:
1) You never
Hi list.
I am trying to implement a TopScoreHitCollector class: a kind of
TopDocCollector that collects only the documents whose score is higher
than a threshold. The threshold will be configurable in the constructor of
the class. There is seemingly a document starvation about TopDocCollecto
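
The idea of a collector that keeps only hits above a constructor-supplied threshold can be sketched roughly as follows. This is a minimal standalone Java sketch, not Lucene code: ThresholdCollector and Hit are hypothetical names, and the collect(doc, score) callback is only assumed to resemble the shape of a search collector callback. Wiring it into an actual search is left out.

```java
import java.util.ArrayList;
import java.util.List;

public class ThresholdCollector {
    // Hypothetical value type holding one accepted hit.
    public static final class Hit {
        public final int doc;
        public final float score;

        Hit(int doc, float score) {
            this.doc = doc;
            this.score = score;
        }
    }

    private final float threshold;
    private final List<Hit> hits = new ArrayList<>();

    // The threshold is fixed at construction time, as described in the post.
    public ThresholdCollector(float threshold) {
        this.threshold = threshold;
    }

    // Called once per matching document; keeps it only if its score is
    // strictly higher than the threshold.
    public void collect(int doc, float score) {
        if (score > threshold) {
            hits.add(new Hit(doc, score));
        }
    }

    // Accepted hits in the order they were collected (not sorted by score).
    public List<Hit> hits() {
        return hits;
    }
}
```

A sketch like this compares the threshold against whatever scores it is given. If those are raw, per-query scores, the caveat raised in the thread still applies: the same threshold value may behave very differently from one query to another.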