Thanks a lot for your quick answer.
I modified the idf formula as you suggested, and got a small improvement.
Then I added coord() to scoring and the boost was pretty big :)
http://lucene.472066.n3.nabble.com/file/n873240/prcurve_bm25.png
These tests were done on the OHSUMED train corpus (I used
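For anyone following along, this is roughly the kind of change being discussed,
sketched against the pre-4.0 Similarity API (DefaultSimilarity with overridable
idf() and coord()). The class name and the exact idf variant here are only
illustrative; the patch/prototype may use a different formula.

import org.apache.lucene.search.DefaultSimilarity;

// Illustrative only: a BM25-style idf (the +1 keeps it non-negative for very
// common terms) and the usual coord() that rewards documents matching more of
// the query terms.
public class BM25LikeSimilarity extends DefaultSimilarity {

  @Override
  public float idf(int docFreq, int numDocs) {
    return (float) Math.log(1 + (numDocs - docFreq + 0.5) / (docFreq + 0.5));
  }

  @Override
  public float coord(int overlap, int maxOverlap) {
    // fraction of query terms matched by the document
    return overlap / (float) maxOverlap;
  }
}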
I experienced this problem too. When implementing a prototype of this
similarity for LUCENE-2392, I noticed it consistently provided better
relevance than this patch. (You can see the prototype; I uploaded it to
LUCENE-2091.)
After lots of discussion and debugging with José Ramón Pérez Agüera, I
Hi,
I did some tests and I keep getting low scores for BM25 on ORP collections.
The implementation is from this patch:
https://issues.apache.org/jira/browse/LUCENE-2091
Is that normal?
I got the following results using StandardAnalyzer:
http:
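Side note, purely as a sanity check and not something taken from the patch
itself: with stock Lucene a custom similarity only affects scoring if it is
actually set on the searcher, otherwise DefaultSimilarity is used. A minimal
sketch against the 3.x API, with made-up index and field names:

import java.io.File;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

public class SearchWithCustomSimilarity {
  public static void main(String[] args) throws Exception {
    // "ohsumed-index" and the "body" field are placeholders
    IndexSearcher searcher =
        new IndexSearcher(FSDirectory.open(new File("ohsumed-index")), true);
    searcher.setSimilarity(new BM25LikeSimilarity()); // e.g. the sketch above

    QueryParser parser = new QueryParser(Version.LUCENE_30, "body",
        new StandardAnalyzer(Version.LUCENE_30));
    Query q = parser.parse("renal failure treatment");
    TopDocs hits = searcher.search(q, 1000);
    System.out.println("total hits: " + hits.totalHits);
    searcher.close();
  }
}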
Thanks, that worked :)
Personally I use the generated submission.txt and run it through
trec_eval to get all the numbers.
By default, trec_eval will dump the ircl_prn values, and you can
plug them into OpenOffice.
I prefer to use trec_eval, as the results from the benchmark summary
often differ from what trec_eval reports.
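For context, submission.txt is the run file written by the contrib/benchmark
quality package. A rough sketch of producing it, going from memory of the 3.x
org.apache.lucene.benchmark.quality classes (class names and signatures may
differ in your version; the topics/qrels/index paths and field names are
placeholders):

import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.PrintWriter;
import org.apache.lucene.benchmark.quality.Judge;
import org.apache.lucene.benchmark.quality.QualityBenchmark;
import org.apache.lucene.benchmark.quality.QualityQuery;
import org.apache.lucene.benchmark.quality.QualityQueryParser;
import org.apache.lucene.benchmark.quality.trec.TrecJudge;
import org.apache.lucene.benchmark.quality.trec.TrecTopicsReader;
import org.apache.lucene.benchmark.quality.utils.SimpleQQParser;
import org.apache.lucene.benchmark.quality.utils.SubmissionReport;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.store.FSDirectory;

public class MakeSubmission {
  public static void main(String[] args) throws Exception {
    QualityQuery[] topics = new TrecTopicsReader()
        .readQueries(new BufferedReader(new FileReader("topics.txt")));
    Judge judge = new TrecJudge(new BufferedReader(new FileReader("qrels.txt")));

    IndexSearcher searcher =
        new IndexSearcher(FSDirectory.open(new File("ohsumed-index")), true);

    QualityQueryParser qqParser = new SimpleQQParser("title", "body");
    QualityBenchmark benchmark =
        new QualityBenchmark(topics, qqParser, searcher, "docname");

    PrintWriter log = new PrintWriter(System.out, true);
    PrintWriter runFile = new PrintWriter(new File("submission.txt"));
    SubmissionReport submission = new SubmissionReport(runFile, "bm25run");
    benchmark.execute(judge, submission, log); // writes the run file
    runFile.close();
    searcher.close();
  }
}

trec_eval then takes the qrels and this run file (trec_eval qrels.txt
submission.txt); the ircl_prn.* rows in its default output are the
interpolated precision values at the 11 standard recall levels, which is
what goes into the curve.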
it would be gr