Re: Precision-recall curve with /contrib/benchmark/quality

2010-06-05 Thread calin014
Thanks a lot for your quick answer. I modified the idf formula as you suggested and got a small improvement. Then I added coord() to the scoring, and the boost was pretty big :) http://lucene.472066.n3.nabble.com/file/n873240/prcurve_bm25.png These tests were done on the OHSUMED training corpus (I used
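The two tweaks mentioned above can be sketched numerically. This is a minimal illustration, not the patch's actual code: `bm25_idf` is the textbook BM25 idf (which differs from Lucene's default `log(N/(df+1))+1`), and `coord` mirrors Lucene's coordination factor rewarding documents that match more query terms. All numbers below are hypothetical.

```python
import math

def bm25_idf(num_docs, doc_freq):
    # Textbook BM25 idf; can go negative for very common terms,
    # which is one reason people experiment with the formula.
    return math.log((num_docs - doc_freq + 0.5) / (doc_freq + 0.5))

def coord(overlap, max_overlap):
    # Lucene-style coordination factor: fraction of query terms matched.
    return overlap / max_overlap

# Hypothetical figures: 10,000 docs, a term occurring in 50 of them,
# a document matching 2 terms of a 3-term query.
term_weight = bm25_idf(10_000, 50)
score = term_weight * coord(2, 3)
```

Multiplying the per-document score by `coord` is what gives multi-term matches the large boost described in the message.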

Re: Precision-recall curve with /contrib/benchmark/quality

2010-06-05 Thread Robert Muir
I experienced this problem too. When implementing a prototype of this similarity for LUCENE-2392, I noticed it consistently provided better relevance than this patch (you can see the prototype; I uploaded it to LUCENE-2091). After lots of discussion and debugging with José Ramón Pérez Agüera, I

Re: Precision-recall curve with /contrib/benchmark/quality

2010-06-05 Thread calin014
Hi, I did some tests and I keep getting low scores for BM25 on ORP collections. The implementation is from this patch: https://issues.apache.org/jira/browse/LUCENE-2091 Is that normal? I got the following results using StandardAnalyzer: http:
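For readers unfamiliar with what the /contrib/benchmark/quality package measures: the curve in question is built from precision and recall computed at each rank of a result list against a set of relevance judgments. A minimal sketch (illustrative only, not the benchmark package's code):

```python
def precision_recall_points(ranked_ids, relevant_ids):
    # Precision and recall after each retrieved document:
    # the raw points behind a precision-recall curve.
    relevant_ids = set(relevant_ids)
    points, hits = [], 0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            hits += 1
        # (recall, precision) at this cutoff
        points.append((hits / len(relevant_ids), hits / rank))
    return points

# Hypothetical ranked list and judgments.
pts = precision_recall_points(["d3", "d7", "d1", "d9"], {"d3", "d1"})
```

A "low score" for a similarity usually means these precision values sit below the baseline's across most recall levels.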

Re: Precision-recall curve with /contrib/benchmark/quality

2010-05-26 Thread calin014
Thanks, that worked :) -- View this message in context: http://lucene.472066.n3.nabble.com/Precision-recall-curve-with-contrib-benchmark-quality-tp82p845518.html Sent from the Lucene - Java Users mailing list archive at Nabble.com.

Re: Precision-recall curve with /contrib/benchmark/quality

2010-05-26 Thread Robert Muir
Personally, I use the generated submission.txt and run it through trec_eval to get all the numbers. By default, trec_eval will dump the ircl_prn. values, which you can plug into OpenOffice. I prefer to use trec_eval because the results from the benchmark summary often differ from trec_eval's. It would be gr
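The workflow above (run trec_eval on submission.txt, then chart the interpolated recall-precision averages) can be automated with a small parser. This is a hedged sketch assuming trec_eval's default three-column summary format (`measure  query-id  value`), where the curve points appear as `ircl_prn.0.00` through `ircl_prn.1.00`:

```python
def parse_ircl_prn(trec_eval_output):
    # Extract (recall_level, precision) pairs from trec_eval's default
    # summary, ready to paste into a spreadsheet or plotting tool.
    points = []
    for line in trec_eval_output.splitlines():
        fields = line.split()
        if len(fields) == 3 and fields[0].startswith("ircl_prn."):
            recall_level = float(fields[0][len("ircl_prn."):])
            points.append((recall_level, float(fields[2])))
    return points

# Hypothetical trec_eval output fragment.
sample = """\
map            all  0.3456
ircl_prn.0.00  all  0.6793
ircl_prn.0.10  all  0.5521
"""
curve = parse_ircl_prn(sample)
```

Sorting the resulting pairs by recall level gives the x/y columns for the precision-recall chart directly.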