Hi Grant and Jose,
Just to give some more details: as Jose said, avg_length is precalculated
at indexing time using a specific Similarity class. Basically, this can
be done through the lengthNorm method: for each document and field, the
total length is stored. When the indexing process is finish
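
A minimal sketch of the bookkeeping described above, assuming a token count is available for each document and field at indexing time. The class AvgLengthTracker and its methods are hypothetical names used for illustration only; they are not part of Lucene or of the BM25 patch, and the patch may store these totals differently.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration; not a Lucene or BM25-patch class.
final class AvgLengthTracker {
    // Per-field running totals gathered while documents are indexed.
    private final Map<String, Long> totalTokens = new HashMap<>();
    private final Map<String, Integer> docCounts = new HashMap<>();

    // Call once per document and field (for example, from wherever the
    // length norm is computed), passing that field's token count.
    void addDocument(String field, int numTokens) {
        totalTokens.merge(field, (long) numTokens, Long::sum);
        docCounts.merge(field, 1, Integer::sum);
    }

    // Average field length; meaningful once indexing has finished.
    // Returns 0.0 for a field that was never seen.
    double averageLength(String field) {
        int docs = docCounts.getOrDefault(field, 0);
        if (docs == 0) {
            return 0.0;
        }
        return (double) totalTokens.get(field) / docs;
    }

    public static void main(String[] args) {
        AvgLengthTracker tracker = new AvgLengthTracker();
        tracker.addDocument("body", 3);
        tracker.addDocument("body", 5);
        tracker.addDocument("body", 10);
        // 18 tokens over 3 documents
        System.out.println(tracker.averageLength("body"));
    }
}
```

The point of accumulating totals during indexing is that the average is then a single division at search time, rather than a pass over every document.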
Hi Ivan,
You shouldn't set the BM25Similarity for indexing or searching.
Please try removing the lines:
writer.setSimilarity(new BM25Similarity());
searcher.setSimilarity(sim);
Please let us/me know if you improve your results with these changes.
Robert Muir wrote:
Hi Ivan, I've seen
>> Date: Tuesday, February 16, 2010, 11:36 AM
>> yes Ivan, if possible please report
>> back any findings you can on the
>> experiments you are doing!
>>
>> On Tue, Feb 16, 2010 at 11:22 AM, Joaquin Perez Iglesias
>> <joaquin.pe...@lsi.uned.es>
>>
>>
>> --- On Tue, 2/16/10, Robert Muir wrote:
>>
>>> From: Robert Muir
>>> Subject: Re: BM25 Scoring Patch
>>> To: java-user@lucene.apache.org
>>> Date: Tuesday, February 16, 2010, 11:36 AM
>>> yes Ivan, if possible p
> Note: I have no bias against BM-25, but it's definitely a myth to say there
> is a single retrieval formula that is the 'best' across the board.
>
>
> On Tue, Feb 16, 2010 at 1:53 PM, JOAQUIN PEREZ IGLESIAS <
> joaquin.pe...@lsi.uned.es> wrote:
>
>> By the w
best if we can
> support other models also!
>
> Finally I think there is something to be said for Lucene's default retrieval
> model, which in my (non-English) findings across the board isn't terrible at
> all... then again I am working with languages