Re: Lucene computes an automatic boost based on the number of tokens in the field (shorter fields have a higher boost) ?

Paul Taylor Tue, 12 Jan 2010 05:17:35 -0800

Benjamin Heilbrunn wrote:

This is because matches in short fields (few terms) are typically more
significant than matches in long fields (many terms).


Imagine the case with two fields named "title" and "content"
representing the title and the content of books.
If you match three search terms in a five-term title, this is a better
hit than if you match those three search terms in the content of the
book.

The length normalization factor is calculated by your Similarity
implementation in the method
public float lengthNorm(String fieldName, int numTokens)
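
As a rough illustration of what that method does, here is a minimal standalone sketch. It assumes the default implementation (DefaultSimilarity in the Lucene releases current at the time of this thread) returns 1/sqrt(numTokens). The class name LengthNormSketch is hypothetical and the code does not depend on Lucene. Treat the formula as an assumption to check against the Similarity class you actually use.

```java
// Hypothetical standalone sketch; not part of Lucene.
final class LengthNormSketch {

    // Assumed default formula: 1 / sqrt(numTokens).
    // fieldName is unused here, mirroring a default that treats all fields alike.
    static float lengthNorm(String fieldName, int numTokens) {
        return (float) (1.0 / Math.sqrt(numTokens));
    }

    public static void main(String[] args) {
        // The norm depends only on how many tokens the field holds,
        // not on how many of them a query happens to match.
        System.out.println(lengthNorm("title", 1)); // 1.0 for a single-term field
        System.out.println(lengthNorm("title", 5)); // about 0.447
        System.out.println(lengthNorm("title", 6)); // about 0.408
    }
}
```

Under this assumption a single-term field gets a length norm of 1.0, and the value shrinks as the field grows. Lucene also compresses the stored norm into a single byte at index time, so the value used during scoring is coarser than the float computed here.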

Does that help you?

Yes, thanks, it does; I was just getting it. Is it based purely on matching a field with fewer terms, rather than on the percentage of terms in the field that matched? I.e., if you match three search terms in a five-term field, would this be better than if you match four search terms in a six-term field?



Do you know the answer to my second post?

I.e., what does the default lengthNorm return for a single-term field (compared to a field indexed with NO_NORMS, where I assume the value is 1.0)?


Paul

