Re: max_score(multi_valued_field) function?

Chris Hostetter Tue, 02 May 2006 15:53:15 -0700

: yes - i guess this is more or less what i mean. an example are the two
: documents:
:
: 1 - with the titles:
: "http"
: "hypertext transfer protocol"
:
: 2 - with the title:
: "http tunnel"
:
: when i use multi-valued fields and do a search on "http" the title
: score on the second document is higher as there is a match and the
: length is shorter. as the first title of the first document would be a
: perfect match this one should get the higher score instead.
:
: disabling the length normalization sounds good - while it may not help
: to find the more relevant title at least it won't give a bad score to
: good titles.


something you could do to gain back the basic idea of length normalization
is to 1) put artificial tokens at the beginning and end of each title; 2)
use a high positionIncrementGap; 3) at query time make all of your queries
Phrase/Span queries that include the artificial begin/end tokens, with slop
values 1 less than your positionIncrementGap.

i.e., if you want to search for "http", search for "BEGIN http END"~100
(a slop of 100 corresponds to a positionIncrementGap of 101)

Short titles will get better scores because the begin/end tokens will be
closer together.
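
To make the begin/end token idea concrete, here is a rough sketch in plain
Python (no Lucene involved) that simulates token positions for a
multi-valued title field and measures how far a "BEGIN <query> END" phrase
has to stretch in each title. The function names, the gap of 101, and the
1 / (distance + 1) weighting are assumptions for illustration; the weighting
is modeled on the sloppy-frequency formula of classic Lucene Similarity, and
the matcher is a brute-force simplification, not Lucene's actual phrase
scorer.

```python
from itertools import product

BEGIN, END = "BEGIN", "END"  # hypothetical marker tokens, as in the example query


def index_titles(titles, gap):
    """Map each token to its positions across all titles of one document."""
    postings = {}
    pos = 0
    for i, title in enumerate(titles):
        if i > 0:
            pos += gap  # simulate the position increment gap between values
        for token in [BEGIN] + title.lower().split() + [END]:
            postings.setdefault(token, []).append(pos)
            pos += 1
    return postings


def match_distances(titles, query, gap=101):
    """Return the sorted distances of all sloppy matches with slop = gap - 1."""
    slop = gap - 1
    postings = index_titles(titles, gap)
    phrase = [BEGIN] + query.lower().split() + [END]
    per_term = [postings.get(term, []) for term in phrase]
    distances = []
    # Brute force over every choice of one position per phrase term.
    for combo in product(*per_term):
        # Shift each position by the term's offset in the phrase; an exact
        # phrase match makes all shifted values equal (distance 0).
        shifted = [p - offset for offset, p in enumerate(combo)]
        distance = max(shifted) - min(shifted)
        if distance <= slop:
            distances.append(distance)
    return sorted(distances)


def sloppy_weight(distance):
    # Assumption: 1 / (distance + 1), modeled on classic Lucene sloppyFreq.
    return 1.0 / (distance + 1)


print(match_distances(["http", "hypertext transfer protocol"], "http"))
print(match_distances(["http tunnel"], "http"))
print(match_distances(["http clients", "http 1.1 clients"], "http"))
```

Under these assumptions the title "http" matches at distance 0 and "http
tunnel" at distance 1, so the exact title gets the larger weight (1.0 vs
0.5). Candidate matches that straddle two titles come out with a distance
above the slop and are dropped. The last document produces two separate
matches (distances 1 and 2), one per title.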

It doesn't take care of your max concern though ... a document with the
titles "http clients" and "http 1.1 clients" will still get a higher score
by default than a document with the single title "http".



-Hoss


