: yes - i guess this is more or less what i mean. an example are the two
: documents:
:
: 1 - with the titles:
: "http"
: "hypertext transfer protocol"
:
: 2 - with the title:
: "http tunnel"
:
: when i use multi-valued fields and do a search on "http" the title
: score on the second document is higher as there is a match and the
: length is shorter. as the first title of the first document would be a
: perfect match this one should get the higher score instead.
:
: disabling the length normalization sounds good - while it may not help
: to find the more relevant title at least it won't give a bad score to
: good titles.

Something you could do to gain back the basic idea of length normalization is to:

1) put artificial tokens at the beginning and end of each title;
2) use a high positionIncrementGap;
3) at query time, make all of your queries Phrase/Span queries that include the artificial begin/end tokens, with slop values 1 less than your positionIncrementGap.

I.e., if you want to search for "http", search for "BEGIN http END"~100

Short titles will get better scores because the begin/end tokens will be closer together.

It doesn't take care of your max concern though ... a document with the titles "http clients" and "http 1.1 clients" will still get a higher score by default than a document with the single title "http".

-Hoss
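
To make the three steps above concrete, here is a rough sketch in plain Python. It is not Lucene code: it only simulates token positions and builds the query string, so you can see why the shorter title wins. The marker tokens, the gap of 101, and the helper names (index_titles, build_query, sloppy_distance) are all assumptions made up for illustration. The distance helper only handles query terms that appear in the title in the same order, and the exact position arithmetic in a real analyzer may differ slightly.

```python
BEGIN, END = "BEGIN", "END"    # hypothetical markers; pick tokens that never occur in real titles
POSITION_INCREMENT_GAP = 101   # assumed gap between the values of the multi-valued field


def index_titles(titles, gap=POSITION_INCREMENT_GAP):
    """Simulate indexing: return (token, position) pairs for all titles of one document."""
    postings = []
    position = 0
    for i, title in enumerate(titles):
        if i > 0:
            position += gap  # large jump so a sloppy phrase cannot span two titles
        for token in [BEGIN, *title.lower().split(), END]:
            postings.append((token, position))
            position += 1
    return postings


def build_query(user_query, gap=POSITION_INCREMENT_GAP):
    """Wrap the user's terms in the markers and use a slop of gap - 1."""
    return f'"{BEGIN} {user_query} {END}"~{gap - 1}'


def sloppy_distance(title, user_query):
    """Extra positions the phrase must stretch to match one title, or None if no match.

    Simplification: only in-order matches are considered.
    """
    title_tokens = title.lower().split()
    query_tokens = user_query.lower().split()
    remaining = iter(title_tokens)
    # "term in remaining" consumes the iterator, so this checks for an in-order subsequence
    if not all(term in remaining for term in query_tokens):
        return None
    # BEGIN stays in place, END ends up this many positions later than the phrase expects
    return len(title_tokens) - len(query_tokens)


if __name__ == "__main__":
    print(build_query("http"))
    for title in ["http", "hypertext transfer protocol", "http tunnel"]:
        print(repr(title), sloppy_distance(title, "http"))
```

An exact title gets distance 0 and "http tunnel" gets distance 1. As far as I recall, the classic default Similarity weights a sloppy match by about 1/(distance + 1), so the exact title would come out ahead, but check that against the Similarity you actually use.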