Alex,
if you have length normalization turned on, then the length (the number
of tokens, and perhaps even the distance between the tokens) of the
tagKey field in the second document is much greater than in the first
document. The length is the total number of tokens in the field, i.e. if
you add more than one field with the same name to a document, their
values are concatenated. A match in a longer field gets a smaller norm,
which is why the first hit is scored as the better match.
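
To make the difference concrete, here is a rough sketch in plain Java (it does not use Lucene). It assumes simple whitespace tokenization and the 1.0 / tokenCount formula from your message; the class and method names are made up for illustration, and your analyzer may well count tokens differently.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not part of Lucene.
class LengthNormSketch {

    // tagKey values of document 5117 (a single field instance).
    static final List<String> DOC_5117 =
            Arrays.asList("wholesale hot dog stand equipment");

    // tagKey values of document 11274 (sixteen field instances, same name).
    static final List<String> DOC_11274 = Arrays.asList(
            "hot dog meal", "hot dog restaurant", "hotdog", "hot dog",
            "hot dog dining", "best hotdog", "cuisine hot dog", "hotdog stand",
            "hotdog restaurant", "hot dog grill", "hot dog cuisine",
            "hot dog stand", "hot dog menu", "hot dog shop", "hotdog vendor",
            "hotdog grill");

    // Sums whitespace-separated tokens over all values of one field name,
    // imitating how same-named fields count as one concatenated field.
    static int fieldLength(List<String> values) {
        int tokens = 0;
        for (String value : values) {
            String trimmed = value.trim();
            if (!trimmed.isEmpty()) {
                tokens += trimmed.split("\\s+").length;
            }
        }
        return tokens;
    }

    // The formula quoted in the question: 1.0 / tokenCount.
    static float lengthNorm(int numTokens) {
        return 1.0f / numTokens;
    }

    public static void main(String[] args) {
        int first = fieldLength(DOC_5117);
        int second = fieldLength(DOC_11274);
        System.out.println("doc 5117:  " + first + " tokens, norm " + lengthNorm(first));
        System.out.println("doc 11274: " + second + " tokens, norm " + lengthNorm(second));
    }
}
```

Under these assumptions the first document has 5 tokens in tagKey and the second has 40, so the norm is 1/5 = 0.2 versus 1/40 = 0.025. The final scores also depend on the other factors in your similarity, so these numbers will not match 1.0 and 0.7 directly.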
Try the Searcher#explain method for more details:
http://lucene.apache.org/java/2_4_0/api/org/apache/lucene/search/Searcher.html#explain(org.apache.lucene.search.Query,%20int)
karl
On 26 Nov 2008, at 20:22, AlexElba wrote:
Hello,
I have two documents in my Lucene index:
Document<stored/uncompressed,indexed<tagId:5117>
stored/uncompressed<tagName:Wholesale Hot Dog Stand Equipment>
stored/uncompressed,indexed,tokenized<tagKey:wholesale hot dog stand
equipment> stored/uncompressed>
Document<stored/uncompressed,indexed<tagId:11274>
stored/uncompressed<tagName:Hot Dogs>
stored/uncompressed,indexed,tokenized<tagKey:hot dog meal>
stored/uncompressed,indexed,tokenized<tagKey:hot dog restaurant>
stored/uncompressed,indexed,tokenized<tagKey:hotdog>
stored/uncompressed,indexed,tokenized<tagKey:hot dog>
stored/uncompressed,indexed,tokenized<tagKey:hot dog dining>
stored/uncompressed,indexed,tokenized<tagKey:best hotdog>
stored/uncompressed,indexed,tokenized<tagKey:cuisine hot dog>
stored/uncompressed,indexed,tokenized<tagKey:hotdog stand>
stored/uncompressed,indexed,tokenized<tagKey:hotdog restaurant>
stored/uncompressed,indexed,tokenized<tagKey:hot dog grill>
stored/uncompressed,indexed,tokenized<tagKey:hot dog cuisine>
stored/uncompressed,indexed,tokenized<tagKey:hot dog stand>
stored/uncompressed,indexed,tokenized<tagKey:hot dog menu>
stored/uncompressed,indexed,tokenized<tagKey:hot dog shop>
stored/uncompressed,indexed,tokenized<tagKey:hotdog vendor>
stored/uncompressed,indexed,tokenized<tagKey:hotdog grill>>
and I am searching for +tagKey:hot +tagKey:dog
which is an exact match for the 2nd document, but I am getting a score
of 1.0 for the first document and 0.7 for the second one.
I have a custom similarity where lengthNorm is (1.0 / tokenCount); the
other factors are constants.
Why is my first document getting the higher score?
--
View this message in context:
http://www.nabble.com/Scoring-issue-tp20707410p20707410.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]