The length normalization gets compressed down to a single byte “norm”, stored 
in the “.nrm” files.

See:
norm(t,d)
http://lucene.apache.org/core/4_3_0/core/org/apache/lucene/search/similarities/TFIDFSimilarity.html
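
To illustrate what "compressed down to a single byte" means in practice, here is a minimal Python sketch modeled on Lucene's SmallFloat.floatToByte315 / byte315ToFloat, which the default similarity is assumed to use for norms in these versions. The function names float_to_byte315 and byte315_to_float are illustrative, not Lucene API, and the port is an approximation for experimenting, not the library code itself.

```python
import math
import struct

# Offset between the float exponent range and the one-byte range
# (assumed to mirror the constant used by SmallFloat.floatToByte315).
_FZERO = (63 - 15) << 3


def float_to_byte315(f: float) -> int:
    """Lossily encode a non-negative float into a single byte (0..255)."""
    # Reinterpret the 32-bit float as a signed int, like Float.floatToRawIntBits.
    bits = struct.unpack(">i", struct.pack(">f", f))[0]
    small = bits >> (24 - 3)
    if small <= _FZERO:
        # Zero and negative values map to 0; tiny positive values map to 1.
        return 0 if bits <= 0 else 1
    if small >= _FZERO + 0x100:
        return 255  # overflow: clamp to the largest representable value
    return small - _FZERO


def byte315_to_float(b: int) -> float:
    """Decode a byte produced by float_to_byte315 back into a float."""
    if b == 0:
        return 0.0
    bits = ((b & 0xFF) << (24 - 3)) + ((63 - 15) << 24)
    return struct.unpack(">f", struct.pack(">i", bits))[0]


# A 10-term field: 1/sqrt(10) is about 0.316, but only 0.3125 survives the round trip.
norm = 1.0 / math.sqrt(10)
encoded = float_to_byte315(norm)
print(encoded, byte315_to_float(encoded))
```

The round trip shows why the original term count cannot be recovered exactly from the stored norm: many different field lengths collapse onto the same byte.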

-- Jack Krupansky

From: jiangwen jiang 
Sent: Tuesday, June 18, 2013 12:35 AM
To: [email protected] 
Subject: Question about lengthNorm(numTerms)

Hi, guys: 

Is it suitable to send questions to this mailing list? I have a question about 
numTerms.

This website describes Lucene scoring:
http://www.lucenetutorial.com/advanced-topics/scoring.html

4. lengthNorm
Implementation: 1/sqrt(numTerms)
Implication: a term matched in fields with less terms have a higher score
Rationale: a term in a field with less terms is more important than one with 
more

I think the numTerms mentioned here means the number of terms in a field, per 
document. But the Lucene file format page doesn't mention it:
http://lucene.apache.org/core/3_6_2/fileformats.html

Does numTerms really exist in the Lucene index? If yes, how can I get it?

Regards
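
As a quick illustration of the 1/sqrt(numTerms) formula quoted above, the following Python sketch computes the length norm for a few field sizes. The function name length_norm and the sample sizes are made up for the example; this is just the arithmetic from the tutorial, not Lucene code.

```python
import math


def length_norm(num_terms: int) -> float:
    """Length norm as quoted from the tutorial: 1 / sqrt(numTerms)."""
    if num_terms <= 0:
        raise ValueError("num_terms must be positive")
    return 1.0 / math.sqrt(num_terms)


# Shorter fields get a larger norm, so a match in them scores higher.
for n in (1, 4, 16, 100):
    print(n, length_norm(n))
```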
