The length normalization gets compressed down to a single byte “norm”, stored in the “.nrm” files.
See: norm(t,d)
http://lucene.apache.org/core/4_3_0/core/org/apache/lucene/search/similarities/TFIDFSimilarity.html

-- Jack Krupansky

From: jiangwen jiang
Sent: Tuesday, June 18, 2013 12:35 AM
To: [email protected]
Subject: Question about lengthNorm(numTerms)

Hi, guys:

Is it suitable to send questions to this mailing list? I have a question about numTerms.

http://www.lucenetutorial.com/advanced-topics/scoring.html describes Lucene scoring:

4. lengthNorm
Implementation: 1/sqrt(numTerms)
Implication: a term matched in fields with less terms have a higher score
Rationale: a term in a field with less terms is more important than one with more

I think the numTerms mentioned here means the number of terms in a field, per document. But the Lucene file format page doesn't mention it:
http://lucene.apache.org/core/3_6_2/fileformats.html

Does numTerms really exist in the Lucene index? If yes, how can I get it?

Regards
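
To make the reply at the top of this thread concrete, here is a minimal Python sketch of why numTerms cannot be read back from the index: 1/sqrt(numTerms) is computed at index time and then squeezed into one byte, so several different term counts collapse to the same stored value. The encoding is modeled on Lucene's SmallFloat.floatToByte315 (3 mantissa bits, zero exponent 15); the constants are reproduced from memory and should be treated as an assumption, and the function names (length_norm, encode_norm, decode_norm) are hypothetical. The field boost that the default similarity also multiplies into the norm is ignored here.

```python
import math
import struct

# Assumed parameters of the "315" encoding: 3 mantissa bits, zero exponent 15.
MANTISSA_BITS = 3
ZERO_EXP = 15
FZERO = (63 - ZERO_EXP) << MANTISSA_BITS


def float_to_bits(f):
    """Return the IEEE 754 single-precision bit pattern of f as a signed int."""
    return struct.unpack(">i", struct.pack(">f", f))[0]


def bits_to_float(bits):
    """Inverse of float_to_bits."""
    return struct.unpack(">f", struct.pack(">i", bits))[0]


def length_norm(num_terms):
    """The formula quoted in the question: 1/sqrt(numTerms)."""
    return 1.0 / math.sqrt(num_terms)


def encode_norm(f):
    """Lossy float -> single byte (0..255), truncating low mantissa bits."""
    bits = float_to_bits(f)
    small = bits >> (24 - MANTISSA_BITS)
    if small <= FZERO:
        # Zero and negatives map to 0; underflow maps to the smallest non-zero byte.
        return 0 if bits <= 0 else 1
    if small >= FZERO + 0x100:
        return 255  # overflow maps to the largest byte
    return small - FZERO


def decode_norm(b):
    """Single byte -> float, as a reader of the norm would see it."""
    if b == 0:
        return 0.0
    bits = (b & 0xFF) << (24 - MANTISSA_BITS)
    bits += (63 - ZERO_EXP) << 24
    return bits_to_float(bits)


if __name__ == "__main__":
    for n in (1, 4, 9, 10):
        exact = length_norm(n)
        stored = encode_norm(exact)
        print(n, round(exact, 4), stored, decode_norm(stored))
```

Under this scheme, fields with 9 and 10 terms both decode to 0.3125, so the byte gives only a coarse bucket for the field length, not the original count. If the exact number of terms per field is needed, it has to be recorded separately at index time (for example in a stored field).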
