I got it, thanks, Jack

2013/6/18 Jack Krupansky <[email protected]>
> The length normalization gets compressed down to a single byte “norm”,
> stored in the “.nrm” files.
>
> See:
> norm(t,d)
>
> http://lucene.apache.org/core/4_3_0/core/org/apache/lucene/search/similarities/TFIDFSimilarity.html
>
> -- Jack Krupansky
>
> *From:* jiangwen jiang <[email protected]>
> *Sent:* Tuesday, June 18, 2013 12:35 AM
> *To:* [email protected]
> *Subject:* Question about lengthNorm(numTerms)
>
> Hi, guys:
>
> Is it suitable to send questions to this mailing list? I have a question
> about numTerms.
>
> http://www.lucenetutorial.com/advanced-topics/scoring.html, this website
> describes Lucene scoring.
>
> *4. lengthNorm*
> Implementation: 1/sqrt(numTerms)
> Implication: a term matched in fields with less terms have a higher score
> Rationale: a term in a field with less terms is more important than one with
> more
>
> I think the numTerms mentioned here means the number of terms in a field,
> per document. But the Lucene file format page doesn't mention it.
>
> http://lucene.apache.org/core/3_6_2/fileformats.html
>
> Does numTerms really exist in the Lucene index, and if yes, how can I get it?
>
> Regards
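
For anyone finding this thread later, here is a rough, self-contained sketch of why numTerms itself cannot be read back from the index. It computes the 1/sqrt(numTerms) formula quoted above and then squeezes the result into one byte. The class and method names are made up for illustration and are not Lucene API; the one-byte encoding is my assumption of how the default similarity packs norms (3 mantissa bits, zero exponent 15), so check it against the Lucene version you use.

```java
// Sketch only: NormSketch and its methods are hypothetical names, not Lucene API.
class NormSketch {

    // Formula quoted in the thread: 1/sqrt(numTerms).
    static float lengthNorm(int numTerms) {
        return (float) (1.0 / Math.sqrt(numTerms));
    }

    // Lossy float-to-byte encoding. Assumption: modeled on the scheme
    // Lucene's default similarity uses for the single-byte norm.
    static byte encode(float f) {
        int bits = Float.floatToRawIntBits(f);
        int small = bits >> (24 - 3);      // keep sign, exponent, top mantissa bits
        int zero = (63 - 15) << 3;         // offset of the smallest representable value
        if (small <= zero) {
            return (bits <= 0) ? (byte) 0 : (byte) 1;  // zero/negative, or underflow
        }
        if (small >= zero + 0x100) {
            return (byte) -1;              // overflow: clamp to the largest value
        }
        return (byte) (small - zero);
    }

    // Inverse of encode; returns the rounded-down value, not the original.
    static float decode(byte b) {
        if (b == 0) {
            return 0.0f;
        }
        int bits = (b & 0xff) << (24 - 3);
        bits += (63 - 15) << 24;
        return Float.intBitsToFloat(bits);
    }

    public static void main(String[] args) {
        for (int numTerms = 1; numTerms <= 10; numTerms++) {
            float exact = lengthNorm(numTerms);
            float stored = decode(encode(exact));
            // Best-effort estimate of the field length from the stored norm.
            long estimate = Math.round(1.0 / (stored * stored));
            System.out.println(numTerms + " terms: exact=" + exact
                    + " stored=" + stored + " estimated numTerms=" + estimate);
        }
    }
}
```

In this sketch, fields with 3 and 4 terms end up with the same stored value (0.5), so the byte only gives a coarse estimate of the field length. Any index-time boost folded into the norm would blur it further.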
