Re: Question about lengthNorm(numTerms)

jiangwen jiang Tue, 18 Jun 2013 05:44:47 -0700

I got it, thanks, Jack

2013/6/18 Jack Krupansky <[email protected]>


> The length normalization gets compressed down to a single-byte “norm”,
> stored in the “.nrm” files.
>
> See:
> norm(t,d)
>
> http://lucene.apache.org/core/4_3_0/core/org/apache/lucene/search/similarities/TFIDFSimilarity.html
>
> -- Jack Krupansky
>
>  *From:* jiangwen jiang <[email protected]>
> *Sent:* Tuesday, June 18, 2013 12:35 AM
> *To:* [email protected]
> *Subject:* Question about lengthNorm(numTerms)
>
> Hi, guys:
>
> Is it appropriate to send questions to this mailing list? I have a
> question about numTerms.
>
> http://www.lucenetutorial.com/advanced-topics/scoring.html, this website
> describes Lucene scoring.
>
> *4. lengthNorm*
> Implementation: 1/sqrt(numTerms)
> Implication: a term matched in a field with fewer terms gets a higher score
> Rationale: a term in a field with fewer terms is more important than one in
> a field with more
>
>
> I think the numTerms mentioned here means the number of terms in a field,
> per document. But the Lucene file format page doesn't mention it.
>
> http://lucene.apache.org/core/3_6_2/fileformats.html
>
> Does numTerms really exist in the Lucene index? If yes, how can I get it?
>
>
> Regards
>
>
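
For readers following the thread, here is a minimal sketch of the lengthNorm formula quoted above, 1/sqrt(numTerms). The class and method names are hypothetical and this is not Lucene's own code; it only shows how the factor shrinks as a field gets longer.

```java
// Illustrative only: LengthNormSketch and lengthNorm are hypothetical names,
// not part of the Lucene API.
public class LengthNormSketch {

    // The formula quoted from the tutorial: 1 / sqrt(numTerms),
    // where numTerms is the number of terms in one field of one document.
    static double lengthNorm(int numTerms) {
        if (numTerms <= 0) {
            throw new IllegalArgumentException("numTerms must be positive");
        }
        return 1.0 / Math.sqrt(numTerms);
    }

    public static void main(String[] args) {
        // Shorter fields get a larger factor, so a match in them scores higher.
        for (int n : new int[] {1, 4, 100}) {
            System.out.println(n + " terms -> " + lengthNorm(n));
        }
    }
}
```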

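Jack's point about the single-byte norm can also be sketched. The code below packs a float into one byte using 3 mantissa bits and a 5-bit exponent, and unpacks it again. It is modeled on my understanding of the encoding behind Lucene's default similarity (SmallFloat.floatToByte315); treat that correspondence as an assumption and check the Lucene source before relying on it. The class and method names are hypothetical. Because the packing is lossy, the exact numTerms generally cannot be recovered from the stored norm.

```java
// Illustrative only: NormByteSketch, encode and decode are hypothetical names.
// The bit layout (3 mantissa bits, exponent zero point 15) is assumed to match
// Lucene's SmallFloat "315" encoding.
public class NormByteSketch {

    private static final int ZERO_POINT = (63 - 15) << 3;

    // Pack a float into one byte, keeping only the top 3 mantissa bits.
    static byte encode(float f) {
        int bits = Float.floatToRawIntBits(f);
        int small = bits >> (24 - 3);
        if (small <= ZERO_POINT) {
            // Zero and negatives map to 0; tiny positives to the smallest code.
            return (bits <= 0) ? (byte) 0 : (byte) 1;
        }
        if (small >= ZERO_POINT + 0x100) {
            return (byte) -1; // overflow: clamp to the largest code
        }
        return (byte) (small - ZERO_POINT);
    }

    // Unpack the byte back into a float; precision lost in encode stays lost.
    static float decode(byte b) {
        if (b == 0) {
            return 0.0f;
        }
        int bits = (b & 0xff) << (24 - 3);
        bits += (63 - 15) << 24;
        return Float.intBitsToFloat(bits);
    }

    public static void main(String[] args) {
        for (int numTerms : new int[] {1, 4, 10, 11}) {
            float norm = (float) (1.0 / Math.sqrt(numTerms));
            float stored = decode(encode(norm));
            System.out.println(numTerms + " terms: norm=" + norm + " stored=" + stored);
        }
    }
}
```

Running it shows that nearby field lengths can collapse to the same stored value, which is why the norm is only a coarse proxy for field length.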