Re: idf in scores

Yonik Seeley Tue, 07 Nov 2006 07:14:35 -0800

On 11/7/06, Antony Bowesman <[EMAIL PROTECTED]> wrote:

I've been trying to understand how idf is arrived at from a query.  I have a
single Document with 9 fields.  One field "subject" has the phrase "RFC2822 -
Internet Message Format" and a second "body" has the contents of rfc2822.


The other fields contain additional meta data.  If I search for subject:message
I get the following explanation.

0.15342641 = fieldWeight(subject:message in 0), product of:
   1.0 = tf(termFreq(subject:message)=1)
   0.30685282 = idf(docFreq=1)
   0.5 = fieldNorm(field=subject, doc=0)

why does the idf get that value?


idf is dependent only on the corpus, not on the individual document.
The formula is here:
http://lucene.apache.org/java/docs/api/org/apache/lucene/search/Similarity.html
1+log(1/2) = 0.30685282

-Yonik
http://incubator.apache.org/solr Solr, the open-source Lucene search server

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Re: idf in scores

Reply via email to