"Karl Koch" <[EMAIL PROTECTED]> wrote:
> For the documents, Lucene employs
> its norm_d_t, which is explained as:
>
> norm_d_t : square root of number of tokens in d in the same field as t
Actually (by default) it is:
1 / sqrt(#tokens in d with same field as t)
> basically just the square root
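
To make the corrected factor concrete, here is a small, self-contained Java
sketch (the class and method names are invented for illustration; this is not
the actual Lucene 1.2 source) of the default length normalisation:

  // Illustrative only: norm_d_t = 1 / sqrt(number of tokens in d in the same field as t)
  public class LengthNormSketch {

      static float lengthNorm(int numTokensInField) {
          return (float) (1.0 / Math.sqrt(numTokensInField));
      }

      public static void main(String[] args) {
          // A 100-token field gets norm 0.1; longer fields are damped more strongly.
          System.out.println(lengthNorm(100));   // 0.1
          System.out.println(lengthNorm(10000)); // 0.01
      }
  }

So longer fields contribute less per occurrence of t, which is the point of
the normalisation.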
Hello Karl,
I’m very interested in the details of Lucene’s scoring as well.
Karl Koch wrote:
For this reason, I do not understand why Lucene (in version 1.2) normalises the query(!) with
norm_q : sqrt(sum_t((tf_q*idf_t)^2))
which is also called cosine normalisation. This is a technique that
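
For what it's worth, here is a small Java sketch of that query normalisation
(again, the names are invented for illustration and are not Lucene's API). It
simply takes the Euclidean length of the query's weight vector, which is why
it is called cosine normalisation:

  // Illustrative only: norm_q = sqrt(sum_t((tf_q * idf_t)^2))
  public class QueryNormSketch {

      // termWeights[i] = tf_q * idf_t for the i-th query term
      static float queryNorm(float[] termWeights) {
          double sumOfSquares = 0.0;
          for (float w : termWeights) {
              sumOfSquares += w * w;
          }
          return (float) Math.sqrt(sumOfSquares);
      }

      public static void main(String[] args) {
          // Two query terms with weights 0.6 and 0.8 give norm 1.0, so dividing
          // each weight by norm_q yields a unit-length query vector.
          System.out.println(queryNorm(new float[] {0.6f, 0.8f})); // 1.0
      }
  }

Note that norm_q is the same for every document, so it does not change the
ranking for a given query; it only rescales the scores.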
Subject: Re: Re: Questions about Lucene scoring (was: Lucene 1.2 - scoring formula needed)
> > - I think the explanations there would also answer at least some of
> > your questions.
> Well it doesn't, since there is no justification of why it is the way
> it is. It's like saying: here is a car with 5 wheels... enjoy driving.
I hoped it would answer *some* of the questions... (not all of them).
Subject: Re: Questions about Lucene scoring (was: Lucene 1.2 - scoring formula needed)
> [EMAIL PROTECTED] wrote:
> > According to these sources, the Lucene scoring formula in version 1.2 is:
> >
> > score(q,d) = sum_t(tf_q * idf_t / norm_q * tf_d * idf_t / norm_d_t * boost_t)
> >              * coord_q_d
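
Read term by term, the formula multiplies a query-side weight (tf_q * idf_t /
norm_q) by a document-side weight (tf_d * idf_t / norm_d_t) for each matching
term, sums those products, and scales the sum by the coordination factor. A
minimal Java sketch of that reading (invented names; not Lucene's actual code):

  // Illustrative only: one reading of the 1.2 formula quoted above.
  public class ScoreSketch {

      // Per-term contribution: (tf_q * idf_t / norm_q) * (tf_d * idf_t / norm_d_t) * boost_t
      static float termScore(float tfQ, float tfD, float idf,
                             float normQ, float normD, float boost) {
          float queryWeight = tfQ * idf / normQ; // query-side weight of term t
          float docWeight   = tfD * idf / normD; // document-side weight of term t
          return queryWeight * docWeight * boost;
      }

      // score(q,d) = sum over matching terms, scaled by coord_q_d
      // (the fraction of query terms that occur in d).
      static float score(float[] termScores, float coord) {
          float sum = 0f;
          for (float s : termScores) {
              sum += s;
          }
          return sum * coord;
      }

      public static void main(String[] args) {
          // Single-term example: tf_q=1, tf_d=3, idf=2, norm_q=2, norm_d_t=1, boost=1,
          // and coord = 1/1 because the only query term matches.
          float s = termScore(1f, 3f, 2f, 2f, 1f, 1f);
          System.out.println(score(new float[] {s}, 1f)); // 6.0
      }
  }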