Regarding "Negative or zero value for fieldNorm", I don't see any
negative fieldNorms here... just very small positive ones?
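
To make that concrete, here is a rough Python sketch (not Solr code; the variable names are only for illustration) that parses the numbers from the quoted explain output for doc 1462 and recomputes the score. It assumes the usual explain structure, where the term weight is queryWeight times fieldWeight. The "E-4" and "E-5" suffixes are scientific notation for small positive values, not minus signs on the values themselves.

```python
# Values copied from the quoted debugQuery output for doc 1462.
idf = float("6.6516633")
query_norm = float("0.113743275")
tf = float("1.0")
field_norm = float("6.1035156E-5")  # "E-5" means "times 10^-5": tiny, but positive

query_weight = idf * query_norm        # explain line: queryWeight
field_weight = tf * idf * field_norm   # explain line: fieldWeight
score = query_weight * field_weight    # explain line: weight(content:kunstgrasveld)

print(f"fieldNorm = {field_norm:.10f}")
print(f"score     = {score:.10f}")
print("score > 0:", score > 0)

# With sort=score DESC, a small positive score ranks above a score of 0.0.
scores = {"doc 7": 0.0, "doc 1462": score}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

Seen this way, the ordering in the result set is what score DESC should produce: the documents with very small positive scores come before the ones whose score is exactly zero.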

Anyway, the fieldNorm is the product of the lengthNorm and the
index-time boost of the field (which is itself the product of the
index-time boost on the document and the index-time boosts of all
instances of that field).  Index-time boosts default to 1, though, so
they have no effect unless something has explicitly set a boost.
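
A minimal sketch of that product in Python, assuming the lengthNorm is 1/sqrt(numTerms) as in the formula quoted below. The function and parameter names are made up for illustration and are not Lucene or Solr APIs. The sketch also ignores the lossy single-byte encoding Lucene applies when it stores norms, so real values are only approximately what this computes.

```python
import math


def length_norm(num_terms: int) -> float:
    """1 / sqrt(number of terms in the field), per the formula quoted below."""
    return 1.0 / math.sqrt(num_terms)


def field_norm(num_terms, doc_boost=1.0, field_boosts=(1.0,)):
    """lengthNorm * document boost * boost of every instance of the field."""
    boost = doc_boost * math.prod(field_boosts)
    return length_norm(num_terms) * boost


# Default boosts of 1: the fieldNorm is just the length norm.
print(field_norm(100))

# Hypothetical case: a single field instance indexed with boost 0
# zeroes the whole product, whatever the field length is.
print(field_norm(100, field_boosts=(1.0, 0.0)))

# Hypothetical case: a very small document boost gives a very small,
# but still positive, fieldNorm.
print(field_norm(100, doc_boost=0.001))
```

So a fieldNorm of exactly 0.0, or one as small as the 6.1035156E-5 above, points at the boosts coming in from the indexing side rather than at the field length alone.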

-Yonik
http://www.lucidimagination.com



On Wed, Nov 3, 2010 at 2:30 PM, Markus Jelsma
<[email protected]> wrote:
> Hi all,
>
> I've got some puzzling issue here. During tests i noticed a document at the
> bottom of the results where it should not be. I query using DisMax on title
> and content field and have a boost on title using qf. Out of 30 results, only
> two documents also have the term in the title.
>
> Using debugQuery and fl=*,score i quickly noticed large negative maxScore of
> the complete resultset and a portion of the resultset where scores sum up to
> zero because of a product with 0 (fieldNorm).
>
> See below for debug output for a result with score = 0:
>
> 0.0 = (MATCH) sum of:
>  0.0 = (MATCH) max of:
>    0.0 = (MATCH) weight(content:kunstgrasveld in 7), product of:
>      0.75658196 = queryWeight(content:kunstgrasveld), product of:
>        6.6516633 = idf(docFreq=33, maxDocs=9682)
>        0.113743275 = queryNorm
>      0.0 = (MATCH) fieldWeight(content:kunstgrasveld in 7), product of:
>        2.236068 = tf(termFreq(content:kunstgrasveld)=5)
>        6.6516633 = idf(docFreq=33, maxDocs=9682)
>        0.0 = fieldNorm(field=content, doc=7)
>    0.0 = (MATCH) fieldWeight(title:kunstgrasveld in 7), product of:
>      1.0 = tf(termFreq(title:kunstgrasveld)=1)
>      8.791729 = idf(docFreq=3, maxDocs=9682)
>      0.0 = fieldNorm(field=title, doc=7)
>
> And one with a negative score:
>
> 3.0716116E-4 = (MATCH) sum of:
>  3.0716116E-4 = (MATCH) max of:
>    3.0716116E-4 = (MATCH) weight(content:kunstgrasveld in 1462), product of:
>      0.75658196 = queryWeight(content:kunstgrasveld), product of:
>        6.6516633 = idf(docFreq=33, maxDocs=9682)
>        0.113743275 = queryNorm
>      4.059853E-4 = (MATCH) fieldWeight(content:kunstgrasveld in 1462), product of:
>        1.0 = tf(termFreq(content:kunstgrasveld)=1)
>        6.6516633 = idf(docFreq=33, maxDocs=9682)
>        6.1035156E-5 = fieldNorm(field=content, doc=1462)
>
> There are no funky issues with term analysis for the text fieldType, in fact,
> the term passes through unchanged. I don't do omitNorms, i store termVectors
> etc.
>
> Because fieldNorm = fieldBoost / sqrt(numTermsForField) i suspect my input
> from Nutch is messed up. A fieldNorm can never be <= 0 for a normal positive
> boost and field boosts should not be zero or negative (correct me if i'm
> wrong). But, since i can't yet figure out what field boosts Nutch sends to me
> i thought i'd drop by on this mailing list first.
>
> There are quite a few query terms that return with zero or negative scores and
> many that behave as i expect. I find it also a bit hard to comprehend why the
> docs with negative score rank higher in the result set than documents with
> zero score. Sorting defaults to score DESC,  but this is perhaps another
> issue.
>
> Anyway, the test runs on a Solr 1.4.1 instance with Java 6 under the hood.
> Help or directions are appreciated =)
>
> Cheers,
>
> --
> Markus Jelsma - CTO - Openindex
> http://www.linkedin.com/in/markus17
> 050-8536600 / 06-50258350
>
