Relevancy debugging - idf score

Sjoerd Smeets Sun, 05 Dec 2021 08:28:46 -0800

Hi all,

I'm debugging the relevancy scores of my query and I see the following for
two document hits. My question is: why is the idf for the same term and
field not the same for both documents? This is Solr 6.6.


Any guidance would be much appreciated.

Thanks!

*Doc1*
"71d72354eea23b9eae934ab616e8ce38de69d760": "
104.994415 = sum of:
  104.994415 = sum of:
    82.89969 = weight(stemmed_data.timenote.narratives:remedi in 22470) [SchemaSimilarity], result of:
      82.89969 = score(freq=9.0), computed as boost * idf * tf from:
        100.0 = boost
        0.87546873 = idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:
          *52 = n, number of documents containing term*
          *125 = N, total number of documents with field*
        0.9469177 = tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:
          9.0 = freq, occurrences of term within document
          1.2 = k1, term saturation parameter
          0.75 = b, length normalization parameter
          12312.0 = dl, length of field (approximate)
          54179.03 = avgdl, average length of field
    22.09473 = weight(stemmed_data.timenote.matters:remedi in 22470) [SchemaSimilarity], result of:
      22.09473 = score(freq=4.0), computed as boost * idf * tf from:
        10.0 = boost
        2.4308395 = idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:
          *9 = n, number of documents containing term*
          *107 = N, total number of documents with field*
        0.9089341 = tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:
          4.0 = freq, occurrences of term within document
          1.2 = k1, term saturation parameter
          0.75 = b, length normalization parameter
          5656.0 = dl, length of field (approximate)
          50520.543 = avgdl, average length of field
  0.0 = FunctionQuery(int(s_integer_search.previews)), product of:
    0.0 = int(s_integer_search.previews)=0
    1.0 = boost
  0.0 = FunctionQuery(int(s_integer_search.downloads)), product of:
    0.0 = int(s_integer_search.downloads)=0
    1.0 = boost
"
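As a sanity check on the numbers above, here is a minimal sketch that recomputes the tf factor and the clause score for the narratives field of Doc1. It assumes only the tf formula printed in the explain output; the function name `bm25_tf` is made up for illustration and is not a Solr or Lucene API, and the idf value is taken as given from the output rather than recomputed.

```python
def bm25_tf(freq, dl, avgdl, k1=1.2, b=0.75):
    """tf factor as printed in the explain output:
    freq / (freq + k1 * (1 - b + b * dl / avgdl)).
    Hypothetical helper, not part of Solr or Lucene."""
    return freq / (freq + k1 * (1 - b + b * dl / avgdl))


# Values copied from the Doc1 explain output,
# field stemmed_data.timenote.narratives.
boost = 100.0
idf = 0.87546873  # taken as reported, not recomputed here
tf = bm25_tf(freq=9.0, dl=12312.0, avgdl=54179.03)

# The explain output states: score = boost * idf * tf
score = boost * idf * tf

print(f"tf    = {tf:.7f}")
print(f"score = {score:.5f}")
```

Both values should land very close to the reported 0.9469177 and 82.89969; small differences in the last digits are expected because the explain output is printed from single-precision floats.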

*Doc2*
"80302a1ecc44d1e556970ab96c25b1fd3328a854": "
84.61461 = sum of:
  84.61461 = sum of:
    64.68881 = weight(stemmed_data.timenote.narratives:remedi in 0) [SchemaSimilarity], result of:
      64.68881 = score(freq=493.0), computed as boost * idf * tf from:
        100.0 = boost
        0.65094686 = idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:
          *60 = n, number of documents containing term*
          *115 = N, total number of documents with field*
        0.99376476 = tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:
          493.0 = freq, occurrences of term within document
          1.2 = k1, term saturation parameter
          0.75 = b, length normalization parameter
          229400.0 = dl, length of field (approximate)
          73913.91 = avgdl, average length of field
    19.9258 = weight(stemmed_data.timenote.matters:remedi in 0) [SchemaSimilarity], result of:
      19.9258 = score(freq=340.0), computed as boost * idf * tf from:
        10.0 = boost
        2.0024805 = idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:
          *13 = n, number of documents containing term*
          *99 = N, total number of documents with field*
        0.99505585 = tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:
          340.0 = freq, occurrences of term within document
          1.2 = k1, term saturation parameter
          0.75 = b, length normalization parameter
          147480.0 = dl, length of field (approximate)
          95534.95 = avgdl, average length of field
  0.0 = FunctionQuery(int(s_integer_search.previews)), product of:
    0.0 = int(s_integer_search.previews)=0
    1.0 = boost
  0.0 = FunctionQuery(int(s_integer_search.downloads)), product of:
    0.0 = int(s_integer_search.downloads)=0
    1.0 = boost
"
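To make the idf discrepancy concrete, the sketch below plugs the four highlighted (n, N) pairs into the idf formula from the explain output. It assumes the `log` in that formula is the natural logarithm; `bm25_idf` and the labels are hypothetical names used only for this illustration. It does not say why n and N differ between the two hits, only that the differing idf values follow directly from them.

```python
import math


def bm25_idf(n, N):
    """idf as printed in the explain output:
    log(1 + (N - n + 0.5) / (n + 0.5)), natural log assumed.
    Hypothetical helper, not part of Solr or Lucene."""
    return math.log(1 + (N - n + 0.5) / (n + 0.5))


# (label, n, N, idf reported in the explain output)
cases = [
    ("Doc1 narratives", 52, 125, 0.87546873),
    ("Doc1 matters", 9, 107, 2.4308395),
    ("Doc2 narratives", 60, 115, 0.65094686),
    ("Doc2 matters", 13, 99, 2.0024805),
]

for label, n, N, reported in cases:
    computed = bm25_idf(n, N)
    print(f"{label:16s} n={n:3d} N={N:3d} "
          f"computed={computed:.7f} reported={reported}")
```

Each computed value should agree with the reported one to about six decimal places, so the formula itself is applied consistently; the two hits simply report different n and N for the same term and field.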
