With respect to the earlier post, there seems to be a bug in Lucene 1.9.1.

I tried using the similarity below and changed idf to:
  public float idf(int docFreq, int numDocs) {
    float f = (float)(Math.log((double)numDocs/(double)(docFreq+1) + 1.0));
    return f;
  }

Now, when I print the explanation for the top doc id, it includes every term in the query twice with a raw score of 11.50651, even though some terms don't appear in any docs. And the max raw score of the top doc is only 4.12327.
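
One thing that may be worth checking for the terms that match no documents: with the idf above, a docFreq of 0 does not give 0, it gives log(numDocs + 1), which might account for nonzero values showing up for those terms. Below is a minimal standalone sketch of just that formula, with no Lucene dependency. The class name IdfSketch and the sample numbers are hypothetical and chosen only for illustration.

```java
// Standalone sketch of the modified idf formula; does not depend on Lucene.
// IdfSketch and the sample values below are hypothetical.
class IdfSketch {

  // Same expression as the idf override above.
  static float idf(int docFreq, int numDocs) {
    return (float) Math.log((double) numDocs / (double) (docFreq + 1) + 1.0);
  }

  public static void main(String[] args) {
    int numDocs = 1000; // assumed index size, for illustration only
    int[] docFreqs = {0, 1, 10, 100, 1000};
    for (int docFreq : docFreqs) {
      System.out.println("docFreq=" + docFreq + " idf=" + idf(docFreq, numDocs));
    }
  }
}
```

For docFreq = 0 the value is log(numDocs + 1), the largest the formula can produce for a given index size, and it decreases as docFreq grows.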

Has anyone encountered this before?

Thanks

Eugene wrote:
Hi,

I tried implementing my own Similarity and setting it in IndexWriter.setSimilarity(new CosSimilarity()).

But there's something weird: it doesn't seem to call the methods in my Similarity. For example, when I set idf to return 0.0f, the Similarity still gives me a score > 0.0f.

How do I correctly set the Similarity? I'm quite new to this, so some links on implementing Similarity would also be useful.

Thanks.

--
Eugene

Here's the code for my CosSimilarity:

import org.apache.lucene.search.Similarity;

public class CosSimilarity extends Similarity
{
  public float lengthNorm(String fieldName, int numTerms) {
    return 1.0f;
  }

  public float queryNorm(float sumOfSquaredWeights) {
    return (float)(1.0 / Math.sqrt(sumOfSquaredWeights));
  }

  public float tf(float freq) {
    return (float)(1 + Math.log(1 + freq));
  }

  public float sloppyFreq(int distance) {
    return 1.0f / (distance + 1);
  }

  public float idf(int docFreq, int numDocs) {
    float f = (float)(Math.log((double)numDocs/(double)(docFreq+1) + 1.0));
    System.out.println("CosSimilarity.idf>" + f);
    return 0.0f;
  }

  public float coord(int overlap, int maxOverlap) {
    return overlap / (float)maxOverlap;
  }

}

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


