Hello, I'm having a hard time implementing / understanding a very simple custom scoring situation.
I have created my Similarity class for testing which overrides all the relevant (I think) methods below, returning 1 for all but coord(int, int) which returns q / maxOverlap so scores are scaled between 0. and 1.. I call writer.setSimilarity(new HashHitSimilarity()) when indexing and searcher.setSimilarity(new HashHitSimilarity()) when searching. The similarity is definitely affecting the scoring but not how I expect. I am looking for a straight average of the hits calculated, i.e. totalHits for a doc / totalHits in search. The above score with my test search and index of 6 docs should return the scores below for all 6 documents in my index: 0.8387096774193549 0.3548387096774194 0.3548387096774194 0.25806451612903225 0.1935483870967742 0.12903225806451613 but the scores appear "stretched" and return these instead though I'm unsure as to where this "stretching" happens: 0.9078212 0.75977653 0.57541895 0.5670391 0.5223464 0.37150836 public class HashHitSimilarity extends Similarity { /** * */ private static final long serialVersionUID = 811419737205284733L; public float tf(float freq) { return 1f; } public float lengthNorm(String fieldName, int numTokens) { return 1f; } public float queryNorm(float sumOfSquaredWeights) { return 1f; } @Override public float coord(int overlap, int maxOverlap) { return 1f / (float) maxOverlap; } @Override public float idf(int docFreq, int numDocs) { return 1f; } @Override public float sloppyFreq(int distance) { return 0f; } } -- TH!NKMAP Christopher Tignor | Senior Software Architect 155 Spring Street NY, NY 10012 p.212-285-8600 x385 f.212-285-8999