On Sat, Oct 8, 2011 at 3:37 AM, Joel Halbert <j...@su3analytics.com> wrote:
> Hi,
>
> Does anyone have a modified scoring (Similarity) function they would
> care to share?
>
> I'm searching web page documents and find the default Similarity seems
> to assign too much weight to documents with frequent occurrence of a
> single term from the query and not enough weight to documents that
> contain a greater overlap of the search query terms.
>
> I've been playing around with overriding the default but wondering if
> anyone has an implementation they have found to work well that they
> would care to share.
>

Have a look at coord(). You might want to further penalize documents that
don't contain all the query terms. In a subclass of DefaultSimilarity,
something like:

    @Override
    public float coord(int overlap, int maxOverlap) {
      // Full credit only when every query term matches;
      // otherwise halve the default coord factor.
      return (overlap == maxOverlap) ? 1f : 0.5f * super.coord(overlap, maxOverlap);
    }

--
lucidimagination.com
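
To see what that override does to the numbers, here is a minimal standalone sketch that does not depend on Lucene. It assumes the default coord factor is overlap / maxOverlap, which is what DefaultSimilarity returns in Lucene 3.x; the class and method names (CoordSketch, defaultCoord, penalizedCoord) are made up for illustration and are not part of any Lucene API.

```java
public class CoordSketch {

    // Assumed default behaviour: fraction of query terms that matched.
    static float defaultCoord(int overlap, int maxOverlap) {
        return overlap / (float) maxOverlap;
    }

    // Same rule as the override above: full credit for a complete match,
    // half of the default factor for any partial match.
    static float penalizedCoord(int overlap, int maxOverlap) {
        return (overlap == maxOverlap) ? 1f : 0.5f * defaultCoord(overlap, maxOverlap);
    }

    public static void main(String[] args) {
        int maxOverlap = 4; // hypothetical four-term query
        for (int overlap = 1; overlap <= maxOverlap; overlap++) {
            System.out.println(overlap + "/" + maxOverlap
                    + " terms matched: default=" + defaultCoord(overlap, maxOverlap)
                    + " penalized=" + penalizedCoord(overlap, maxOverlap));
        }
    }
}
```

For a four-term query, a document matching three terms drops from a factor of 0.75 to 0.375, while a document matching all four keeps 1.0, so the gap between full and partial matches widens. The 0.5 multiplier is just a starting point and would need tuning against your own data.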