On Wed, Feb 17, 2010 at 10:31:19AM -0500, Robert Muir wrote:
> yet if we don't do the hard work up front to make it easy to plug in things
> like BM25, then no one will implement additional scoring formulas for
> Lucene, we currently make it terribly difficult to do this.

FWIW... Similarity and posting format spec are so closely tied that I'm
considering linking them in Lucy.

    Schema schema = new Schema();

    FullTextType bm25Type = new FullTextType(new BM25Similarity());
    schema.specField("content", bm25Type);
    schema.specField("title", bm25Type);

    StringType matchType = new StringType(new MatchSimilarity());
    schema.specField("category", matchType);

That way, custom scoring implementations can guarantee that they always have
the posting information they need available to make their similarity
judgments.  Similarity also becomes a more generalized notion, with the
TF/IDF-specific functionality moving into a subclass.

Maybe something similar could be made to work in Lucene.  Dunno how McCandless
has things set up for spec'ing codecs on the flex branch.

Marvin Humphrey
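
A minimal sketch of the idea above, in plain Java, to make the coupling
concrete: each Similarity declares which posting data it needs, and the schema
can report that per field so a writer could choose a matching posting format.
Every name here (PostingStat, FieldType, requiredStats, the score signature)
is hypothetical and is not the Lucy or Lucene API. The TF/IDF weight is a
simplified stand-in; the BM25 term score uses the usual Okapi form with
assumed defaults k1 = 1.2 and b = 0.75.

```java
import java.util.EnumSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: none of these classes are the real Lucy or Lucene API.
final class SchemaSketch {
    public static void main(String[] args) {
        Schema schema = new Schema();
        FieldType bm25Type = new FieldType(new BM25Similarity(1.2, 0.75));
        schema.specField("content", bm25Type);
        schema.specField("title", bm25Type);
        schema.specField("category", new FieldType(new MatchSimilarity()));

        // An index writer could consult this to pick a posting format per field.
        for (String field : schema.fieldNames()) {
            System.out.println(field + " needs " + schema.requiredStats(field));
        }
    }
}

/** Per-posting data that a posting format may or may not store. */
enum PostingStat { DOC_ID, TERM_FREQ, FIELD_LENGTH }

/** Generalized similarity: declares the posting data it needs, then scores. */
abstract class Similarity {
    abstract Set<PostingStat> requiredStats();
    abstract double score(int termFreq, int fieldLength, double avgFieldLength, double idf);
}

/** Match-only scoring: every hit scores the same, so doc ids are enough. */
final class MatchSimilarity extends Similarity {
    @Override Set<PostingStat> requiredStats() { return EnumSet.of(PostingStat.DOC_ID); }
    @Override double score(int termFreq, int fieldLength, double avgFieldLength, double idf) {
        return 1.0;
    }
}

/** TF/IDF-specific logic lives in a subclass (simplified weight, for illustration). */
final class TfIdfSimilarity extends Similarity {
    @Override Set<PostingStat> requiredStats() { return EnumSet.allOf(PostingStat.class); }
    @Override double score(int termFreq, int fieldLength, double avgFieldLength, double idf) {
        if (termFreq == 0) return 0.0;
        return Math.sqrt(termFreq) * idf / Math.sqrt(fieldLength);
    }
}

/** Okapi BM25 term score; needs term frequency and field length from the postings. */
final class BM25Similarity extends Similarity {
    private final double k1;
    private final double b;
    BM25Similarity(double k1, double b) { this.k1 = k1; this.b = b; }
    @Override Set<PostingStat> requiredStats() { return EnumSet.allOf(PostingStat.class); }
    @Override double score(int termFreq, int fieldLength, double avgFieldLength, double idf) {
        double lengthNorm = 1.0 - b + b * (fieldLength / avgFieldLength);
        return idf * (termFreq * (k1 + 1.0)) / (termFreq + k1 * lengthNorm);
    }
}

/** A field type carries its Similarity, tying scoring to the field definition. */
final class FieldType {
    final Similarity similarity;
    FieldType(Similarity similarity) { this.similarity = similarity; }
}

/** Maps field names to types and exposes what each field's postings must store. */
final class Schema {
    private final Map<String, FieldType> fields = new LinkedHashMap<>();
    void specField(String name, FieldType type) { fields.put(name, type); }
    Set<String> fieldNames() { return fields.keySet(); }
    Set<PostingStat> requiredStats(String name) {
        FieldType type = fields.get(name);
        if (type == null) throw new IllegalArgumentException("unknown field: " + name);
        return type.similarity.requiredStats();
    }
}
```

With this shape, a match-only field would never have to pay for term
frequencies or length norms, while a BM25 field could not be indexed without
them.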