Hi again,

As the subject suggests, I'm trying to implement a layer of proximity weighting on top of Lucene. This has greatly improved search relevance, but it has also reduced performance substantially (see the benchmarks in the footer).

I am using a hand-rolled query of the following form (implemented with SpanNearQuery, not a sloppy PhraseQuery):

    a b c => +(a AND b AND c) OR "a b"~5 OR "b c"~5

The obvious solution, "a b c"~5, does not work for my case, because I would like to allow for the possibility that a and b are near each other in one field while c is in another field.

So, is there something I'm missing that would make this performant? Would a reordering or query-rewriting solution help? If there is no solution in existing Lucene, would anyone be interested in investigating options with me?

-Kyle

Somewhat arbitrary benchmarks.
--------------
Before:
$ ./bench.rb "paris hilton"
  0.022000   0.000000   0.022000 (  0.021000)
$ ./bench.rb "paris hilton goes to jail"
  0.024000   0.000000   0.024000 (  0.024000)

After:
$ ./bench.rb "paris hilton"
  0.103000   0.000000   0.103000 (  0.103000)
$ ./bench.rb "paris hilton goes to jail"
  1.514000   0.000000   1.514000 (  1.513000)
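
For reference, the expansion described above (a b c => +(a AND b AND c) OR "a b"~5 OR "b c"~5) can be sketched roughly as follows. This is only an illustration in Python that produces the query-syntax string. The real implementation builds SpanNearQuery objects instead. The function name expand_proximity_query and its slop parameter are hypothetical, and the sketch assumes one proximity clause per adjacent pair of terms.

```python
def expand_proximity_query(terms, slop=5):
    """Return a query string of the form described in the post.

    Hypothetical helper: it only formats a string and does not call Lucene.
    """
    if not terms:
        raise ValueError("need at least one term")
    # Mandatory clause: every term must match somewhere in the document.
    required = "+(" + " AND ".join(terms) + ")"
    # Optional clauses: one sloppy pair for each adjacent pair of terms.
    pairs = ['"%s %s"~%d' % (a, b, slop) for a, b in zip(terms, terms[1:])]
    return " OR ".join([required] + pairs)


if __name__ == "__main__":
    print(expand_proximity_query(["a", "b", "c"]))
    print(expand_proximity_query("paris hilton goes to jail".split()))
```

Under that assumption an n-term query carries n-1 proximity clauses, so the five-word benchmark query carries four of them on top of the mandatory conjunction, while the two-word query carries one.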