Mike Baranczak wrote:

I'm building a search engine that searches multiple document fields by default. Given a query string like "Bruce Lee", I would expect the results list to first show the documents containing both "Bruce" and "Lee", and then the documents which only contain one of those names. Most of the time, this is indeed what happens, but I've noticed that in certain circumstances, Lucene doesn't rank the results in the expected order. Specifically, it happens when I enter a query containing multiple words, searching multiple fields, AND try to put the results of that through a filter.

Code example is below. Is this a bug, or am I doing something wrong?

I suspect the filter matters only in that it changes the set of results being ranked. MultiFieldQueryParser does not in general provide the ranking you are looking for, especially for default OR queries: a document that matches one term in several fields can outscore a document that matches both terms. I found this to be a problem as well and created alternative classes, DistributedMultiFieldQueryParser and MaxDisjunctionQuery, which are available here: http://issues.apache.org/bugzilla/show_bug.cgi?id=32674


You might check these out and see if they provide the ranking you are looking for (I think they will). They were written for 1.4.3; they should work in the trunk (1.9) but will produce deprecation warnings. I'm using them now in 1.9 and have newer, clean versions which I'd be happy to post if anybody else is using them (although the versions I use still require Java 1.5).
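
To illustrate the ranking difference, here is a toy sketch in plain Java. It is not the code from the patch and it does not use Lucene at all: the class name, the field weights, and the two sample documents are made up for the example, and real Lucene scoring also involves tf, idf, length norms, and coord. It only contrasts summing a term's score over every field it matches with taking the maximum over fields per term.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model only: hypothetical names and weights, no Lucene classes involved.
public class FieldScoringSketch {

    // Assumed per-field weight: a title match counts more than a body match.
    static double weight(String field) {
        return field.equals("title") ? 1.5 : 1.0;
    }

    // Build a two-field "document" as a map from field name to its set of lowercased words.
    static Map<String, Set<String>> doc(String title, String body) {
        Map<String, Set<String>> fields = new LinkedHashMap<>();
        fields.put("title", new HashSet<>(Arrays.asList(title.toLowerCase().split("\\s+"))));
        fields.put("body", new HashSet<>(Arrays.asList(body.toLowerCase().split("\\s+"))));
        return fields;
    }

    // Every (term, field) match adds to the score, so one term repeated
    // across fields can pile up as much score as two different terms.
    static double sumOverFields(Map<String, Set<String>> doc, List<String> terms) {
        double score = 0.0;
        for (String term : terms) {
            for (Map.Entry<String, Set<String>> field : doc.entrySet()) {
                if (field.getValue().contains(term)) {
                    score += weight(field.getKey());
                }
            }
        }
        return score;
    }

    // Each term contributes only its best-scoring field, so matching
    // more distinct terms is what raises the score.
    static double maxOverFields(Map<String, Set<String>> doc, List<String> terms) {
        double score = 0.0;
        for (String term : terms) {
            double best = 0.0;
            for (Map.Entry<String, Set<String>> field : doc.entrySet()) {
                if (field.getValue().contains(term)) {
                    best = Math.max(best, weight(field.getKey()));
                }
            }
            score += best;
        }
        return score;
    }

    public static void main(String[] args) {
        List<String> query = Arrays.asList("bruce", "lee");
        // A matches only "bruce", but in both fields; B matches both terms in the body.
        Map<String, Set<String>> a = doc("Bruce", "Bruce was an actor");
        Map<String, Set<String>> b = doc("Martial arts", "Bruce Lee was an actor");

        System.out.println("sum: A=" + sumOverFields(a, query) + " B=" + sumOverFields(b, query));
        System.out.println("max: A=" + maxOverFields(a, query) + " B=" + maxOverFields(b, query));
    }
}
```

With these made-up weights, the summed score puts document A (2.5) ahead of document B (2.0) even though only B contains both names, while the per-term maximum puts B (2.0) ahead of A (1.5).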

Chuck

