On Friday 13 October 2006 01:55, Mark Miller wrote:
> There is also the Surround Query Parser in contrib by the way...I would bet
> that Paul will tell you that it does not have these issues. I can't wait to

Indeed.

> see the replies on this one...I didn't realize that the QueryParser had
> these problems and am a bit skeptical...unfortunately I am away from home
> and cannot check it out.

Surround is not perfect, either. One of the disadvantages of surround
is that it does not map to PhraseQuery.

> On another note...http://famestalker.com/devwik/ will be done soon...I only

That URL gives a 404 (not found) error here.

> have not gotten around to finishing the final touches because there did not
> appear to be a lot of initial interest (and what there was has waned
> drastically) and I am not ready to use it myself yet. It does correctly
> handle order of operations however, and as far as I know is the only parser
> to handle arbitrary nesting and mixing of boolean and proximity queries.
> (perhaps surround does as well...I would be really interested to know, but I
> assume that it handles only the base cases ie not "(car & basket) within 2
> of (horse & carriage within 3 of car). Of course who really cares about such
> queries, but hey ;)

Surround maps proximity queries to SpanNearQuery, and that only allows
OR'ing in its operands. Surround does not map to SpanNotQuery, but
it will parse nested proximity queries and generate a nested SpanNearQuery.

> You'll get better advice from others more experienced, but my bet is that
> Paul's surround parser is top notch and correctly does what you want.
>
> - Mark
>
> On 10/12/06, Renaud Waldura <[EMAIL PROTECTED]> wrote:
> >
> > I'm developing an application used by scientists -- people who have a
> > pretty good idea of what logic is -- and they were shocked to find out
> > that neither of these queries return the same results:
> >
> > 1- banana AND apple OR orange
> > 2- banana AND (apple OR orange)
> > 3- (banana AND apple) OR orange
> >
> > I'd expect (1) to be either (2) or (3), but it turns out it's parsed as
> > "+banana apple orange".
> > I was rather, uh, dismayed by this find, as it doesn't seem to make sense.
> >
> > I just spent half a day reading up on the various ways QueryParser is
> > broken, by going through the bugs and the mailing-list archives. And I'm
> > still unable to come to a conclusion. Here's where I'm at:
> >
> > a- queries which mix boolean operators require strict parenthesizing to
> > work right
> >
> > b- "+" isn't shorthand for "AND"; using it with "AND"/"OR"/"NOT" and the
> > default operator "" rarely does what you expect
> >
> > c- the stock QueryParser doesn't work well in these cases
> >
> > d- there's a new PrecedenceQueryParser at
> > http://svn.apache.org/repos/asf/lucene/java/trunk/contrib/miscellaneous
> > that solves *some* of the issues but creates others
> >
> > e- there is a non-Lucene effort to create a query parser with a
> > different syntax at http://famestalker.com/devwiki/
> >
> > While we are also developing a query-building UI, users must be able to
> > enter text queries as well. What do other folks do? I mean, this is pretty
> > bad. I can hardly go back to my scientists and tell them Lucene is unable
> > to handle 2 boolean operators, that they should parenthesize everything by
> > hand. I mean, that's just cheesy.

For a query building UI it might be better to output queries in XML form
to a Lucene server, see contrib/xml-query-parser.

Regards,
Paul Elschot
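
As an aside on the three queries quoted above: the expectation that (1) should
come out as (3) corresponds to the usual convention where AND binds tighter
than OR. The following is only a minimal sketch of that convention, assuming a
toy grammar with bare terms, AND, OR and parentheses. It does not use any
Lucene classes and is not how QueryParser, PrecedenceQueryParser or surround
are implemented; the class name PrecedenceSketch and its parse method are
made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Illustration only: a tiny recursive-descent parser in which AND binds
 * tighter than OR. Hypothetical class, unrelated to Lucene's parsers.
 */
final class PrecedenceSketch {
    private final List<String> tokens;
    private int pos = 0;

    private PrecedenceSketch(List<String> tokens) {
        this.tokens = tokens;
    }

    /** Returns the query with its grouping made explicit by parentheses. */
    static String parse(String query) {
        PrecedenceSketch p = new PrecedenceSketch(tokenize(query));
        String result = p.orExpr();
        if (p.pos != p.tokens.size()) {
            throw new IllegalArgumentException("Unexpected token: " + p.tokens.get(p.pos));
        }
        return result;
    }

    // Split on whitespace, treating each parenthesis as its own token.
    private static List<String> tokenize(String query) {
        List<String> out = new ArrayList<>();
        String spaced = query.replace("(", " ( ").replace(")", " ) ").trim();
        for (String t : spaced.split("\\s+")) {
            if (!t.isEmpty()) {
                out.add(t);
            }
        }
        return out;
    }

    // orExpr := andExpr ("OR" andExpr)*   (lowest precedence)
    private String orExpr() {
        String left = andExpr();
        while (peekIs("OR")) {
            pos++;
            left = "(" + left + " OR " + andExpr() + ")";
        }
        return left;
    }

    // andExpr := primary ("AND" primary)*   (binds tighter than OR)
    private String andExpr() {
        String left = primary();
        while (peekIs("AND")) {
            pos++;
            left = "(" + left + " AND " + primary() + ")";
        }
        return left;
    }

    // primary := TERM | "(" orExpr ")"
    private String primary() {
        if (pos >= tokens.size()) {
            throw new IllegalArgumentException("Unexpected end of query");
        }
        String t = tokens.get(pos++);
        if (t.equals("(")) {
            String inner = orExpr();
            if (!peekIs(")")) {
                throw new IllegalArgumentException("Missing ')'");
            }
            pos++;
            return inner;
        }
        if (t.equals(")") || t.equals("AND") || t.equals("OR")) {
            throw new IllegalArgumentException("Expected a term but found: " + t);
        }
        return t;
    }

    private boolean peekIs(String s) {
        return pos < tokens.size() && tokens.get(pos).equals(s);
    }

    public static void main(String[] args) {
        // Query 1 groups the same way as query 3 under this convention.
        System.out.println(parse("banana AND apple OR orange"));   // ((banana AND apple) OR orange)
        System.out.println(parse("banana AND (apple OR orange)")); // (banana AND (apple OR orange))
        System.out.println(parse("(banana AND apple) OR orange")); // ((banana AND apple) OR orange)
    }
}
```

With this grouping, query (1) and query (3) produce the same tree, and query
(2) differs, which is what the scientists in the quoted message expected.
Whether a given Lucene parser behaves this way has to be checked against that
parser itself.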