Erik Hatcher writes: > > Your changes look great in general, though I find some issues: > > > > 1) 'stop OR stop AND stop' where stop is a stopword gives a parse > > error: > > Encountered "<EOF>" at line 1, column 0. > > Was expecting one of: > > <NOT> ... > > ... > > I think you must have tried this in a transient state when I forgot to > check in some JavaCC generated files. Try again. This one now returns > an empty BooleanQuery. > ok. I'm a bit puzzled, since I called javacc myself, so generated files should not matter, but if it's fixed, I don't care about what went wrong.
> > 2) Single term queries using +/- flags are parse to a query without > > flag > > +a -> a > > Hmmm.... this is a debatable one. It's returning a TermQuery in this > case for "a". Is that appropriate? Or should it return a BooleanQuery > with a single TermQuery as required? > I'd prefer, if query parser parses queries created by query.toString() to the same query. But that's just a nice to have. > I think having it optimized to a TermQuery makes the most sense. > Though, putting it in a BooleanQuery does make this next one simpler... > > > -a -> a > > While this doesn't make a difference for +a it's a bit strange for -a, > > OTOH -a isn't a usable query anyway. > > Oops... yeah, you're right. If its a single clause right now it > doesn't wrap in a BooleanQuery and thus does not take into account the > modifier +/-/NOT. But as you say, this is a bogus query anyway. I > guess the right thing to do is wrap both the +a query as above and the > -a query into a BooleanQuery with the modifier set appropriately. > Ok. The question how to handle BooleanQueries, that contain prohibited terms only, is a question on it's own. In my fix I choose to silently drop these queries. Basically because it's effectivly dropped during querying anyway. In an application, I handled this by dropping the query and notifying the user, that some part of the query could not be handled and was ignored. > > 3) a OR NOT b parses to 'a -b' which is the same as 'a AND NOT b' > > IMHO `a OR NOT b' should be `a OR (NOT b)' though lucene cannot > > search > > that. Maybe it should raise an error... > > Actually it parses like this: > > a OR NOT b -> a -b > a AND NOT b -> +a -b > > So they are slightly different, though the effect will be the same. > > > a OR NOT b AND c (parsed to a -(+b +c)) should IMHO be parsed to `a > > (-b +c)' > > Ah, ok.... so NOT gets much higher precedence than I'm currently giving > it. That might take me a while to achieve, but I'll give it a shot. > Great. Morus --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]