Erik Hatcher writes: ok. Your changes look great in general, though I find some issues:
1) 'stop OR stop AND stop' where stop is a stopword gives a parse error: Encountered "<EOF>" at line 1, column 0. Was expecting one of: <NOT> ... ...
I think you must have tried this in a transient state when I forgot to
check in some JavaCC generated files. Try again. This one now returns
an empty BooleanQuery.
I'm a bit puzzled, since I called javacc myself, so generated files should
not matter, but if it's fixed, I don't care about what went wrong.
Let me know if there is still an issue, though I added this exact case to TestPrecedenceQueryParser and it's currently working for me.
2) Single-term queries using +/- flags are parsed to a query without the flag: +a -> a
Hmmm.... this is a debatable one. It's returning a TermQuery in this
case for "a". Is that appropriate? Or should it return a BooleanQuery
with a single TermQuery as required?
I'd prefer that the query parser parse a string produced by query.toString() back to the same query. But that's just a nice-to-have.
It's also impossible to have in general. Here's a simple example: take a Query that is equivalent to A OR B, whose .toString() is "A B", then parse that with the default operator set to AND and you'll get "+A +B". I created a modified Query->String converter for my current daytime project (as I use a String representation for the most-recently-used drop-down that is stored as a client-side cookie) that explicitly puts in "OR" between SHOULD BooleanClauses.
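To make the explicit-"OR" idea concrete, here is a minimal sketch of such a converter. It is not the code from my project and does not use the real Lucene classes: `ExplicitOrFormatter`, `Clause` and the local `Occur` enum are hypothetical stand-ins, and only flat lists of term clauses are handled.

```java
import java.util.List;

// Hypothetical stand-ins for Lucene's clause types; not the real Lucene API.
class ExplicitOrFormatter {
    enum Occur { MUST, SHOULD, MUST_NOT }

    static final class Clause {
        final Occur occur;
        final String term;

        Clause(Occur occur, String term) {
            this.occur = occur;
            this.term = term;
        }
    }

    // Joins neighbouring optional (SHOULD) clauses with an explicit "OR",
    // so the string no longer relies on the parser's default operator.
    static String format(List<Clause> clauses) {
        StringBuilder sb = new StringBuilder();
        Occur previous = null;
        for (Clause c : clauses) {
            if (previous != null) {
                boolean bothOptional =
                        previous == Occur.SHOULD && c.occur == Occur.SHOULD;
                sb.append(bothOptional ? " OR " : " ");
            }
            if (c.occur == Occur.MUST) {
                sb.append('+');
            } else if (c.occur == Occur.MUST_NOT) {
                sb.append('-');
            }
            sb.append(c.term);
            previous = c.occur;
        }
        return sb.toString();
    }
}
```

Two SHOULD clauses A and B come out as "A OR B" rather than "A B". Mixed lists of optional and required clauses are only handled naively here, which is part of why I think a parser-specific converter deserves a proper design.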
I still believe that we need to have some query-parser-specific way to build strings from Query objects, though I haven't thought through exactly how that should be designed. For example, I'm building a very custom query parser for a client that looks nothing like QueryParser syntax. It would be very nice to be able to turn a Query back around into their expression syntax.
I think having it optimized to a TermQuery makes the most sense. Ok.
Though, putting it in a BooleanQuery does make this next one simpler...
-a -> a
While this doesn't make a difference for +a it's a bit strange for -a,
OTOH -a isn't a usable query anyway.
Oops... yeah, you're right. If it's a single clause right now it doesn't wrap in a BooleanQuery and thus does not take into account the modifier +/-/NOT. But as you say, this is a bogus query anyway. I guess the right thing to do is wrap both the +a query as above and the -a query into a BooleanQuery with the modifier set appropriately.
The question of how to handle BooleanQueries that contain only prohibited
terms is a question of its own.
In my fix I chose to silently drop these queries, basically because they are
effectively dropped during querying anyway.
Silently drop as in you removed them entirely from the resultant Query?
That'd be easy enough to add - but is that what we want to happen? Community, thoughts?
In an application, I handled this by dropping the query and notifying the
user that some part of the query could not be handled and was ignored.
How did your application notice that part of the query was dropped?
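For discussion, one way a caller could notice the drop is for the filtering step to return warnings alongside the surviving clauses. The sketch below is purely hypothetical: it is not the application described above, and `ProhibitedOnlyFilter`, `Clause`, `Result` and the local `Occur` enum are illustrative names, not Lucene API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; none of these types are Lucene classes.
class ProhibitedOnlyFilter {
    enum Occur { MUST, SHOULD, MUST_NOT }

    static final class Clause {
        final Occur occur;
        final String term;

        Clause(Occur occur, String term) {
            this.occur = occur;
            this.term = term;
        }
    }

    static final class Result {
        final List<Clause> kept;      // empty when the group was dropped
        final List<String> warnings;  // messages the UI can show to the user

        Result(List<Clause> kept, List<String> warnings) {
            this.kept = kept;
            this.warnings = warnings;
        }
    }

    // Drops a clause group that has only prohibited clauses and reports it.
    static Result filter(List<Clause> clauses) {
        boolean hasPositive = false;
        StringBuilder dropped = new StringBuilder();
        for (Clause c : clauses) {
            if (c.occur != Occur.MUST_NOT) {
                hasPositive = true;
            } else {
                dropped.append(" -").append(c.term);
            }
        }
        List<String> warnings = new ArrayList<String>();
        if (!clauses.isEmpty() && !hasPositive) {
            warnings.add("Ignored prohibited-only query part:" + dropped);
            return new Result(new ArrayList<Clause>(), warnings);
        }
        return new Result(new ArrayList<Clause>(clauses), warnings);
    }
}
```

The query is still dropped, as in the fix described above, but the caller gets a non-empty `warnings` list and can decide whether to tell the user.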
Great.
3) a OR NOT b parses to 'a -b', which is the same as 'a AND NOT b'. IMHO 'a OR NOT b' should be 'a OR (NOT b)', though Lucene cannot search that. Maybe it should raise an error...
Actually it parses like this:
a OR NOT b -> a -b
a AND NOT b -> +a -b
So they are slightly different, though the effect will be the same.
a OR NOT b AND c (parsed to a -(+b +c)) should IMHO be parsed to
'a (-b +c)'
Ah, ok.... so NOT gets much higher precedence than I'm currently giving
it. That might take me a while to achieve, but I'll give it a shot.
I've shifted my local parser grammar around some, and have broken other tests, but do have the NOT precedence working. Here's a testSimple case that I broke by making NOT have higher precedence (I shifted where Modifiers are taken into account - before a Clause now):
Query /+term -term term/ yielded /(+term) (-term) term/, expecting /+term -term term/
As you can see this is wrong and I have more work to do. A OR NOT B now parses to A (-B) though, which I too now believe is a more correct (though invalid) interpretation.
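To pin down the precedence being discussed (NOT binds tighter than AND, which binds tighter than OR), here is a small hand-written recursive-descent sketch. It is not the JavaCC grammar of the precedence parser and uses no Lucene classes; `NotPrecedenceSketch` is a hypothetical name, input is limited to bare terms with explicit AND/OR/NOT, and the output format simply mimics the strings quoted in this thread.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: OR-expression := and-group (OR and-group)*
//                      and-group     := operand (AND operand)*
//                      operand       := [NOT] term
class NotPrecedenceSketch {
    private final String[] tokens;
    private int pos = 0;

    private NotPrecedenceSketch(String input) {
        this.tokens = input.trim().split("\\s+");
    }

    static String parse(String input) {
        NotPrecedenceSketch p = new NotPrecedenceSketch(input);
        List<List<String>> groups = p.parseOr();
        if (p.pos < p.tokens.length) {
            throw new IllegalArgumentException("Unexpected token: " + p.tokens[p.pos]);
        }
        if (groups.size() == 1) {
            return render(groups.get(0));
        }
        StringBuilder sb = new StringBuilder();
        for (List<String> group : groups) {
            String r = render(group);
            // Parenthesize AND-groups and negations nested inside an OR.
            boolean wrap = group.size() > 1 || r.startsWith("-");
            if (sb.length() > 0) sb.append(' ');
            sb.append(wrap ? "(" + r + ")" : r);
        }
        return sb.toString();
    }

    private List<List<String>> parseOr() {
        List<List<String>> groups = new ArrayList<List<String>>();
        groups.add(parseAnd());
        while (peekIs("OR")) { pos++; groups.add(parseAnd()); }
        return groups;
    }

    private List<String> parseAnd() {
        List<String> operands = new ArrayList<String>();
        operands.add(parseNot());
        while (peekIs("AND")) { pos++; operands.add(parseNot()); }
        return operands;
    }

    // NOT is consumed here, below AND, so it applies to a single term only.
    private String parseNot() {
        if (peekIs("NOT")) { pos++; return "-" + nextTerm(); }
        return nextTerm();
    }

    private String nextTerm() {
        if (pos >= tokens.length) throw new IllegalArgumentException("Expected a term");
        return tokens[pos++];
    }

    private boolean peekIs(String keyword) {
        return pos < tokens.length && tokens[pos].equals(keyword);
    }

    // A lone operand is printed as-is; inside an AND-group, plain terms get "+".
    private static String render(List<String> group) {
        if (group.size() == 1) return group.get(0);
        StringBuilder sb = new StringBuilder();
        for (String op : group) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(op.startsWith("-") ? op : "+" + op);
        }
        return sb.toString();
    }
}
```

With this sketch, `NotPrecedenceSketch.parse("a OR NOT b AND c")` yields the `a (-b +c)` form suggested above, and `a OR NOT b` yields `a (-b)`. It says nothing about how the +/- modifier case from testSimple should be handled.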
Erik