Hi Jan,

Thanks for the thumbs up.

On Tue, May 14, 2013 at 11:14 AM, Jan Høydahl <[email protected]> wrote:

> Hello :)
>
> I think it has been the intention of the dev community for a long time to
> start using the flex parser framework, and in this regard this contribution
> is much welcome as a kickstarter for that.
> I have not looked much at the code, but I hope it could be a starting
> point for writing future parsers in a less "spaghetti" way.
>
> One question. Say we want to add a new operator such as NEAR/N. Ideally
> this should be added in Lucene, then all the Solr QParsers extending the
> lucene flex parser would benefit from the same new operator. Would this be
> easily achieved with your code you think? We also have a ton of
>

Adding a new operator is very simple at the syntax level: when I want the
NEAR/x operator, I just change the ANTLR grammar, which produces the
appropriate abstract syntax tree. The flex parser consumes that tree. Yet
imagine the following query:

    dog NEAR/5 cat

If you are using synonyms, an analyzer could have expanded "dog", so the
query becomes something like:

    (dog | canin) NEAR/5 cat

Since Lucene cannot handle these queries, the flex builder must rewrite
them, effectively producing:

    SpanNear(SpanOr(dog | canin), SpanTerm(cat), 5)

But you could also argue that a better way to handle this query is:

    SpanNear(dog, cat, 5) OR SpanNear(canin, cat, 5)

If that is the case, then a different builder will have to be used. This is
just an example where the syntax is relatively simple but the semantics are
the hard part. I believe the flex parser gives all the necessary tools to
deal with that and avoid the spaghetti problem.

--roman

> feature requests on the eDisMax parser for new kinds of query syntax
> support. Before we start implementing that on top of the
> already-hard-to-maintain eDismax code, we should think about
> re-implementing eDismax on top of flex, perhaps on top of Roman's contrib
> here?
>

btw: I am using edismax in one of my grammars, i.e. users can type:

    query AND edismax(foo OR (dog AND cat))

and the "edismax(....)" part will be parsed by edismax. But I hit problems
there as well: it does not do such a nice job with operators, and of course
it doesn't know how to handle multi-token synonym expansion. I think it
could be nicely extracted into a flex processor and effectively become a
plugin for a Solr parser (right now it is a parser of its own, which makes
it hard to extend).

> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com
>
> On 14 May 2013 at 17:07, Roman Chyla <[email protected]> wrote:
>
> Hello World!
>
> Following the recommended practice I'd like to let you know that I am
> about to start porting our existing query parser into JIRA with the aim of
> making it available to Lucene/SOLR community.
>
> The query parser is built on top of the flexible query parser, but it
> separates the parsing (ANTLR) and the query building - it allows for a very
> sophisticated custom logic and has self-retrospecting methods, so one can
> actually 'see' what is going on - I have had lots of FUN working with it
> (which I consider to be a feature, not a shameless plug ;)).
>
> Some write up is here:
> http://29min.wordpress.com/category/antlrqueryparser/
>
> You can see the source code at:
>
> https://github.com/romanchyla/montysolr/tree/master/contrib/antlrqueryparser
>
> If you think this project is duplicating something or even being useless
> (I hope not!) please let me know, stop me, say something...
>
> Thank you!
>
> roman
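
To make the two rewrites of "dog NEAR/5 cat" from the reply above concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the node classes (SpanTerm, SpanOr, SpanNear, BoolOr) are hypothetical stand-ins, not Lucene's span query classes, and the two builder functions are not taken from the antlrqueryparser code. Each NEAR operand is assumed to arrive as a non-empty list of alternatives after synonym expansion.

```python
from dataclasses import dataclass
from itertools import product
from typing import List, Tuple, Union


# Hypothetical query nodes for illustration only; NOT the Lucene API.
@dataclass(frozen=True)
class SpanTerm:
    text: str


@dataclass(frozen=True)
class SpanOr:
    clauses: Tuple["Span", ...]


@dataclass(frozen=True)
class SpanNear:
    clauses: Tuple["Span", ...]
    slop: int


@dataclass(frozen=True)
class BoolOr:
    clauses: Tuple[SpanNear, ...]


Span = Union[SpanTerm, SpanOr, SpanNear]


def build_nested(operands: List[List[str]], slop: int) -> SpanNear:
    """Strategy 1: one SpanNear; an expanded operand becomes a SpanOr."""
    def clause(alternatives: List[str]) -> Span:
        terms = tuple(SpanTerm(t) for t in alternatives)
        # A single alternative needs no SpanOr wrapper.
        return terms[0] if len(terms) == 1 else SpanOr(terms)

    return SpanNear(tuple(clause(alts) for alts in operands), slop)


def build_distributed(operands: List[List[str]], slop: int):
    """Strategy 2: one SpanNear per combination of alternatives, OR-ed."""
    nears = tuple(
        SpanNear(tuple(SpanTerm(t) for t in combo), slop)
        for combo in product(*operands)
    )
    # With no expansion there is only one combination, so no OR is needed.
    return nears[0] if len(nears) == 1 else BoolOr(nears)


# "dog NEAR/5 cat" after "dog" was expanded with the synonym "canin".
expanded = [["dog", "canin"], ["cat"]]
nested = build_nested(expanded, 5)
distributed = build_distributed(expanded, 5)
print(nested)
print(distributed)
```

Both functions take the same parsed input and differ only in the tree they emit, which is the point of keeping syntax (the grammar) separate from semantics (the builder). One thing the sketch makes visible: the distributed form produces one SpanNear per combination of alternatives, so its size grows multiplicatively with the number of synonyms on each side.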
