I'm using MultiFieldQueryParser to parse search queries. I find that certain query strings (e.g., "/study/" without the quotes) cause MultiFieldQueryParser.parse() to throw an AssertionError if asserts are enabled. In production, where asserts are off, parse() returns a Query, but it seems to be corrupt: using it to search my index results in an NPE. This seems related to regular expressions. That query string is probably invalid regex syntax, but shouldn't MultiFieldQueryParser throw a ParseException in this case?

Here's a simple example that reproduces the assertion:

    // Turn on asserts
    ClassLoader loader = ClassLoader.getSystemClassLoader();
    loader.setDefaultAssertionStatus(true);

    try {
        Analyzer analyzer = new WhitespaceAnalyzer(Version.LUCENE_41);
        QueryParser parser = new MultiFieldQueryParser(Version.LUCENE_41, new String[]{"title", "body"}, analyzer);
        Query query = parser.parse("/study/");
    } catch (ParseException e) {
        System.out.println("Syntax error, please rephrase your query");
    }

This produces:

    Exception in thread "main" java.lang.AssertionError
        at org.apache.lucene.search.MultiTermQuery.<init>(MultiTermQuery.java:252)
        at org.apache.lucene.search.AutomatonQuery.<init>(AutomatonQuery.java:65)
        at org.apache.lucene.search.RegexpQuery.<init>(RegexpQuery.java:90)
        at org.apache.lucene.search.RegexpQuery.<init>(RegexpQuery.java:79)
        at org.apache.lucene.search.RegexpQuery.<init>(RegexpQuery.java:69)
        at org.apache.lucene.queryparser.classic.QueryParserBase.newRegexpQuery(QueryParserBase.java:790)
        at org.apache.lucene.queryparser.classic.QueryParserBase.getRegexpQuery(QueryParserBase.java:1005)
        at org.apache.lucene.queryparser.classic.QueryParserBase.handleBareTokenQuery(QueryParserBase.java:1075)
        at org.apache.lucene.queryparser.classic.QueryParser.Term(QueryParser.java:359)
        at org.apache.lucene.queryparser.classic.QueryParser.Clause(QueryParser.java:258)
        at org.apache.lucene.queryparser.classic.QueryParser.Query(QueryParser.java:182)
        at org.apache.lucene.queryparser.classic.QueryParser.TopLevelQuery(QueryParser.java:171)
        at org.apache.lucene.queryparser.classic.QueryParserBase.parse(QueryParserBase.java:120)
        at QueryParserException.main(QueryParserException.java:21)

Turn off the asserts and parse() returns "successfully", but subsequent use of that Query instance results in NPEs such as:

    java.lang.NullPointerException
        at java.util.TreeMap.getEntry(TreeMap.java:342)
        at java.util.TreeMap.get(TreeMap.java:273)
        at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsReader.terms(PerFieldPostingsFormat.java:215)
        at org.apache.lucene.search.TermCollectingRewrite.collectTerms(TermCollectingRewrite.java:58)
        at org.apache.lucene.search.ConstantScoreAutoRewrite.rewrite(ConstantScoreAutoRewrite.java:95)
        at org.apache.lucene.search.MultiTermQuery$ConstantScoreAutoRewrite.rewrite(MultiTermQuery.java:220)
        at org.apache.lucene.search.MultiTermQuery.rewrite(MultiTermQuery.java:286)
        at org.apache.lucene.search.BooleanQuery.rewrite(BooleanQuery.java:429)
        at org.apache.lucene.search.FilteredQuery.rewrite(FilteredQuery.java:334)
        at org.apache.lucene.search.IndexSearcher.rewrite(IndexSearcher.java:616)
        at org.apache.lucene.search.IndexSearcher.createNormalizedWeight(IndexSearcher.java:663)
        at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:281)
        at org.labkey.search.model.LuceneSearchServiceImpl.search(LuceneSearchServiceImpl.java:1160)

This is appearing on production deployments with reasonable (from a user's perspective) search queries (e.g., "http://labkey.org/study/xml" without the quotes). I'd like to either turn off regex parsing altogether or detect the syntax error at parse time so I can provide my standard syntax guidance back to the user.

Thanks,
Adam
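
One possible stopgap for the "turn off regex parsing" option, sketched under assumptions: in the Lucene 4.x classic query syntax a forward slash delimits a regular expression and can be escaped with a backslash, so escaping every unescaped "/" before calling parse() should keep the parser from ever building a RegexpQuery. The class and method names below (SlashEscaper, escapeSlashes) are hypothetical helpers, not part of Lucene, and this has not been checked against the exact build in the traces above.

```java
// Hypothetical helper, not a Lucene class: pre-processes raw user input
// so that '/' is treated as a literal character rather than a regex delimiter.
final class SlashEscaper {
    private SlashEscaper() {}

    /**
     * Returns the query with a backslash inserted before every '/'
     * that is not already preceded by an escaping backslash.
     */
    static String escapeSlashes(String query) {
        StringBuilder sb = new StringBuilder(query.length() + 8);
        boolean escaped = false; // true when the previous char was an unconsumed backslash
        for (int i = 0; i < query.length(); i++) {
            char c = query.charAt(i);
            if (escaped) {
                // This character is already escaped by the user; copy it as-is.
                sb.append(c);
                escaped = false;
            } else if (c == '\\') {
                sb.append(c);
                escaped = true;
            } else if (c == '/') {
                sb.append('\\').append('/');
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }
}
```

The escaped string would then be passed to parser.parse(...) in place of the raw input. Other special characters such as ":" keep their query-syntax meaning, so a string like http://labkey.org/study/xml may still need further handling.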