: I'm happy to provide some details as I still do not really understand the
: difference to the situation before.

The main difference is coming from the changes introduced in LUCENE-8811 
(Lucene 9.0) which sought to ensure that the "global" maxClauseCount would 
be honored no matter what kind of nested structure the query might 
involve.

Your situation is an interesting case that I had never considered, more 
details below...

: * I upgraded from 8.11.1 to 9.1. I observed the behavior for a completely
: rebuild index (solr version 9.1 / lucene version 9.3)

Thank you for clarifying.  This confirms that the changes introduced 
in LUCENE-8811 (and related Solr issues) are relevant to the change in 
behavior you are seeing (if you had said you upgraded from Solr 9 we'd be 
having a different conversation)

: * maxBooleanClauses is only configured in solrconfig.xml (1024) but not in
: solr.xml.

FYI: If you don't configure it in solr.xml, then the (Lucene) default
IndexSearcher.getMaxClauseCount() is left as is (and that is also 1024)

: * Sorry for the confusion about the field definition. As you already
: assumed correctly: 'categoryId' is also a 'p_long_dv'

Meaning that it has both points and docvalues configured, which it turns 
out is significant to why it behaves differently from a string field.


: * Stacktrace for String field ("id"). For better readability I replaced the
: original query by "1 2 ... 1025":

Snipping down to the key lines of code from the root cause...

: Caused by: org.apache.lucene.search.IndexSearcher$TooManyClauses:
: maxClauseCount is set to 1024
:         at
: org.apache.lucene.search.BooleanQuery$Builder.add(BooleanQuery.java:116)
:         at
: org.apache.lucene.search.BooleanQuery$Builder.add(BooleanQuery.java:130)
:         at
: org.apache.solr.parser.SolrQueryParserBase.rawToNormal(SolrQueryParserBase.java:1065)

...so in this case, as the query parser is building up a boolean query (of 
many strings), it is hitting the limit because the (top level) boolean 
query is being asked to add one more item than 
IndexSearcher.getMaxClauseCount() == 1024 allows.
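
As a rough illustration of that parse-time check, here is a minimal 
sketch in plain Java.  It is a simplified model, not Lucene's actual 
source: the class FlatClauseLimitSketch, its Builder, and canBuild() are 
hypothetical names, and the only thing taken from the stack trace is the 
limit of 1024.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified model of a parse-time clause limit (hypothetical, not Lucene code). */
class FlatClauseLimitSketch {
    // The limit reported in the exception message.
    static final int MAX_CLAUSE_COUNT = 1024;

    /** Hypothetical stand-in for a boolean query builder. */
    static class Builder {
        private final List<String> clauses = new ArrayList<>();

        Builder add(String clause) {
            // Refuse the clause that would push the query past the limit.
            if (clauses.size() >= MAX_CLAUSE_COUNT) {
                throw new IllegalStateException(
                    "maxClauseCount is set to " + MAX_CLAUSE_COUNT);
            }
            clauses.add(clause);
            return this;
        }
    }

    /** Adds the values 1..n as clauses; returns true if every add succeeded. */
    static boolean canBuild(int n) {
        Builder builder = new Builder();
        try {
            for (int i = 1; i <= n; i++) {
                builder.add("id:" + i);
            }
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("1024 clauses build: " + canBuild(1024));
        System.out.println("1025 clauses build: " + canBuild(1025));
    }
}
```

In this model the 1025th add() is the one that fails, which matches the 
"1 2 ... 1025" query on the string field.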


: * Stacktrace for Point field ("categoryId") with 1 2 ... 513:

Again, snipping down to just the key lines of code.  (Note also the 
difference in the exception message: "too many nested clauses") ..

: org.apache.lucene.search.IndexSearcher$TooManyNestedClauses: Query contains
: too many nested clauses; maxClauseCount is set to 1024
:         at
: org.apache.lucene.search.IndexSearcher$3.visitLeaf(IndexSearcher.java:801)
:         at
: org.apache.lucene.document.SortedNumericDocValuesRangeQuery.visit(SortedNumericDocValuesRangeQuery.java:73)
:         at
: org.apache.lucene.search.IndexOrDocValuesQuery.visit(IndexOrDocValuesQuery.java:121)
:         at
: org.apache.lucene.search.BooleanQuery.visit(BooleanQuery.java:575)
:         at
: org.apache.lucene.search.IndexSearcher.rewrite(IndexSearcher.java:769)

...here the exception is happening during the actual search -- meaning the 
query parser had no problem building up the BooleanQuery of 513 clauses.

But what matters is that each of those 513 clauses is no longer a simple 
exact term query (or a simple exact point query, or a simple exact 
docvalue query) ... because this fieldType is configured to support both 
points and docvalues, those 513 clauses are IndexOrDocValuesQuery queries 
-- which each contain 2 sub-clauses, for 1026 nested clauses in total, 
which is more than the limit of 1024.

(the purpose of this class is to provide the most efficient impl based on 
where/how this clause is used, which can depend on term stats, other 
clauses in the parent query, etc...)
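
To make the nested counting concrete, here is a minimal sketch in plain 
Java.  It is only a model of the idea (walk the query tree and count the 
leaves), not Lucene's implementation: NestedClauseCountSketch, Node, Leaf, 
Composite and the two factory methods are hypothetical names.  The one 
assumption carried over from the stack trace is that a clause on a field 
with both points and docvalues wraps 2 leaf queries.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified model of nested clause counting (hypothetical, not Lucene code). */
class NestedClauseCountSketch {
    static final int MAX_CLAUSE_COUNT = 1024;

    interface Node {
        int countLeaves();
    }

    /** A query with no children, e.g. a single term or range query. */
    static final class Leaf implements Node {
        public int countLeaves() {
            return 1;
        }
    }

    /** A query that wraps other queries; its cost is the sum of its children. */
    static final class Composite implements Node {
        private final List<Node> children = new ArrayList<>();

        Composite add(Node child) {
            children.add(child);
            return this;
        }

        public int countLeaves() {
            int total = 0;
            for (Node child : children) {
                total += child.countLeaves();
            }
            return total;
        }
    }

    /** String field: each user value becomes exactly one leaf. */
    static Node stringFieldQuery(int values) {
        Composite top = new Composite();
        for (int i = 0; i < values; i++) {
            top.add(new Leaf());
        }
        return top;
    }

    /** Points + docvalues field: each user value wraps two leaves. */
    static Node pointAndDocValuesQuery(int values) {
        Composite top = new Composite();
        for (int i = 0; i < values; i++) {
            top.add(new Composite().add(new Leaf()).add(new Leaf()));
        }
        return top;
    }

    static boolean exceedsLimit(Node query) {
        return query.countLeaves() > MAX_CLAUSE_COUNT;
    }

    public static void main(String[] args) {
        System.out.println(pointAndDocValuesQuery(512).countLeaves()); // 1024
        System.out.println(pointAndDocValuesQuery(513).countLeaves()); // 1026
    }
}
```

Under this model 512 values on the Point field stay at the limit (1024 
leaves), while 513 values produce 1026 leaves and exceed it, even though 
the top-level query has far fewer than 1024 clauses.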

So to summarize:

1) the reason you're seeing this behavior in 9x but didn't in 8x is 
because 9x added more checks of the maxClauseCount safety valve, 
including a check that counts nested clauses

2) the reason you're seeing the 1024 limit hit for some (but not all) 
fields, even with fewer than 1024 "original user query clauses", is 
because for some (but not all) field types, 1 original query clause can 
become N internal clauses.


-Hoss
http://www.lucidworks.com/
