I agree with Shawn about ideally wanting more memory for the OS. That said, the WordDelimiterFilter config you sent aligns with my suspicion that "graph phrase" issues likely explain the difference between 6.5 and 8.11. At query time, WordDelimiterFilter (and, equally, WordDelimiterGraphFilter) triggers "graph phrase" behavior on `pf` (phrase fields), and I'm fairly certain that in 6.5 these phrase queries were ignored entirely.
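For illustration (the field type and parameters below are hypothetical, not taken from your config), a query-time analyzer along these lines is enough to produce a multi-term token graph, which in turn makes the `pf` phrase query a "graph phrase" query on recent versions:

    <!-- Hypothetical field type: at query time, WordDelimiterGraphFilter
         splits a token like "wi-fi" into a token graph
         (wi, fi, plus the catenated wifi). -->
    <fieldType name="text_wdgf" class="solr.TextField">
      <analyzer type="query">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.WordDelimiterGraphFilterFactory"
                generateWordParts="1" catenateWords="1"/>
      </analyzer>
    </fieldType>

With edismax and, say, `pf=title` over such a field, a query like `wi-fi router` gets a graph phrase query built against `title` on 8.11, whereas 6.5 would effectively have dropped it.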
So 6.5 as a point of comparison is unlikely to be helpful going forward, since the "better performance" of 6.5 was a consequence of a bug that caused `pf` "graph phrase" queries not to be executed at all. This mailing list exchange from June 2021 [1] should be helpful/relevant. (Also note that with respect to the issue you're encountering, there's no real difference between WordDelimiterFilter and WordDelimiterGraphFilter.)

[1] https://lists.apache.org/thread/kbjgztckqdody9859knq05swvx5xj20f

On Sun, Mar 27, 2022 at 11:51 AM Shawn Heisey <apa...@elyograg.org> wrote:

> On 3/27/2022 5:30 AM, Modassar Ather wrote:
> > The wildcard queries are executed against the text data and yes there
> > are a huge number of possible expansions of the wildcard query.
> > All the 12 shards are on a single machine with 521 GB memory and each
> > shard is started with SOLR_JAVA_MEM="-Xmx30g". So the 521 GB memory is
> > shared by all the 12 shards.
>
> I believe that my initial thought is correct -- you need more memory to
> handle 4TB of index data. I'm talking about more memory available to
> the OS, not Solr. This would have most likely been a problem in 6.x
> too, but I've seen situations where upgrading Solr can mean that
> insufficient memory is even more of a noticeable problem than it was in
> an older version.
>
> Something you could try is increasing the heap size to 31g. I wouldn't
> suggest going any higher unless you see evidence that you actually need
> more ... Java switches to 64-bit pointers at a heap size of 32GB, and
> you probably need to go to something like 48GB before things break
> even. I actually don't expect going to a 31GB heap to make things
> better ... but if it does, then you might also be running into the
> other main problem mentioned on the wiki page -- a heap size that's too
> small. That makes it so Java spends more time collecting garbage than
> it does running the application.
>
> I didn't know about the things Michael mentioned regarding Solr not
> utilizing the full capability of WordDelimiterFilter and
> WordDelimiterGraphFilter in older versions. Those filters tend to
> greatly increase cardinality, and apparently also increase heap memory
> utilization in recent Solr versions.
>
> Thanks,
> Shawn
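P.S. Regarding the 31g suggestion: one quick way to confirm that a given heap size still gets compressed oops (the exact flag output format varies a bit across JVM versions) is something like:

    # Prints the effective UseCompressedOops setting for a 31GB heap;
    # on most JVMs this shows "true" at -Xmx31g and "false" at -Xmx32g.
    java -Xmx31g -XX:+PrintFlagsFinal -version | grep UseCompressedOops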