What happens on 9.x?? :)

> On Nov 25, 2022, at 11:33 AM, Richard Goodman <richa...@brandwatch.com> wrote:
> 
> Hi there,
> 
> We have a cluster spread over 72 instances on k8s hosting around 12.5
> billion documents (made up of 30 collections, each collection having 12
> shards). We were originally using 7.7.2, and performance was good enough
> for our business needs. We recently upgraded our cluster to v8.11.2 and
> have noticed a drop in performance. I appreciate that there have been a
> lot of changes from 7.7.2 to 8.11.2, but I have been collecting metrics,
> and although the configuration (instance type, resource allocation, and
> startup opts) is the same, we are completely at a loss as to why it's
> performing worse, and were wondering if anyone had any guidance?
> 
> I recently came across two tickets that caught our interest:
> 
>   - SOLR-15840 <https://issues.apache.org/jira/browse/SOLR-15840> -
>   Performance degradation with http2
>   - SOLR-16099 <https://issues.apache.org/jira/browse/SOLR-16099> - HTTP
>   Client threads can hang
> 
> Because of these, we spun up a parallel cluster with -Dsolr.http1=true,
> but there was no difference in performance. We're testing a couple of
> other ideas, such as a different DirectoryFactory *(as I saw a message
> from someone in the Solr Slack about there being an issue with the MMap
> directory and vm.max_map_count)* and some GC settings, but are really open
> to any suggestions. We're also happy to use this cluster to test patches
> at a large scale to see whether they help with performance *(more
> specifically, patches for the two Solr tickets listed above)*.
> 
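
On the vm.max_map_count point: a minimal sketch, assuming a Linux host, of
how one might check how close a Solr JVM is to that limit. The helper names
(count_mappings, map_usage, check_pid) are hypothetical; the script only
reads /proc/sys/vm/max_map_count and /proc/<pid>/maps and treats each line
of the maps file as one mapped region.

```python
import sys
from pathlib import Path


def count_mappings(maps_text: str) -> int:
    """Count mapped regions: one non-empty line per region in /proc/<pid>/maps."""
    return sum(1 for line in maps_text.splitlines() if line.strip())


def map_usage(maps_text: str, limit: int) -> float:
    """Return the fraction of the vm.max_map_count limit currently in use."""
    return count_mappings(maps_text) / limit


def check_pid(pid: int) -> float:
    """Read the kernel limit and the process's mappings (Linux only)."""
    limit = int(Path("/proc/sys/vm/max_map_count").read_text())
    maps_text = Path(f"/proc/{pid}/maps").read_text()
    return map_usage(maps_text, limit)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (assumed): python check_maps.py <solr-jvm-pid>
    print(f"{check_pid(int(sys.argv[1])):.1%} of vm.max_map_count in use")
```

A value approaching 100% on the 8.11.2 nodes but not on the 7.7.2 nodes
would be worth following up; a low value on both would suggest looking
elsewhere.
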
> I thought it would be useful to show some metrics I collected while we had
> 2 clusters spun up, one on 7.7.2 and one on 8.11.2. The 8.11.2 cluster was
> the active one, and all traffic was being shadow loaded into the 7.7.2
> cluster to compare against. It's important to note that both clusters had
> the same configuration. To name a few settings:
> 
>   - G1GC garbage collector
>   - TLOG replication
>   - 27Gi Memory per instance
>   - 16Gi assigned to -Xmx and -Xms
>   - 16 cores
>   - -XX:G1HeapRegionSize=4m
>   - -XX:G1ReservePercent=20
>   - -XX:InitiatingHeapOccupancyPercent=35
> 
> One metric that did stand out was that 8.11.2 was churning through *a lot*
> of eden space in the heap, which can be seen in some of the screenshots of
> metrics below:
> 
> Total Memory Usage: 7.7.2 vs 8.11.2 [screenshots not included]
> 
> Total Used G1 Pools: 7.7.2 vs 8.11.2 [screenshots not included]
> 
> And finally, the overall thread pool: 7.7.2 vs 8.11.2 [screenshots not
> included]
> 
> Any guidance, or requests for performance tests to run, would be appreciated.
> 
> Thanks,
> 
> Richard
