Turn on some system monitoring. For a difference that large, look at IO wait or IO reads from disk, whichever your monitoring system provides.
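If you don't have monitoring in place yet, a minimal sketch like the one below can stand in. It assumes Python with the psutil package on the nodes (neither is mentioned in this thread) and samples iowait plus disk read throughput over a short window so the same numbers can be compared node by node.

```python
# Minimal sketch, assuming Python and the psutil package are available on the
# nodes (an assumption, not something from this thread). Samples CPU iowait and
# disk read throughput over a short window for node-by-node comparison.
import psutil

def sample_io(interval_s: float = 5.0) -> None:
    disk_before = psutil.disk_io_counters()
    # Blocks for interval_s and returns CPU time percentages over that window.
    cpu = psutil.cpu_times_percent(interval=interval_s)
    disk_after = psutil.disk_io_counters()

    read_mb = (disk_after.read_bytes - disk_before.read_bytes) / (1024 * 1024)
    iowait = getattr(cpu, "iowait", 0.0)  # iowait is only reported on Linux
    print(f"iowait: {iowait:.1f}%  user: {cpu.user:.1f}%  system: {cpu.system:.1f}%")
    print(f"disk reads over {interval_s:.0f}s: {read_mb:.1f} MB")

if __name__ == "__main__":
    sample_io()
```

Run it on a fast node and a slow node while queries are flowing; a slow node sitting at high iowait with heavy disk reads points at the disk-cache issue described in the quoted reply below.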
Also look at CPU; the CPU utilization should be the same on all the nodes. If some nodes are lower in CPU, then they are blocked by something else.

Make sure that the nodes really are the same, especially that the disk volumes are the same type, that they have the same amount of RAM, and that they have the same non-Solr processes running (with the same load). We’ve seen some cohorts of machines that were 2X slower in our big cluster, but that was because they got allocated on a different EC2 instance type. Oops. We do see some persistent cohorts with different performance, but nothing like 30 ms vs 1000 ms.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Sep 29, 2023, at 9:24 PM, Shawn Heisey <apa...@elyograg.org.INVALID> wrote:
>
> On 9/29/23 13:28, rajani m wrote:
>> What could cause some nodes in a cluster to have high query latency
>> when compared to the rest?
>
> It is possible that some of the nodes have either handled zero queries since
> the last reboot, or that they have handled fewer queries than the others, and
> as a result they have less of the index sitting in the OS disk cache. When
> Solr actually has to go out to the disk to read index data, it is MUCH slower
> than when it can simply read that data directly from memory.
>
> Even SSD storage, prized for its performance, is slower than main memory.
>
> https://cwiki.apache.org/confluence/display/SOLR/SolrPerformanceProblems
>
> Thanks,
> Shawn
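A note on the disk-cache point in the quoted reply: one low-tech way to level the playing field after a restart is to read a core's index files once so the OS pulls them into the page cache. The sketch below is a hedged example; the index path is a placeholder rather than anything from this thread, and running representative warmup queries or using a tool like vmtouch are common alternatives.

```python
# Hedged sketch: sequentially read every file in a Solr core's index directory
# so the OS page cache is warm before the node takes traffic. The path passed in
# at the bottom is a placeholder; point it at your core's data/index directory.
import os

def warm_index(index_dir: str, chunk_size: int = 8 * 1024 * 1024) -> None:
    total_bytes = 0
    for name in sorted(os.listdir(index_dir)):
        path = os.path.join(index_dir, name)
        if not os.path.isfile(path):
            continue
        with open(path, "rb") as f:
            # Discard the data; the read itself populates the page cache.
            while f.read(chunk_size):
                pass
        total_bytes += os.path.getsize(path)
    print(f"read {total_bytes / 1024**3:.2f} GiB from {index_dir}")

if __name__ == "__main__":
    warm_index("/var/solr/data/mycore/data/index")  # placeholder path
```

This only helps if the whole index (or at least its hot segments) fits in free RAM; otherwise the cache will be evicted again under load, which is exactly the situation the SolrPerformanceProblems wiki page linked above describes.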