[ https://issues.apache.org/jira/browse/SOLR-17419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17879926#comment-17879926 ]
ASF subversion and git services commented on SOLR-17419:
--------------------------------------------------------

Commit 245956c6efa8a8073784e67e7b6aa06469c1e297 in solr's branch refs/heads/branch_9x from Jason Gerlowski
[ https://gitbox.apache.org/repos/asf?p=solr.git;h=245956c6efa ]

SOLR-17419: Introduce ParallelHttpShardHandler (#2681)

The default ShardHandler implementation, HttpShardHandler, sends all
shard-requests serially, only parallelizing the waiting and parsing of
responses. This works great for collections with few shards, but as the
number of shards increases the serialized sending of shard-requests adds
a larger and larger overhead to the overall request (especially when auth
and PKI are done at request-sending time).

This commit fixes this by introducing an alternate ShardHandler
implementation, geared towards collections with many shards. This
ShardHandler uses an executor to parallelize both request sending and
response waiting/parsing. This consumes more CPU, but greatly reduces
the latency/QTime observed by users querying many-shard collections.


> Improve HttpShardHandler performance in many-shard collections
> --------------------------------------------------------------
>
>                 Key: SOLR-17419
>                 URL: https://issues.apache.org/jira/browse/SOLR-17419
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public)
>          Components: SolrCloud
>    Affects Versions: 9.0, 9.6.1
>            Reporter: Jason Gerlowski
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: shardhandler-perf-graph.png
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> In Solr 8, HttpShardHandler sends shard-requests by submitting Callables to
> an ExecutorService. As a result, both the "request-sending" and
> "response-awaiting" happened asynchronously to the original request-thread.
> {code:java}
> @Override
> public void submit(final ShardRequest sreq, final String shard, final ModifiableSolrParams params) {
>   ShardRequestor shardRequestor = new ShardRequestor(sreq, shard, params, this); // Callable
>   try {
>     shardRequestor.init();
>     pending.add(completionService.submit(shardRequestor));
>   } finally {
>     shardRequestor.end();
>   }
> }
> {code}
> However, in Solr 9.x HttpShardHandler ditched the
> ExecutorService/per-request-thread approach in favor of [sending all requests
> serially using
> "SolrClient.requestAsync"|https://github.com/apache/solr/blob/main/solr/core/src/java/org/apache/solr/handler/component/HttpShardHandler.java#L163].
> SOLR-14354, which made this change, did this in an effort to avoid
> unnecessary thread and CPU context-switching. As Dat described in SOLR-14354:
> {quote}after sending a request that thread basically do nothing just waiting
> for response from other side. That thread will be swapped out and CPU will
> try to handle another thread (this is called context switch, CPU will save
> the context of the current thread and switch to another one). When some data
> (not all) come back, that thread will be called to parsing these data, then
> it will wait until more data come back. So there will be lots of context
> switching in CPU. That is quite inefficient
> {quote}
> This approach comes with a downside though - all the shard requests are sent
> serially. If sending each request takes ~1ms, then a user is unlikely to
> notice this in their collection with 5 or 10 shards. But the cost here
> scales linearly, so in *a collection with 50 shards - this approach would
> bake a ~50ms delay into the critical path of every single query!*
> This issue is intended to reevaluate whether there's a better way to balance
> these concerns. Ideally we can come up with an approach that improves all
> scenarios.
> Lacking that, maybe Solr could choose between one of several
> approaches semi-intelligently based on the number of shards or other factors?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
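
The trade-off discussed in this issue (sending shard-requests serially on the request thread versus handing each send to an executor) can be sketched in plain Java. This is only an illustration under stated assumptions, not Solr code: ShardSendSketch, prepareAndSend, sendSerially and sendInParallel are hypothetical names, the pool size of 16 is arbitrary, and Thread.sleep stands in for the per-request send cost (auth, PKI, serialization), assumed to be ~1ms as in the issue's example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: none of these names are Solr classes or methods.
class ShardSendSketch {

  // Stand-in for the work done at request-sending time (assumed cost: sendCostMs).
  static String prepareAndSend(String shard, long sendCostMs) {
    try {
      Thread.sleep(sendCostMs);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return "sent:" + shard;
  }

  // Serial sending: the calling thread pays the send cost once per shard,
  // so the total grows linearly with the number of shards.
  static List<String> sendSerially(List<String> shards, long sendCostMs) {
    List<String> results = new ArrayList<>();
    for (String shard : shards) {
      results.add(prepareAndSend(shard, sendCostMs));
    }
    return results;
  }

  // Parallel sending: each send is submitted to the executor, and the
  // calling thread only waits for completion. Results keep shard order.
  static List<String> sendInParallel(
      List<String> shards, long sendCostMs, ExecutorService executor) {
    List<CompletableFuture<String>> pending = new ArrayList<>();
    for (String shard : shards) {
      pending.add(
          CompletableFuture.supplyAsync(() -> prepareAndSend(shard, sendCostMs), executor));
    }
    List<String> results = new ArrayList<>();
    for (CompletableFuture<String> future : pending) {
      results.add(future.join());
    }
    return results;
  }

  public static void main(String[] args) {
    List<String> shards = new ArrayList<>();
    for (int i = 1; i <= 50; i++) {
      shards.add("shard" + i);
    }
    ExecutorService executor = Executors.newFixedThreadPool(16); // arbitrary size
    try {
      long start = System.nanoTime();
      sendSerially(shards, 1);
      long serialMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);

      start = System.nanoTime();
      sendInParallel(shards, 1, executor);
      long parallelMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);

      System.out.println("serial send:   " + serialMs + " ms");
      System.out.println("parallel send: " + parallelMs + " ms");
    } finally {
      executor.shutdown();
    }
  }
}
```

With 50 shards and a simulated 1ms send cost, the serial path should spend on the order of 50ms on the calling thread, while the parallel path is bounded mostly by the pool size. Exact timings depend on the machine, and the extra threads are where the additional CPU cost mentioned in the commit message would come from.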