[ https://issues.apache.org/jira/browse/SOLR-17916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated SOLR-17916:
----------------------------------
    Labels: pull-request-available  (was: )

> Jetty 12.0.25 upgrade exposes RST_STREAM burst issue
> ----------------------------------------------------
>
>                 Key: SOLR-17916
>                 URL: https://issues.apache.org/jira/browse/SOLR-17916
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Sanjay Dutt
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> After upgrading Jetty from {*}12.0.19 to 12.0.25{*}, the test 
> {{DistributedDebugComponentTest.testTolerantSearch}} starts failing.
> The test sets up a query with a deliberately bad shard:
> {code:java}
> String badShard = DEAD_HOST_1 + "/solr/collection1";
> query.set("shards", badShard + "," + shard2 + "," + shard1);
> for (int i = 0; i < (TEST_NIGHTLY ? 500 : 200); i++) {
>       // verify that the request would fail if shards.tolerant=false
>       query.set(ShardParams.SHARDS_TOLERANT, "false");
>       ignoreException("Connection refused");
>       expectThrows(SolrException.class, () -> collection1.query(query));
>       // verify that the request would succeed if shards.tolerant=true
>       query.set(ShardParams.SHARDS_TOLERANT, "true");
>       QueryResponse response = collection1.query(query); // fail here!
> ....
> {code}
> In each iteration, the test issues two requests:
>  * *shards.tolerant = false* → as expected, the coordinator fails fast 
> because one shard is dead.
>  * *shards.tolerant = true* → expected to succeed using results from the good 
> shard(s), but {*}fails after the Jetty upgrade{*}.
>
> *Observed behavior*
>  * In the non-tolerant branch, {{SearchHandler}} throws early on the shard 
> exception.
>  * At this point {{HttpShardHandler}} cancels the outstanding async requests 
> to the other shards by calling {{future.cancel(true)}} / {{request.abort()}}.
>  * Each abort is sent to the Jetty server as an HTTP/2 *RST_STREAM* frame.
>  * With the loop running hundreds of iterations, these cancels accumulate on 
> a single HTTP/2 session.
>  * Jetty 12.0.25 enforces stricter HTTP/2 rate control:
> GoAwayFrame\{... enhance_your_calm_error/invalid_rst_stream_frame_rate}
>  * Once the rate limit is tripped, the server responds with GOAWAY and closes 
> the connection.
>  * The subsequent tolerant request then fails, even though at least one shard 
> is healthy.
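
The sequence above can be illustrated with a small simulation. This is only a sketch under stated assumptions and is not Jetty's actual rate-control code: the class RstBurstSketch, the sliding-window limiter, the limit of 128 events per second, the 1 ms spacing between iterations, and the figure of two cancelled requests per iteration are all hypothetical values chosen for illustration. It shows how a steady stream of cancels on one session can cross a fixed RST_STREAM budget partway through the loop.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical sketch of an RST_STREAM burst; not Jetty's implementation. */
public class RstBurstSketch {

    /** Allows at most maxEvents events within any window of windowNanos. */
    static final class SlidingWindowLimiter {
        private final int maxEvents;
        private final long windowNanos;
        private final Deque<Long> timestamps = new ArrayDeque<>();

        SlidingWindowLimiter(int maxEvents, long windowNanos) {
            this.maxEvents = maxEvents;
            this.windowNanos = windowNanos;
        }

        /** Records one event at nowNanos; returns false once the limit is exceeded. */
        boolean onEvent(long nowNanos) {
            // Drop events that have fallen out of the window.
            while (!timestamps.isEmpty()
                    && nowNanos - timestamps.peekFirst() >= windowNanos) {
                timestamps.removeFirst();
            }
            timestamps.addLast(nowNanos);
            return timestamps.size() <= maxEvents;
        }
    }

    /**
     * Simulates the test loop: the fail-fast branch of each iteration cancels
     * cancelsPerIteration outstanding shard requests, and each cancel is
     * counted as one RST_STREAM on the same session.
     * Returns the 0-based iteration at which the limiter trips, or -1 if never.
     */
    static int firstTrippedIteration(int iterations, int cancelsPerIteration,
                                     long nanosPerIteration,
                                     SlidingWindowLimiter limiter) {
        for (int i = 0; i < iterations; i++) {
            long now = i * nanosPerIteration;
            for (int c = 0; c < cancelsPerIteration; c++) {
                if (!limiter.onEvent(now)) {
                    return i; // in the report, the server answers with GOAWAY here
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Assumed numbers: 200 iterations, 2 cancels each, 1 ms apart,
        // and a budget of 128 resets per second.
        SlidingWindowLimiter limiter = new SlidingWindowLimiter(128, 1_000_000_000L);
        int tripped = firstTrippedIteration(200, 2, 1_000_000L, limiter);
        System.out.println("limiter tripped at iteration " + tripped);
    }
}
```

In this model, spacing the iterations further apart (or issuing fewer cancels per iteration) keeps the count inside the window below the budget, which matches the report's observation that the failure depends on cancels accumulating on a single HTTP/2 session.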



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

