[ https://issues.apache.org/jira/browse/SOLR-16099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17646141#comment-17646141 ]
ASF subversion and git services commented on SOLR-16099:
--------------------------------------------------------

Commit 7a96c5112b8544ddf0bc1c9ba5e80c4d1b454cd9 in solr's branch refs/heads/main from Kevin Risden
[ https://gitbox.apache.org/repos/asf?p=solr.git;h=7a96c5112b8 ]

SOLR-16099: Upgrade to Jetty 10.0.13 (#1230)

> HTTP Client threads can hang in Jetty's InputStreamResponseListener when using HTTP2 - impacts intra-node communication
> -----------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-16099
>                 URL: https://issues.apache.org/jira/browse/SOLR-16099
>             Project: Solr
>          Issue Type: Bug
>          Components: SolrCloud, SolrJ
>            Reporter: Chris M. Hostetter
>            Assignee: Kevin Risden
>            Priority: Major
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> There appears to be a Jetty HttpClient bug that makes it possible for a
> request thread to hang indefinitely while waiting to parse the response from
> remote Jetty servers. The thread hangs because it calls {{.wait()}} on a
> monitor lock that _should_ be notified by another (internal Jetty client)
> thread when a chunk of data is available from the wire – but in some cases
> this evidently may not happen.
> In the case of {{distrib=true}} requests processed by Solr (aggregating
> multiple per-shard responses from other nodes) this can manifest with stack
> traces that look like the following (taken from Solr 8.8.2)...
> {noformat}
>   "thread",{
>     "id":14253,
>     "name":"httpShardExecutor-7-thread-13819-...",
>     "state":"WAITING",
>     "lock":"org.eclipse.jetty.client.util.InputStreamResponseListener@12b59075",
>     "lock-waiting":{
>       "name":"org.eclipse.jetty.client.util.InputStreamResponseListener@12b59075",
>       "owner":null},
>     "synchronizers-locked":["java.util.concurrent.ThreadPoolExecutor$Worker@1ec1aed0"],
>     "cpuTime":"65.4882ms",
>     "userTime":"60.0000ms",
>     "stackTrace":["java.base@11.0.14/java.lang.Object.wait(Native Method)",
>       "java.base@11.0.14/java.lang.Object.wait(Unknown Source)",
>       "org.eclipse.jetty.client.util.InputStreamResponseListener$Input.read(InputStreamResponseListener.java:318)",
>       "org.apache.solr.common.util.FastInputStream.readWrappedStream(FastInputStream.java:90)",
>       "org.apache.solr.common.util.FastInputStream.refill(FastInputStream.java:99)",
>       "org.apache.solr.common.util.FastInputStream.readByte(FastInputStream.java:217)",
>       "org.apache.solr.common.util.JavaBinCodec._init(JavaBinCodec.java:211)",
>       "org.apache.solr.common.util.JavaBinCodec.initRead(JavaBinCodec.java:202)",
>       "org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:195)",
>       "org.apache.solr.client.solrj.impl.BinaryResponseParser.processResponse(BinaryResponseParser.java:51)",
>       "org.apache.solr.client.solrj.impl.Http2SolrClient.processErrorsAndResponse(Http2SolrClient.java:696)",
>       "org.apache.solr.client.solrj.impl.Http2SolrClient.request(Http2SolrClient.java:412)",
>       "org.apache.solr.client.solrj.impl.Http2SolrClient.request(Http2SolrClient.java:761)",
>       "org.apache.solr.client.solrj.impl.LBSolrClient.doRequest(LBSolrClient.java:369)",
>       "org.apache.solr.client.solrj.impl.LBSolrClient.request(LBSolrClient.java:297)",
>       "org.apache.solr.handler.component.HttpShardHandlerFactory.makeLoadBalancedRequest(HttpShardHandlerFactory.java:371)",
>       "org.apache.solr.handler.component.ShardRequestor.call(ShardRequestor.java:132)",
>       "org.apache.solr.handler.component.ShardRequestor.call(ShardRequestor.java:41)",
>       "java.base@11.0.14/java.util.concurrent.FutureTask.run(Unknown Source)",
>       "java.base@11.0.14/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)",
>       "java.base@11.0.14/java.util.concurrent.FutureTask.run(Unknown Source)",
>       "com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:180)",
>       "org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:218)",
>       "org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$$Lambda$175/0x0000000840243c40.run(Unknown Source)",
>       "java.base@11.0.14/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)",
>       "java.base@11.0.14/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)",
>       "java.base@11.0.14/java.lang.Thread.run(Unknown Source)"]},
> {noformat}
> ...these {{httpShardExecutor}} threads can stay hung, tying up system
> resources, indefinitely (unless they get a spurious {{notify()}} from the
> JVM). (In one case, it seems to have caused a request to hang for {*}10.3
> hours{*}.)
> Anecdotally:
> * There is some evidence that this problem did _*NOT*_ affect Solr 8.6.3,
> but does affect later versions
> ** suggesting the bug didn't exist in Jetty until _after_ 9.4.27.v20200227
> * Forcing the Jetty HttpClient to use HTTP1.1 transport seems to prevent
> this problem from happening
> ** In Solr this can be done by setting the {{"solr.http1"}} system property
> ** Or using the {{Http2SolrClient.Builder.useHttp1_1()}} method in client
> application code
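
For readers unfamiliar with the failure mode in the quoted description: a thread that calls an untimed {{Object.wait()}} blocks forever if the matching {{notify()}} never arrives. The following is a minimal, self-contained sketch of that pattern and of a deadline-bounded wait that gives up instead of hanging. It is not Jetty's or Solr's code, and it is not the fix applied in the commit above; the names {{BoundedWaitDemo}}, {{ChunkBox}}, {{offer}} and {{poll}} are hypothetical and exist only for illustration.

```java
import java.util.concurrent.TimeUnit;

public class BoundedWaitDemo {

    /** Hypothetical single-slot hand-off between a producer and a reader thread. */
    static final class ChunkBox {
        private final Object lock = new Object();
        private byte[] chunk; // guarded by lock

        /** Producer side: publish a chunk and wake any waiting reader. */
        void offer(byte[] data) {
            synchronized (lock) {
                chunk = data;
                lock.notifyAll();
            }
        }

        /**
         * Reader side: wait for a chunk, but only until the deadline.
         * Returns null on timeout, so a lost notification cannot block forever.
         */
        byte[] poll(long timeoutMillis) throws InterruptedException {
            long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
            synchronized (lock) {
                // Loop guards against spurious wakeups as well as timeouts.
                while (chunk == null) {
                    long remainingNanos = deadline - System.nanoTime();
                    if (remainingNanos <= 0) {
                        return null;
                    }
                    // wait(0) would mean "wait forever", so never pass less than 1 ms.
                    lock.wait(Math.max(1L, TimeUnit.NANOSECONDS.toMillis(remainingNanos)));
                }
                byte[] result = chunk;
                chunk = null;
                return result;
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        ChunkBox box = new ChunkBox();

        // No producer ever calls offer(): an untimed wait() would hang here.
        byte[] first = box.poll(100);
        System.out.println(first == null ? "timed out" : "got " + first.length + " bytes");

        // With a producer, the reader returns as soon as data is available.
        box.offer(new byte[] {1, 2, 3});
        byte[] second = box.poll(100);
        System.out.println(second == null ? "timed out" : "got " + second.length + " bytes");
    }
}
```

The trade-off in this sketch is that the caller has to handle the timeout case explicitly (retry, fail the request, or close the connection); a suitable timeout value depends on the workload and is not something the issue text specifies.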