[
https://issues.apache.org/jira/browse/SOLR-9135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15300443#comment-15300443
]
Ronald Braun commented on SOLR-9135:
------------------------------------
I can run "uname -a" successfully if I ssh into the running container and try
it. It seems the container state is such that any exec attempted by Solr
itself fails (the process immediately enters D state), after which the admin
page hangs for a few seconds before timing out and trying again. This is almost
certainly a by-product of a problematic setup on our part, which we are sorting
out. My main concern was that the admin page forks an external process at all
on a simple page load, and that when the exec fails, it keeps retrying ad
infinitum, consuming our process pool and blocking any access to admin
functions. It seems like an unnecessary coupling to external system state given
the data being fetched, and is best avoided.
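To illustrate the concern about repeated attempts piling up, here is a minimal sketch of a guard that lets at most one caller be inside the exec at a time, so a stuck call costs one thread rather than one per page load. The class name SingleFlightGuard and the method callOrSkip are hypothetical and are not part of Solr; this is not how SystemInfoHandler is implemented.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.Semaphore;

// Hypothetical helper, not Solr code: allows a single in-flight call.
final class SingleFlightGuard {
    private final Semaphore permit = new Semaphore(1);

    // Runs the task if no other call is in flight; otherwise returns the
    // fallback immediately instead of blocking another request thread.
    String callOrSkip(Callable<String> task, String fallback) {
        if (!permit.tryAcquire()) {
            return fallback; // an earlier call has not returned yet
        }
        try {
            return task.call();
        } catch (Exception e) {
            return fallback; // the task failed; report the fallback value
        } finally {
            permit.release();
        }
    }

    public static void main(String[] args) {
        SingleFlightGuard guard = new SingleFlightGuard();
        // The lambda stands in for the real exec of "uname -a".
        System.out.println(guard.callOrSkip(() -> "system info", "unavailable"));
    }
}
```

With a guard like this, a call stuck in exec still ties up one thread, but later requests return the fallback right away rather than forking again.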
> SystemInfoHandler can poison / consume Jetty thread pool
> --------------------------------------------------------
>
> Key: SOLR-9135
> URL: https://issues.apache.org/jira/browse/SOLR-9135
> Project: Solr
> Issue Type: Bug
> Environment: Solr 6.0.0
> Reporter: Ronald Braun
> Priority: Minor
>
> We are running Solr 6.0.0 in SolrCloud mode within a Docker container. We
> encountered an issue whereby, after we hit the admin manager in a browser,
> the SystemInfoHandler forked processes that would immediately enter D
> (uninterruptible sleep) state due to a container volume issue. The thread
> stays in runnable state:
> {noformat}
> "qtp43368234-13611" #13611 prio=5 os_prio=0 tid=0x00007f0260011800 nid=0x36fb runnable [0x00007efa0bce1000]
>    java.lang.Thread.State: RUNNABLE
>         at java.lang.UNIXProcess.forkAndExec(Native Method)
>         at java.lang.UNIXProcess.<init>(UNIXProcess.java:248)
>         at java.lang.ProcessImpl.start(ProcessImpl.java:134)
>         at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029)
>         at java.lang.Runtime.exec(Runtime.java:620)
>         at java.lang.Runtime.exec(Runtime.java:450)
>         at java.lang.Runtime.exec(Runtime.java:347)
>         at org.apache.solr.handler.admin.SystemInfoHandler.execute(SystemInfoHandler.java:244)
>         at org.apache.solr.handler.admin.SystemInfoHandler.getSystemInfo(SystemInfoHandler.java:198)
>         at org.apache.solr.handler.admin.SystemInfoHandler.handleRequestBody(SystemInfoHandler.java:111)
>         at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:155)
>         at org.apache.solr.handler.admin.InfoHandler.handleRequestBody(InfoHandler.java:86)
>         at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:155)
>         at org.apache.solr.servlet.HttpSolrCall.handleAdminRequest(HttpSolrCall.java:658)
>         at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:441)
>         at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:229)
>         at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:184)
>         at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1668)
>         at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:581)
>         at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>         at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>         at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
>         at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1160)
>         at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:511)
>         at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
>         at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1092)
>         at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>         at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
>         at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
>         at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
>         at org.eclipse.jetty.server.Server.handle(Server.java:518)
>         at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:308)
>         at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:244)
>         at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
>         at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
>         at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
>         at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceAndRun(ExecuteProduceConsume.java:246)
>         at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:156)
>         at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:654)
>         at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:572)
>         at java.lang.Thread.run(Thread.java:745)
> {noformat}
> The problematic command being executed was 'uname -a'. The admin manager
> would show a "Lost connection to solr" message but presumably retried the
> connection periodically (at least a couple of times a minute). Before we
> figured out what was going on, we had 600+ threads in D state:
> {noformat}
> 4433 solr 20 0 0.399t 0.105t 0.057t D 0.0 86.2 0:00.00 /usr/lib/jvm/java-8-oracle/bin/java -server -Xms48g -Xmx48g -XX:NewRatio=3 -XX:SurvivorRatio=4 -XX:TargetSurvivorRatio=90 -XX:MaxTenuringThre+
> 4434 solr 20 0 0.399t 0.105t 0.057t D 0.0 86.2 0:00.00 /usr/lib/jvm/java-8-oracle/bin/java -server -Xms48g -Xmx48g -XX:NewRatio=3 -XX:SurvivorRatio=4 -XX:TargetSurvivorRatio=90 -XX:MaxTenuringThre+
> 4439 solr 20 0 0.399t 0.105t 0.057t D 0.0 86.2 0:00.00 /usr/lib/jvm/java-8-oracle/bin/java -server -Xms48g -Xmx48g -XX:NewRatio=3 -XX:SurvivorRatio=4 -XX:TargetSurvivorRatio=90 -XX:MaxTenuringThre+
> 4440 solr 20 0 0.399t 0.105t 0.057t D 0.0 86.2 0:00.04 /usr/lib/jvm/java-8-oracle/bin/java -server -Xms48g -Xmx48g -XX:NewRatio=3 -XX:SurvivorRatio=4 -XX:TargetSurvivorRatio=90 -XX:MaxTenuringThre+
> 4461 solr 20 0 0.399t 0.105t 0.057t D 0.0 86.2 0:00.00 /usr/lib/jvm/java-8-oracle/bin/java -server -Xms48g -Xmx48g -XX:NewRatio=3 -XX:SurvivorRatio=4 -XX:TargetSurvivorRatio=90 -XX:MaxTenuringThre+
> 4462 solr 20 0 0.399t 0.105t 0.057t D 0.0 86.2 0:00.00 /usr/lib/jvm/java-8-oracle/bin/java -server -Xms48g -Xmx48g -XX:NewRatio=3 -XX:SurvivorRatio=4 -XX:TargetSurvivorRatio=90 -XX:MaxTenuringThre+
> 4467 solr 20 0 0.399t 0.105t 0.057t D 0.0 86.2 0:00.00 /usr/lib/jvm/java-8-oracle/bin/java -server -Xms48g -Xmx48g -XX:NewRatio=3 -XX:SurvivorRatio=4 -XX:TargetSurvivorRatio=90 -XX:MaxTenuringThre+
> 4470 solr 20 0 0.399t 0.105t 0.057t D 0.0 86.2 0:00.00 /usr/lib/jvm/java-8-oracle/bin/java -server -Xms48g -Xmx48g -XX:NewRatio=3 -XX:SurvivorRatio=4 -XX:TargetSurvivorRatio=90 -XX:MaxTenuringThre+
> 4486 solr 20 0 0.399t 0.105t 0.057t D 0.0 86.2 0:00.00 /usr/lib/jvm/java-8-oracle/bin/java -server -Xms48g -Xmx48g -XX:NewRatio=3 -XX:SurvivorRatio=4 -XX:TargetSurvivorRatio=90 -XX:MaxTenuringThre+
> 4487 solr 20 0 0.399t 0.105t 0.057t D 0.0 86.2 0:00.06 /usr/lib/jvm/java-8-oracle/bin/java -server -Xms48g -Xmx48g -XX:NewRatio=3 -XX:SurvivorRatio=4 -XX:TargetSurvivorRatio=90 -XX:MaxTenuringThre+
> 4488 solr 20 0 0.399t 0.105t 0.057t D 0.0 86.2 0:00.00 /usr/lib/jvm/java-8-oracle/bin/java -server -Xms48g -Xmx48g -XX:NewRatio=3 -XX:SurvivorRatio=4 -XX:TargetSurvivorRatio=90 -XX:MaxTenuringThre+
> 4489 solr 20 0 0.399t 0.105t 0.057t D 0.0 86.2 0:00.00 /usr/lib/jvm/java-8-oracle/bin/java -server -Xms48g -Xmx48g -XX:NewRatio=3 -XX:SurvivorRatio=4 -XX:TargetSurvivorRatio=90 -XX:MaxTenuringThre+
> 4496 solr 20 0 0.399t 0.105t 0.057t D 0.0 86.2 0:00.00 /usr/lib/jvm/java-8-oracle/bin/java -server -Xms48g -Xmx48g -XX:NewRatio=3 -XX:SurvivorRatio=4 -XX:TargetSurvivorRatio=90 -XX:MaxTenuringThre+
> 4497 solr 20 0 0.399t 0.105t 0.057t D 0.0 86.2 0:00.00 /usr/lib/jvm/java-8-oracle/bin/java -server -Xms48g -Xmx48g -XX:NewRatio=3 -XX:SurvivorRatio=4 -XX:TargetSurvivorRatio=90 -XX:MaxTenuringThre+
> 4501 solr 20 0 0.399t 0.105t 0.057t D 0.0 86.2 0:00.00 /usr/lib/jvm/java-8-oracle/bin/java -server -Xms48g -Xmx48g -XX:NewRatio=3 -XX:SurvivorRatio=4 -XX:TargetSurvivorRatio=90 -XX:MaxTenuringThre+
> etc.
> {noformat}
> An OS exec call is a bit heavy for loading the admin page. Might you
> consider either of the following:
> - load this info once at startup and store it
> - use a collapsed panel for display and fetch only on expansion / request
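The first suggestion (load once at startup and store) could look roughly like the sketch below. It is an illustration only, under stated assumptions: the class name CachedCommandOutput, its methods, and the "unknown" fallback are hypothetical and are not Solr's API. The command runs exactly once on a dedicated daemon thread, and request threads only read the stored result with a bounded wait, so they never call exec themselves.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not Solr code: runs a command once and stores the result.
final class CachedCommandOutput {
    private final Future<String> result;
    private final String fallback;

    CachedCommandOutput(Callable<String> command, String fallback) {
        // Daemon thread: if the exec never returns, JVM shutdown is not blocked.
        ExecutorService worker = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "sysinfo-exec");
            t.setDaemon(true);
            return t;
        });
        this.result = worker.submit(command); // started once, at construction
        worker.shutdown();                    // accept no further tasks
        this.fallback = fallback;
    }

    // Returns the stored output, or the fallback if the command has not
    // finished within the timeout or has failed. Never starts a new process.
    String get(long timeoutMillis) {
        try {
            return result.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException | ExecutionException e) {
            return fallback;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return fallback;
        }
    }

    // Reads the first line printed by "uname -a".
    static String runUname() throws IOException {
        Process p = new ProcessBuilder("uname", "-a").redirectErrorStream(true).start();
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(p.getInputStream(), StandardCharsets.UTF_8))) {
            String line = r.readLine();
            return line == null ? "" : line;
        }
    }

    public static void main(String[] args) {
        CachedCommandOutput uname =
                new CachedCommandOutput(CachedCommandOutput::runUname, "unknown");
        System.out.println(uname.get(2000));
    }
}
```

If the exec hangs as described in this issue, only the single worker thread is lost. Each page load then waits at most the chosen timeout before getting the fallback, and no additional processes are forked.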