OK, thanks, I'll bear that in mind. I've raised https://issues.apache.org/jira/browse/SOLR-15558 and will add a patch when I have some free time.
On Wed, 21 Jul 2021 at 15:56, Mike Drob <md...@mdrob.com> wrote:
> That seems like a reasonable check to add, the only caution I would advise
> is that a lot of developers use macs for local testing so make sure that
> whatever flags you invoke are generally cross platform compatible, or
> hidden behind appropriate conditions.
>
> On Wed, Jul 21, 2021 at 5:59 AM Colvin Cowie <colvin.cowie....@gmail.com>
> wrote:
>
> > Hello,
> >
> > When calling solr stop on linux, this command is used
> >
> > CHECK_PID=`ps auxww | awk '{print $2}' | grep -w $SOLR_PID | sort -r | tr -d ' '`
> >
> > https://github.com/apache/solr/blob/122c88a0748769432ef62cc3fb94c2226dd67aa7/solr/bin/solr#L871
> >
> > If Solr has stopped but remains as a zombie process then its process entry
> > will remain in the table, so ps auxww will continue to show the PID even
> > after kill -9. So that results in something like this, with 3 minutes
> > wasted waiting for a dead process to exit.
> >
> > [2021-07-21T09:15:12.365Z] Sending stop command to Solr running on port 8983 ... waiting up to 180 seconds to allow Jetty process 12622 to stop gracefully.
> > [2021-07-21T09:18:13.551Z] [|] Solr process 12622 is still running; jstacking it now.
> > [2021-07-21T09:18:21.806Z] 12622: Unable to open socket file /proc/12622/root/tmp/.java_pid12622: target process 12622 doesn't respond within 10500ms or HotSpot VM not loaded
> > [2021-07-21T09:18:21.806Z] Solr process 12622 is still running; forcefully killing it now.
> > [2021-07-21T09:18:21.806Z] Killed process 12622
> > [2021-07-21T09:18:31.678Z] ERROR: Failed to kill previous Solr Java process 12622 ... script fails.
> >
> > But the output of ps auxww does identify Zombie processes under STAT:
> >
> > USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
> > root     12622  1.4  0.0      0     0 pts/1    Z    10:42   0:26 [java] <defunct>
> >
> > So the CHECK_PID could filter out Zombies.
> >
> > Obviously the bigger issue is why the process has ended up as a Zombie (in
> > this case it was because of
> > https://blog.phusion.nl/2015/01/20/docker-and-the-pid-1-zombie-reaping-problem/
> > and not specifying "--init" when running Solr inside a docker container) so
> > maybe a message warning that the process is a zombie is worth having, so
> > that the user has an opportunity to do something about it.
> >
> > I guess I will raise a JIRA issue with a patch to do that unless there's
> > some alternative suggestions?
> >
> > Regards,
> > Colvin
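
For illustration only, here is a rough sketch of the kind of check discussed above: treat a PID as still running only if it has a process table entry and its state is not zombie, and warn when it is a zombie. This is not the patch attached to SOLR-15558 and not code from bin/solr. The function names is_zombie_stat and live_pid are hypothetical, and it assumes that ps accepts -o stat= and -p, which I believe holds for both procps on Linux and the BSD ps on macOS but should be verified before relying on it.

```shell
# Hypothetical helpers, not taken from bin/solr.

# Succeeds if a ps STAT value denotes a zombie (state codes starting with "Z").
is_zombie_stat() {
  case "$1" in
    Z*) return 0 ;;
    *)  return 1 ;;
  esac
}

# Prints the PID if the process exists and is not a zombie; prints nothing otherwise.
# The body runs in a subshell so the pid and stat variables do not leak to the caller.
live_pid() (
  pid=$1
  # Assumption: "ps -o stat= -p PID" prints only the state column, without a header.
  stat=$(ps -o stat= -p "$pid" 2>/dev/null | tr -d ' ')
  if [ -z "$stat" ]; then
    :  # no process table entry: the process is gone
  elif is_zombie_stat "$stat"; then
    echo "WARNING: process $pid is a zombie (defunct); its parent has not reaped it." >&2
  else
    echo "$pid"
  fi
)

# Possible use in place of the existing pipeline (illustrative only):
#   CHECK_PID=$(live_pid "$SOLR_PID")
```

Querying a single PID with -p avoids grepping the full ps auxww listing, and the warning goes to stderr so that it does not end up in CHECK_PID.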