On 16 Mar 2016, at 23:43, Dan Burkert 
<d...@cloudera.com> wrote:

After further thought, I think following both of your suggestions (adding a
shutdown hook and making the threads daemon) may have the result I'm looking
for.  I'll check and see if there are other reasons not to use daemon threads
in our networking internals.  More generally, though, what do y'all think
about having Spark shut down or close RelationProviders once they are no
longer needed?  It seems to me that RelationProviders will often be stateful
objects with network and/or file resources.  I checked the C* Spark connector,
and they jump through a bunch of hoops to handle this issue, including
shutdown hooks and a ref-counted cache.
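
For reference, the ref-counting part boils down to roughly the following
hand-wavy Scala sketch (not the connector's actual code; mkClient just stands
in for whatever builds the real client):

import java.io.Closeable
import java.util.concurrent.atomic.AtomicInteger

// Sketch of a ref-counted client cache with a JVM shutdown hook as a
// last resort; hypothetical, not the C* connector's actual code.
class RefCountedClient[C <: Closeable](mkClient: () => C) {
  private var client: Option[C] = None
  private val refs = new AtomicInteger(0)

  def acquire(): C = synchronized {
    if (client.isEmpty) client = Some(mkClient())
    refs.incrementAndGet()
    client.get
  }

  def release(): Unit = synchronized {
    if (refs.decrementAndGet() == 0) {
      client.foreach(_.close()) // real cleanup: flushes, thread shutdown
      client = None
    }
  }

  // Last-ditch cleanup in case a caller never releases.
  Runtime.getRuntime.addShutdownHook(new Thread(new Runnable {
    override def run(): Unit =
      RefCountedClient.this.synchronized { client.foreach(_.close()) }
  }))
}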

I'd recommend using org.apache.spark.util.ShutdownHookManager as the shutdown
hook mechanism; it gives you priority ordering over shutdown hooks, and is
already used in the YARN AM, DiskBlockManager, and elsewhere.
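
Usage is roughly as follows; note that ShutdownHookManager is private[spark],
so connector code would need a shim or to live under an org.apache.spark
package to call it, and the signature here is from memory, so check it against
the source:

import org.apache.spark.util.ShutdownHookManager

// kuduClient is a hypothetical stand-in for whatever holds resources.
val kuduClient = new java.io.Closeable { def close(): Unit = () }

// Register a hook with an explicit (arbitrary, for this sketch) priority;
// higher-priority hooks run earlier, if memory serves.
val hookRef = ShutdownHookManager.addShutdownHook(priority = 50) { () =>
  kuduClient.close()
}

// If the client gets closed earlier through normal paths:
ShutdownHookManager.removeShutdownHook(hookRef)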


One thing to be careful about in shutdown hooks is to shut down in a bounded
time period even if you can't connect to the far end: do make sure there are
timeouts on TCP connects &c. I've hit problems with Hadoop HDFS where, if the
endpoint isn't configured correctly, the shutdown hook blocks, causing
Control-C/kill <pid> interrupts to appear to hang, and of course a second kill
just deadlocks on the original sync. (To deal with that, I ended up
recognising a second Ctrl-C as a trigger for calling
Runtime.getRuntime().halt(), which bails out of the JVM without trying to
invoke those hooks.)
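
Something along these lines, as a sketch of the bounded-wait idea (client here
is a hypothetical stand-in for the real resource holder):

// Do the real close on a daemon worker and give up after a deadline,
// so a wedged endpoint can't make Ctrl-C/kill appear to hang.
val client = new java.io.Closeable { def close(): Unit = () }

Runtime.getRuntime.addShutdownHook(new Thread(new Runnable {
  override def run(): Unit = {
    val worker = new Thread(new Runnable {
      override def run(): Unit = client.close() // may block on a dead endpoint
    })
    worker.setDaemon(true)
    worker.start()
    worker.join(10000) // wait at most 10s, then let shutdown proceed anyway
  }
}))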


- Dan

On Wed, Mar 16, 2016 at 4:04 PM, Dan Burkert 
<d...@cloudera.com> wrote:
Thanks for the replies, responses inline:

On Wed, Mar 16, 2016 at 3:36 PM, Reynold Xin 
<r...@databricks.com> wrote:
There is no way to really know that, because users might run queries at any 
given point.

BTW why can't your threads be just daemon threads?

The bigger issue is that we require the Kudu client to be manually closed so
that it can do necessary cleanup tasks.  During shutdown the client stops its
non-daemon threads and, more importantly, flushes any outstanding batched
writes to the server.
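
For context, the contract from the caller's side looks roughly like this
(package and builder names as I remember them from the Kudu Java client of
that era, so double-check against it; the master address is made up):

import org.kududb.client.KuduClient

val client = new KuduClient.KuduClientBuilder("kudu-master:7051").build()
try {
  // ... write via sessions here; writes may be batched client-side ...
} finally {
  client.close() // flushes outstanding batches and stops the client's threads
}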

On Wed, Mar 16, 2016 at 3:35 PM, Hamel Kothari 
<hamelkoth...@gmail.com> wrote:
Dan,

You could probably just register a JVM shutdown hook yourself: 
https://docs.oracle.com/javase/7/docs/api/java/lang/Runtime.html#addShutdownHook(java.lang.Thread)

This at least would let you close the connections when the application as a 
whole has completed (in standalone) or when your executors have been killed (in 
YARN). I think that's as close as you'll get to knowing when an executor will 
no longer have any tasks in the current state of the world.

The Spark shell will not run shutdown hooks after a <ctrl>-D if there are 
non-daemon threads running.  You can test this with the following input to the 
shell:

new Thread(new Runnable {
  override def run(): Unit = {
    while (true) { println("running"); Thread.sleep(10000) }
  }
}).start()

Runtime.getRuntime.addShutdownHook(new Thread(new Runnable {
  override def run(): Unit = println("shutdown fired")
}))

- Dan



On Wed, Mar 16, 2016 at 3:29 PM, Dan Burkert 
<d...@cloudera.com> wrote:
Hi Reynold,

Is there any way to know when an executor will no longer have any tasks?  It
seems to me there is no timeout that is both long enough to ensure no more
tasks will be scheduled on the executor and short enough to be reasonable to
wait on during an interactive shell shutdown.

- Dan

On Wed, Mar 16, 2016 at 2:40 PM, Reynold Xin 
<r...@databricks.com> wrote:
Maybe just add a watchdog thread and close the connection after some timeout?
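
For the sake of discussion, that would look something like this (all names
hypothetical; callers would bump lastUsed on every operation):

import java.util.concurrent.atomic.AtomicLong

// Watchdog sketch: close the client once it has sat idle past a timeout.
val client = new java.io.Closeable { def close(): Unit = () }
val lastUsed = new AtomicLong(System.currentTimeMillis())
val idleTimeoutMs = 60000L

val watchdog = new Thread(new Runnable {
  override def run(): Unit = {
    while (System.currentTimeMillis() - lastUsed.get() < idleTimeoutMs) {
      Thread.sleep(5000)
    }
    client.close()
  }
})
watchdog.setDaemon(true)
watchdog.start()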


On Wednesday, March 16, 2016, Dan Burkert 
<d...@cloudera.com> wrote:
Hi all,

I'm working on the Spark connector for Apache Kudu, and I've run into an issue 
that is a bit beyond my Spark knowledge. The Kudu connector internally holds an 
open connection to the Kudu cluster
<https://github.com/apache/incubator-kudu/blob/master/java/kudu-spark/src/main/scala/org/kududb/spark/KuduContext.scala#L37>
which internally holds a Netty context with non-daemon threads. When using the
Spark shell with the Kudu connector, exiting the shell via <ctrl>-D causes the 
shell to hang, and a thread dump reveals it's waiting for these non-daemon 
threads.  Registering a JVM shutdown hook to close the Kudu client does not do 
the trick, as it seems that the shutdown hooks are not fired on <ctrl>-D.

I see that there is an internal Spark API for handling shutdown
<https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/util/ShutdownHookManager.scala>;
is there something similar available for cleaning up external data sources?

- Dan




