Dan,

You could probably just register a JVM shutdown hook yourself:
https://docs.oracle.com/javase/7/docs/api/java/lang/Runtime.html#addShutdownHook(java.lang.Thread)
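A minimal sketch of that suggestion — the `Client` class here is just a stand-in for whatever object holds the open connection (the real Kudu client API is not shown):

```java
// Sketch: registering a JVM shutdown hook to close a long-lived client.
// The hook runs on normal JVM exit or SIGTERM, but not on SIGKILL or
// Runtime.halt(), and (as noted below) apparently not on <ctrl>-D in the shell.
public class ShutdownHookExample {

    // Hypothetical resource standing in for the Kudu client connection.
    static class Client implements AutoCloseable {
        volatile boolean closed = false;
        @Override public void close() { closed = true; }
    }

    public static void main(String[] args) {
        Client client = new Client();

        // Register the hook; it fires when the JVM begins shutdown.
        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            client.close();
            System.out.println("client closed: " + client.closed);
        }));

        System.out.println("exiting main");
        // JVM exits here; the hook then closes the client.
    }
}
```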
This at least would let you close the connections when the application as a whole has completed (in standalone) or when your executors have been killed (in YARN). I think that's as close as you'll get to knowing when an executor will no longer have any tasks in the current state of the world.

On Wed, Mar 16, 2016 at 6:30 PM Dan Burkert <d...@cloudera.com> wrote:

> Hi Reynold,
>
> Is there any way to know when an executor will no longer have any tasks?
> It seems to me there is no timeout that is both long enough to ensure no
> more tasks will be scheduled on the executor and short enough to be
> appropriate to wait on during an interactive shell shutdown.
>
> - Dan
>
> On Wed, Mar 16, 2016 at 2:40 PM, Reynold Xin <r...@databricks.com> wrote:
>
>> Maybe just add a watchdog thread and close the connection upon some
>> timeout?
>>
>> On Wednesday, March 16, 2016, Dan Burkert <d...@cloudera.com> wrote:
>>
>>> Hi all,
>>>
>>> I'm working on the Spark connector for Apache Kudu, and I've run into an
>>> issue that is a bit beyond my Spark knowledge. The Kudu connector
>>> internally holds an open connection to the Kudu cluster
>>> <https://github.com/apache/incubator-kudu/blob/master/java/kudu-spark/src/main/scala/org/kududb/spark/KuduContext.scala#L37>,
>>> which internally holds a Netty context with non-daemon threads. When
>>> using the Spark shell with the Kudu connector, exiting the shell via
>>> <ctrl>-D causes the shell to hang, and a thread dump reveals it is
>>> waiting on these non-daemon threads. Registering a JVM shutdown hook to
>>> close the Kudu client does not do the trick, as it seems the shutdown
>>> hooks are not fired on <ctrl>-D.
>>>
>>> I see that there is an internal Spark API for handling shutdown
>>> <https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/util/ShutdownHookManager.scala>;
>>> is there something similar available for cleaning up external data
>>> sources?
>>>
>>> - Dan
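Reynold's watchdog idea can be sketched as a daemon thread that closes the client after some idle timeout; the `Client` class and the timeout value are assumptions for illustration, not the actual Kudu client API:

```java
// Sketch: a daemon watchdog thread that closes the client after a timeout,
// so the client's non-daemon (Netty) threads stop keeping the JVM alive.
public class WatchdogExample {

    // Hypothetical resource standing in for the Kudu client connection.
    static class Client implements AutoCloseable {
        volatile boolean closed = false;
        @Override public void close() { closed = true; }
    }

    public static void main(String[] args) throws InterruptedException {
        Client client = new Client();
        long idleTimeoutMs = 100; // assumption: tune to the workload

        Thread watchdog = new Thread(() -> {
            try {
                Thread.sleep(idleTimeoutMs);
                client.close();
            } catch (InterruptedException ignored) {
                // interrupted before the timeout: leave the client open
            }
        });
        // Daemon status matters: the watchdog itself must never block exit.
        watchdog.setDaemon(true);
        watchdog.start();

        // For demonstration only, wait for the watchdog to fire.
        watchdog.join();
        System.out.println("client closed: " + client.closed);
    }
}
```

The weakness Dan points out still applies: no fixed timeout can distinguish "no more tasks will ever arrive" from "the next task just hasn't been scheduled yet."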