Hi all,

I'm working on the Spark connector for Apache Kudu, and I've run into an
issue that is a bit beyond my Spark knowledge. The Kudu connector
internally holds an open connection to the Kudu cluster
<https://github.com/apache/incubator-kudu/blob/master/java/kudu-spark/src/main/scala/org/kududb/spark/KuduContext.scala#L37>
which
internally holds a Netty context with non-daemon threads. When using the
Spark shell with the Kudu connector, exiting the shell via <ctrl>-D causes
the shell to hang, and a thread dump reveals it's waiting for these
non-daemon threads. Registering a JVM shutdown hook to close the Kudu
client does not do the trick, as it seems that the shutdown hooks are not
fired on <ctrl>-D.
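
For reference, this is roughly the shape of what I tried. It's just a sketch: `closeKuduClient` here is a stand-in for the real client close call, not the actual connector code:

```scala
// Sketch of the JVM shutdown-hook approach that does not appear to fire
// when the Spark shell is exited via <ctrl>-D.
object KuduShutdown {
  @volatile var closed = false

  // Placeholder for closing the real Kudu client (and its Netty resources).
  def closeKuduClient(): Unit = { closed = true }

  // Register a JVM shutdown hook that closes the client; returns the hook
  // thread so it can be removed later if needed.
  def register(): Thread = {
    val hook = new Thread(new Runnable {
      override def run(): Unit = closeKuduClient()
    })
    Runtime.getRuntime.addShutdownHook(hook)
    hook
  }
}
```
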

I see that there is an internal Spark API for handling shutdown
<https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/util/ShutdownHookManager.scala>.
Is there something similar available for cleaning up external data sources?

- Dan
