Re: graceful shutdown in external data sources

2016-03-20 Thread Hamel Kothari
Dan, you could probably just register a JVM shutdown hook yourself: https://docs.oracle.com/javase/7/docs/api/java/lang/Runtime.html#addShutdownHook(java.lang.Thread) This at least would let you close the connections when the application as a whole has completed (in standalone) or when your exec
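
For illustration, a minimal sketch of the shutdown-hook approach suggested above, using only the standard `Runtime.addShutdownHook` call. `FakeClient` and `registerCloseHook` are hypothetical names invented for this example; they are not part of the Kudu or Spark APIs, and a real client would release its connections where the sketch only prints.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class ShutdownHookSketch {

    /** Hypothetical stand-in for a client that holds open network connections. */
    public static final class FakeClient implements AutoCloseable {
        private final AtomicBoolean closed = new AtomicBoolean(false);

        @Override
        public void close() {
            // compareAndSet makes close() idempotent, so an explicit close
            // followed by the hook firing does not release resources twice.
            if (closed.compareAndSet(false, true)) {
                System.out.println("client closed");
            }
        }

        public boolean isClosed() {
            return closed.get();
        }
    }

    /** Registers a JVM shutdown hook that closes the given resource; returns the hook thread. */
    public static Thread registerCloseHook(AutoCloseable resource) {
        Thread hook = new Thread(() -> {
            try {
                resource.close();
            } catch (Exception e) {
                // Little can be done this late in the JVM's life; report and move on.
                System.err.println("error while closing resource: " + e);
            }
        }, "client-shutdown-hook");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }

    public static void main(String[] args) {
        FakeClient client = new FakeClient();
        registerCloseHook(client);
        System.out.println("working");
        // When main returns and the JVM begins shutting down, the hook runs and closes the client.
    }
}
```

Shutdown hooks run on normal exit and on signals such as SIGTERM, but not when the process is killed forcibly, so this is best treated as a best-effort cleanup.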

Re: graceful shutdown in external data sources

2016-03-20 Thread Dan Burkert
After further thought, I think following both of your suggestions (adding a shutdown hook and making the threads non-daemon) may have the result I'm looking for. I'll check and see if there are other reasons not to use daemon threads in our networking internals. More generally though, what do y'a

Re: graceful shutdown in external data sources

2016-03-19 Thread Dan Burkert
Hi Reynold, Is there any way to know when an executor will no longer have any tasks? It seems to me there is no timeout which is appropriate that is long enough to ensure that no more tasks will be scheduled on the executor, and short enough to be appropriate to wait on during an interactive shell

Re: graceful shutdown in external data sources

2016-03-19 Thread Reynold Xin
There is no way to really know that, because users might run queries at any given point. BTW why can't your threads be just daemon threads? On Wed, Mar 16, 2016 at 3:29 PM, Dan Burkert wrote: > Hi Reynold, > > Is there any way to know when an executor will no longer have any tasks? > It seems

Re: graceful shutdown in external data sources

2016-03-19 Thread Dan Burkert
Hi Steve, I referenced the ShutdownHookManager in my original message, but it appears to be an internal-only API. Looks like it uses a Hadoop equivalent internally, though, so I'll look into using that. Good tip about timeouts, thanks. - Dan On Thu, Mar 17, 2016 at 5:02 AM, Steve Loughran wr

Re: graceful shutdown in external data sources

2016-03-19 Thread Reynold Xin
Maybe just add a watchdog thread and close the connection upon some timeout? On Wednesday, March 16, 2016, Dan Burkert wrote: > Hi all, > > I'm working on the Spark connector for Apache Kudu, and I've run into an > issue that is a bit beyond my Spark knowledge. The Kudu connector > internally
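
One way the watchdog idea above might look, as a rough sketch only: a daemon thread periodically checks when the resource was last used and closes it once an idle timeout has passed, and the next `get()` reopens it. `IdleWatchdogSketch`, its methods, and the injectable clock are all hypothetical names for this example, not part of any Spark or Kudu API. The sketch does not track in-flight use, so a caller still holding the resource could see it closed underneath it; and, as raised elsewhere in this thread, choosing a suitable timeout is itself an open question.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

public class IdleWatchdogSketch<T extends AutoCloseable> {
    private final Supplier<T> factory;
    private final long idleNanos;
    private final LongSupplier nanoClock;
    private final ScheduledExecutorService watchdog;
    private T resource;          // guarded by "this"; null while closed
    private long lastUsedNanos;  // guarded by "this"

    public IdleWatchdogSketch(Supplier<T> factory, long idleMillis) {
        this(factory, idleMillis, System::nanoTime);
    }

    /** The clock is injectable so the idle check can be exercised without sleeping. */
    public IdleWatchdogSketch(Supplier<T> factory, long idleMillis, LongSupplier nanoClock) {
        this.factory = factory;
        this.idleNanos = TimeUnit.MILLISECONDS.toNanos(idleMillis);
        this.nanoClock = nanoClock;
        // A daemon thread, so the watchdog itself never keeps the JVM alive.
        this.watchdog = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "idle-watchdog");
            t.setDaemon(true);
            return t;
        });
        long periodMillis = Math.max(1L, idleMillis / 2);
        watchdog.scheduleWithFixedDelay(this::closeIfIdle, periodMillis, periodMillis, TimeUnit.MILLISECONDS);
    }

    /** Returns the resource, opening it lazily, and records the time of use. */
    public synchronized T get() {
        if (resource == null) {
            resource = factory.get();
        }
        lastUsedNanos = nanoClock.getAsLong();
        return resource;
    }

    public synchronized boolean isOpen() {
        return resource != null;
    }

    /** Closes the resource if it has been idle for at least the timeout. */
    public synchronized void closeIfIdle() {
        if (resource != null && nanoClock.getAsLong() - lastUsedNanos >= idleNanos) {
            closeQuietly();
        }
    }

    /** Stops the watchdog and closes the resource if it is still open. */
    public synchronized void shutdown() {
        watchdog.shutdownNow();
        closeQuietly();
    }

    private void closeQuietly() {
        if (resource == null) {
            return;
        }
        try {
            resource.close();
        } catch (Exception e) {
            System.err.println("error while closing idle resource: " + e);
        } finally {
            resource = null;
        }
    }
}
```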

Re: graceful shutdown in external data sources

2016-03-19 Thread Dan Burkert
Thanks for the replies, responses inline: On Wed, Mar 16, 2016 at 3:36 PM, Reynold Xin wrote: > There is no way to really know that, because users might run queries at > any given point. > > BTW why can't your threads be just daemon threads? > The bigger issue is that we require the Kudu client

graceful shutdown in external data sources

2016-03-19 Thread Dan Burkert
Hi all, I'm working on the Spark connector for Apache Kudu, and I've run into an issue that is a bit beyond my Spark knowledge. The Kudu connector internally holds an open connection to the Kudu cluster

Re: graceful shutdown in external data sources

2016-03-19 Thread Steve Loughran
On 17 Mar 2016, at 17:46, Dan Burkert <d...@cloudera.com> wrote: Looks like it uses a Hadoop equivalent internally, though, so I'll look into using that. Good tip about timeouts, thanks. Don't think that's actually tagged as @Public, but it would upset too many people if it broke, my

Re: graceful shutdown in external data sources

2016-03-18 Thread Steve Loughran
On 16 Mar 2016, at 23:43, Dan Burkert <d...@cloudera.com> wrote: After further thought, I think following both of your suggestions- adding a shutdown hook and making the threads non-daemon- may have the result I'm looking for. I'll check and see if there are other reasons not to use da