Hi Steve, I referenced the ShutdownHookManager in my original message, but it appears to be an internal-only API. Looks like it uses a Hadoop equivalent internally, though, so I'll look into using that. Good tip about timeouts, thanks.
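For anyone following along: both Spark's and Hadoop's hook managers are built around the same idea — hooks are registered with a priority and executed highest-priority-first from inside a single JVM shutdown hook. Below is a hypothetical, stdlib-only Java sketch of that idea; it is not the actual Spark or Hadoop implementation, and the class and method names (`HookManager`, `add`, `runAll`) are made up for illustration:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch of priority-ordered shutdown hooks, loosely modelled on
// the idea behind org.apache.spark.util.ShutdownHookManager and its Hadoop
// equivalent (not their real code): all registered hooks run inside a single
// JVM shutdown hook, highest priority first.
public class HookManager {
    private static final class Entry {
        final int priority;
        final Runnable hook;
        Entry(int priority, Runnable hook) { this.priority = priority; this.hook = hook; }
    }

    private final List<Entry> hooks = new ArrayList<>();

    public HookManager() {
        // One real JVM hook dispatches to every registered hook in order.
        Runtime.getRuntime().addShutdownHook(new Thread(this::runAll, "hook-manager"));
    }

    public synchronized void add(int priority, Runnable hook) {
        hooks.add(new Entry(priority, hook));
    }

    // Highest priority first; a failing hook must not prevent the rest from running.
    synchronized void runAll() {
        hooks.sort(Comparator.comparingInt((Entry e) -> e.priority).reversed());
        for (Entry e : hooks) {
            try { e.hook.run(); } catch (Throwable t) { /* log and keep going */ }
        }
    }
}
```

A client would register its `close()` at a lower priority than the services that depend on it, so dependents shut down first.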
- Dan

On Thu, Mar 17, 2016 at 5:02 AM, Steve Loughran <ste...@hortonworks.com> wrote:

> On 16 Mar 2016, at 23:43, Dan Burkert <d...@cloudera.com> wrote:
>
> After further thought, I think following both of your suggestions - adding
> a shutdown hook and making the threads non-daemon - may have the result I'm
> looking for. I'll check and see whether there are other reasons not to use
> daemon threads in our networking internals. More generally, though, what do
> y'all think about having Spark shut down or close RelationProviders once
> they are no longer needed? It seems to me that RelationProviders will often
> be stateful objects with network and/or file resources. I checked with the
> C* Spark connector, and they jump through a bunch of hoops to handle this
> issue, including shutdown hooks and a ref-counted cache.
>
> I'd recommend using org.apache.spark.util.ShutdownHookManager as the
> shutdown hook mechanism; it gives you priority ordering over shutdown, and
> it is already used in the YARN AM, DiskBlockManager, and elsewhere.
>
> One thing to be careful about in shutdown hooks is to shut down in a
> bounded time period even if you can't connect to the far end: do make sure
> there are timeouts on TCP connects &c. I've hit problems with Hadoop HDFS
> where, if the endpoint isn't configured correctly, the shutdown hook
> blocks, causing Control-C/kill <pid> interrupts to appear to hang, and of
> course a second kill just deadlocks on the original sync. (To deal with
> that, I ended up recognising a 2nd Ctrl-C interrupt as a trigger for
> calling System.halt(), which bails out of the JVM without trying to invoke
> those hooks.)
>
> - Dan
>
> On Wed, Mar 16, 2016 at 4:04 PM, Dan Burkert <d...@cloudera.com> wrote:
>
>> Thanks for the replies; responses inline:
>>
>> On Wed, Mar 16, 2016 at 3:36 PM, Reynold Xin <r...@databricks.com> wrote:
>>
>>> There is no way to really know that, because users might run queries at
>>> any given point.
>>> BTW, why can't your threads just be daemon threads?
>>
>> The bigger issue is that we require the Kudu client to be manually closed
>> so that it can do necessary cleanup tasks. During shutdown the client
>> closes the non-daemon threads, but more importantly, it flushes any
>> outstanding batched writes to the server.
>>
>> On Wed, Mar 16, 2016 at 3:35 PM, Hamel Kothari <hamelkoth...@gmail.com> wrote:
>>
>>> Dan,
>>>
>>> You could probably just register a JVM shutdown hook yourself:
>>> https://docs.oracle.com/javase/7/docs/api/java/lang/Runtime.html#addShutdownHook(java.lang.Thread)
>>>
>>> This would at least let you close the connections when the application
>>> as a whole has completed (in standalone) or when your executors have been
>>> killed (in YARN). I think that's as close as you'll get to knowing when an
>>> executor will no longer have any tasks, in the current state of the world.
>>
>> The Spark shell will not run shutdown hooks after a <ctrl>-D if there are
>> non-daemon threads running. You can test this with the following input to
>> the shell:
>>
>> new Thread(new Runnable {
>>   override def run() = {
>>     while (true) { println("running"); Thread.sleep(10000) }
>>   }
>> }).start()
>>
>> Runtime.getRuntime.addShutdownHook(new Thread(new Runnable {
>>   override def run() = println("shutdown fired")
>> }))
>>
>> - Dan
>>
>>> On Wed, Mar 16, 2016 at 3:29 PM, Dan Burkert <d...@cloudera.com> wrote:
>>>
>>>> Hi Reynold,
>>>>
>>>> Is there any way to know when an executor will no longer have any
>>>> tasks? It seems to me there is no timeout that is both long enough to
>>>> ensure that no more tasks will be scheduled on the executor and short
>>>> enough to be appropriate to wait on during an interactive shell shutdown.
>>>>
>>>> - Dan
>>>>
>>>> On Wed, Mar 16, 2016 at 2:40 PM, Reynold Xin <r...@databricks.com> wrote:
>>>>
>>>>> Maybe just add a watchdog thread and close the connection upon some
>>>>> timeout?
>>>>> On Wednesday, March 16, 2016, Dan Burkert <d...@cloudera.com> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I'm working on the Spark connector for Apache Kudu, and I've run into
>>>>>> an issue that is a bit beyond my Spark knowledge. The Kudu connector
>>>>>> internally holds an open connection to the Kudu cluster
>>>>>> <https://github.com/apache/incubator-kudu/blob/master/java/kudu-spark/src/main/scala/org/kududb/spark/KuduContext.scala#L37>,
>>>>>> which internally holds a Netty context with non-daemon threads. When
>>>>>> using the Spark shell with the Kudu connector, exiting the shell via
>>>>>> <ctrl>-D causes the shell to hang, and a thread dump reveals it's
>>>>>> waiting on these non-daemon threads. Registering a JVM shutdown hook
>>>>>> to close the Kudu client does not do the trick, as it seems that
>>>>>> shutdown hooks are not fired on <ctrl>-D.
>>>>>>
>>>>>> I see that there is an internal Spark API for handling shutdown
>>>>>> <https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/util/ShutdownHookManager.scala>;
>>>>>> is there something similar available for cleaning up external data
>>>>>> sources?
>>>>>>
>>>>>> - Dan
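Tying the thread together: Steve's warning is that a shutdown hook which blocks on an unreachable endpoint will hang Ctrl-C and `kill <pid>`. A minimal, hypothetical Java sketch of the defensive pattern he describes is below — run the cleanup on a separate daemon thread and abandon it after a deadline. The name `runWithTimeout` is made up, and the `Runnable` stands in for whatever the real client's `close()`/flush would do:

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class BoundedHook {
    // Run `cleanup` on a daemon thread and give up after `timeoutMs`, so a
    // wedged remote endpoint cannot hang Ctrl-C / kill <pid> during shutdown.
    static boolean runWithTimeout(Runnable cleanup, long timeoutMs) {
        ExecutorService ex = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "shutdown-cleanup");
            t.setDaemon(true); // the cleanup thread itself must not keep the JVM alive
            return t;
        });
        Future<?> f = ex.submit(cleanup);
        try {
            f.get(timeoutMs, TimeUnit.MILLISECONDS);
            return true; // cleanup finished within the deadline
        } catch (TimeoutException | InterruptedException | ExecutionException e) {
            f.cancel(true); // give up: better to lose the flush than to hang forever
            return false;
        } finally {
            ex.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // Register one JVM hook whose total work is bounded, per Steve's advice.
        Runtime.getRuntime().addShutdownHook(new Thread(() ->
            runWithTimeout(() -> { /* e.g. flush batched writes, close client */ }, 5_000)));
    }
}
```

Because the cleanup thread is a daemon, even an abandoned, blocked flush cannot keep the JVM from exiting once the hook returns.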