You are quite right. I am getting this error while profiling my module to
see what is the minimum resources I can use to achieve my SLA.
My point is that if resource constraint creates this problem, then this
issue is just waiting to happen in a larger scenario(Though the probability
of happening will be less)

I hope to get some guidance as to what parameter I can use in order to
totally avoid this issue.
I am guessing spark.shuffle.io.preferDirectBufs = false but I am not sure.

..Manas

On Tue, Mar 15, 2016 at 2:30 PM, Iain Cundy <iain.cu...@amdocs.com> wrote:

> Hi Manas
>
>
>
> I saw a very similar problem while using mapWithState. Timeout on
> BlockManager remove leading to a stall.
>
>
>
> In my case it only occurred when there was a big backlog of micro-batches,
> combined with a shortage of memory. The adding and removing of blocks
> between new and old tasks was interleaved.  Don’t really know what caused
> it. Once I fixed the problems that were causing the backlog – in my case
> state compaction not working with Kryo in 1.6.0 (with Kryo workaround
> rather than patch) – I’ve never seen it again.
>
>
>
> So if you’ve got a backlog or other issue to fix maybe you’ll get lucky
> too J.
>
>
>
> Cheers
>
> Iain
>
>
>
> *From:* manas kar [mailto:poorinsp...@gmail.com]
> *Sent:* 15 March 2016 14:49
> *To:* Ted Yu
> *Cc:* user
> *Subject:* [MARKETING] Re: mapwithstate Hangs with Error cleaning
> broadcast
>
>
>
> I am using spark 1.6.
>
> I am not using any broadcast variable.
>
> This broadcast variable is probably used by the state management of
> mapwithState
>
>
>
> ...Manas
>
>
>
> On Tue, Mar 15, 2016 at 10:40 AM, Ted Yu <yuzhih...@gmail.com> wrote:
>
> Which version of Spark are you using ?
>
>
>
> Can you show the code snippet w.r.t. broadcast variable ?
>
>
>
> Thanks
>
>
>
> On Tue, Mar 15, 2016 at 6:04 AM, manasdebashiskar <poorinsp...@gmail.com>
> wrote:
>
> Hi,
>  I have a streaming application that takes data from a kafka topic and uses
> mapwithstate.
>  After couple of hours of smooth running of the application I see a problem
> that seems to have stalled my application.
> The batch seems to have been stuck after the following error popped up.
> Has anyone seen this error or know what causes it?
> 14/03/2016 21:41:13,295 ERROR org.apache.spark.ContextCleaner: 95 - Error
> cleaning broadcast 7456
> org.apache.spark.rpc.RpcTimeoutException: Futures timed out after [120
> seconds]. This timeout is controlled by spark.rpc.askTimeout
>         at
> org.apache.spark.rpc.RpcTimeout.org
> $apache$spark$rpc$RpcTimeout$$createRpcTimeoutException(RpcTimeout.scala:48)
>         at
>
> org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:63)
>         at
>
> org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
>         at
>
> scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:33)
>         at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:76)
>         at
>
> org.apache.spark.storage.BlockManagerMaster.removeBroadcast(BlockManagerMaster.scala:136)
>         at
>
> org.apache.spark.broadcast.TorrentBroadcast$.unpersist(TorrentBroadcast.scala:228)
>         at
>
> org.apache.spark.broadcast.TorrentBroadcastFactory.unbroadcast(TorrentBroadcastFactory.scala:45)
>         at
>
> org.apache.spark.broadcast.BroadcastManager.unbroadcast(BroadcastManager.scala:67)
>         at
>
> org.apache.spark.ContextCleaner.doCleanupBroadcast(ContextCleaner.scala:233)
>         at
>
> org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1$$anonfun$apply$mcV$sp$2.apply(ContextCleaner.scala:189)
>         at
>
> org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1$$anonfun$apply$mcV$sp$2.apply(ContextCleaner.scala:180)
>         at scala.Option.foreach(Option.scala:236)
>         at
>
> org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1.apply$mcV$sp(ContextCleaner.scala:180)
>         at
> org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1180)
>         at
> org.apache.spark.ContextCleaner.org
> $apache$spark$ContextCleaner$$keepCleaning(ContextCleaner.scala:173)
>         at
> org.apache.spark.ContextCleaner$$anon$3.run(ContextCleaner.scala:68)
> Caused by: java.util.concurrent.TimeoutException: Futures timed out after
> [120 seconds]
>         at
> scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
>         at
> scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
>         at
> scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
>         at
>
> scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
>         at scala.concurrent.Await$.result(package.scala:107)
>         at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
>         ... 12 more
>
>
>
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/mapwithstate-Hangs-with-Error-cleaning-broadcast-tp26500.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org
>
>
>
>
> This message and the information contained herein is proprietary and
> confidential and subject to the Amdocs policy statement, you may review at
> http://www.amdocs.com/email_disclaimer.asp
>

Reply via email to