I should add that I've recently seen this issue as well when using collect. I
believe in my case it was caused by the driver program not having enough heap
space to hold the returned collection.
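
If driver heap is the culprit, one mitigation is to keep large results on the
executors instead of pulling them all back with collect(). A minimal sketch,
assuming an RDD shaped like the myRdd in the quoted code below:

    // Keep the distinct computation distributed; only bring back bounded data.
    val distinctCats = myRdd.map(_._2).distinct()
    val catCount = distinctCats.count()    // a single Long returns to the driver
    val preview  = distinctCats.take(100)  // at most 100 elements cross the wire

Failing that, a larger driver heap (e.g. --driver-memory on spark-submit) may
be enough when the collected result is only modestly large.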

On Thu, May 7, 2015 at 11:05 AM, Richard Marscher <rmarsc...@localytics.com>
wrote:

> By default you would expect to find the log files for the master and workers
> in the relative `logs` directory from the root of the Spark installation on
> each of the respective nodes in the cluster.
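>
> For example, on a typical standalone install the daemon logs are named along
> these lines (the user and host portions below are placeholders):
>
>     $SPARK_HOME/logs/spark-<user>-org.apache.spark.deploy.master.Master-1-<host>.out
>     $SPARK_HOME/logs/spark-<user>-org.apache.spark.deploy.worker.Worker-1-<host>.out
>
> Per-application executor stdout/stderr usually ends up under the `work`
> directory on each worker node.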
>
> On Thu, May 7, 2015 at 10:27 AM, Wang, Ningjun (LNG-NPV) <
> ningjun.w...@lexisnexis.com> wrote:
>
>> > Can you check your local and remote logs?
>>
>>
>>
>> Where are the log files? I see the following in my driver program logs as
>> well as on the Spark UI failed-task page:
>>
>>
>>
>> java.io.IOException: org.apache.spark.SparkException: Failed to get
>> broadcast_2_piece0 of broadcast_2
>>
>>
>>
>> Here is the detailed stack trace.
>>
>> 15/05/06 10:48:51 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, LAB4-WIN03.pcc.lexisnexis.com): java.io.IOException: org.apache.spark.SparkException: Failed to get broadcast_2_piece0 of broadcast_2
>>         at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1156)
>>         at org.apache.spark.broadcast.TorrentBroadcast.readBroadcastBlock(TorrentBroadcast.scala:164)
>>         at org.apache.spark.broadcast.TorrentBroadcast._value$lzycompute(TorrentBroadcast.scala:64)
>>         at org.apache.spark.broadcast.TorrentBroadcast._value(TorrentBroadcast.scala:64)
>>         at org.apache.spark.broadcast.TorrentBroadcast.getValue(TorrentBroadcast.scala:87)
>>         at org.apache.spark.broadcast.Broadcast.value(Broadcast.scala:70)
>>         at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:61)
>>         at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>>         at org.apache.spark.scheduler.Task.run(Task.scala:64)
>>         at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
>>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>         at java.lang.Thread.run(Thread.java:745)
>>
>>
>>
>>
>>
>>
>>
>> Ningjun
>>
>>
>>
>> *From:* Jonathan Coveney [mailto:jcove...@gmail.com]
>> *Sent:* Wednesday, May 06, 2015 5:23 PM
>> *To:* Wang, Ningjun (LNG-NPV)
>> *Cc:* Ted Yu; user@spark.apache.org
>>
>> *Subject:* Re: java.io.IOException: org.apache.spark.SparkException:
>> Failed to get broadcast_2_piece0
>>
>>
>>
>> Can you check your local and remote logs?
>>
>>
>>
>> 2015-05-06 16:24 GMT-04:00 Wang, Ningjun (LNG-NPV) <
>> ningjun.w...@lexisnexis.com>:
>>
>> This problem happens in Spark 1.3.1. It happens when two jobs are running
>> simultaneously, each in its own SparkContext.
>>
>>
>>
>> I don’t remember seeing this bug in Spark 1.2.0. Is it a new bug
>> introduced in Spark 1.3.1?
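>>
>>
>> One thing I still need to verify on our side: Spark supports only one active
>> SparkContext per driver JVM, so if the two jobs ever share a JVM, the first
>> context must be stopped before the second is created. A rough sketch of the
>> safe pattern (conf1 and conf2 are placeholder configurations):
>>
>>     // Only one SparkContext may be active in a JVM at a time.
>>     val sc1 = new SparkContext(conf1)
>>     // ... run the first job ...
>>     sc1.stop() // releases broadcast blocks held by this context
>>
>>     val sc2 = new SparkContext(conf2)
>>     // ... run the second job ...
>>     sc2.stop()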
>>
>>
>>
>> Ningjun
>>
>>
>>
>> *From:* Ted Yu [mailto:yuzhih...@gmail.com]
>> *Sent:* Wednesday, May 06, 2015 11:32 AM
>> *To:* Wang, Ningjun (LNG-NPV)
>> *Cc:* user@spark.apache.org
>> *Subject:* Re: java.io.IOException: org.apache.spark.SparkException:
>> Failed to get broadcast_2_piece0
>>
>>
>>
>> Which release of Spark are you using?
>>
>>
>>
>> Thanks
>>
>>
>> On May 6, 2015, at 8:03 AM, Wang, Ningjun (LNG-NPV) <
>> ningjun.w...@lexisnexis.com> wrote:
>>
>> I ran a job on a Spark standalone cluster and got the exception below.
>>
>>
>>
>> Here is the code that causes the problem:
>>
>>
>>
>> val myRdd: RDD[(String, String, String)] = … // RDD of (docid, category, path)
>>
>> myRdd.persist(StorageLevel.MEMORY_AND_DISK_SER)
>>
>> val cats: Array[String] = myRdd.map(t => t._2).distinct().collect() // this line causes the exception
>>
>>
>>
>>
>>
>> 15/05/06 10:48:51 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, LAB4-WIN03.pcc.lexisnexis.com): java.io.IOException: org.apache.spark.SparkException: Failed to get broadcast_2_piece0 of broadcast_2
>>         at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1156)
>>         at org.apache.spark.broadcast.TorrentBroadcast.readBroadcastBlock(TorrentBroadcast.scala:164)
>>         at org.apache.spark.broadcast.TorrentBroadcast._value$lzycompute(TorrentBroadcast.scala:64)
>>         at org.apache.spark.broadcast.TorrentBroadcast._value(TorrentBroadcast.scala:64)
>>         at org.apache.spark.broadcast.TorrentBroadcast.getValue(TorrentBroadcast.scala:87)
>>         at org.apache.spark.broadcast.Broadcast.value(Broadcast.scala:70)
>>         at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:61)
>>         at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>>         at org.apache.spark.scheduler.Task.run(Task.scala:64)
>>         at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
>>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>         at java.lang.Thread.run(Thread.java:745)
>> Caused by: org.apache.spark.SparkException: Failed to get broadcast_2_piece0 of broadcast_2
>>         at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1$$anonfun$2.apply(TorrentBroadcast.scala:137)
>>         at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1$$anonfun$2.apply(TorrentBroadcast.scala:137)
>>         at scala.Option.getOrElse(Option.scala:120)
>>         at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1.apply$mcVI$sp(TorrentBroadcast.scala:136)
>>         at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1.apply(TorrentBroadcast.scala:119)
>>         at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1.apply(TorrentBroadcast.scala:119)
>>         at scala.collection.immutable.List.foreach(List.scala:318)
>>         at org.apache.spark.broadcast.TorrentBroadcast.org$apache$spark$broadcast$TorrentBroadcast$$readBlocks(TorrentBroadcast.scala:119)
>>         at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$readBroadcastBlock$1.apply(TorrentBroadcast.scala:174)
>>         at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1153)
>>         ... 12 more
>>
>>
>>
>>
>>
>> Any idea what causes the problem and how to avoid it?
>>
>>
>>
>> Thanks
>>
>> Ningjun
>>
>>
>>
>>
>>
>
>
