Hi All,
I am facing a problem when using TTL with TorrentBroadcastFactory in
Spark-1.2.0.
My code is as follows:
val conf = new SparkConf().
setAppName("TTL_Broadcast_vars").
setMaster("local").
//set("spark.broadcast.factory",
"org.apache.spark.broadcast.HttpBroadcastFactory").
set("spark.cleaner.ttl", "2")
val sc = new SparkContext(conf)
val data = "TTL_Broadcast_vars"
val bData = sc.broadcast(data)
sc.parallelize(1 to 3, 3).map(v => {
Thread.sleep(4 * 1000)
bData.value
}).collect().foreach(println)
I got the following error message:
java.io.IOException: org.apache.spark.SparkException: Failed to get
broadcast_0_piece0 of broadcast_0
at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1003)
at
org.apache.spark.broadcast.TorrentBroadcast.readBroadcastBlock(TorrentBroadcast.scala:164)
at
org.apache.spark.broadcast.TorrentBroadcast._value$lzycompute(TorrentBroadcast.scala:64)
at org.apache.spark.broadcast.TorrentBroadcast._value(TorrentBroadcast.scala:64)
at
org.apache.spark.broadcast.TorrentBroadcast.getValue(TorrentBroadcast.scala:87)
at org.apache.spark.broadcast.Broadcast.value(Broadcast.scala:70)
at TTL_Broadcast_vars$$anonfun$main$1.apply(TTL_Broadcast_vars.scala:17)
at TTL_Broadcast_vars$$anonfun$main$1.apply(TTL_Broadcast_vars.scala:15)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
at
scala.collection.TraversableOnce$class.to<http://class.to/>(TraversableOnce.scala:273)
at
scala.collection.AbstractIterator.to<http://scala.collection.abstractiterator.to/>(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
at org.apache.spark.rdd.RDD$$anonfun$16.apply(RDD.scala:780)
at org.apache.spark.rdd.RDD$$anonfun$16.apply(RDD.scala:780)
at
org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1314)
at
org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1314)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
at org.apache.spark.scheduler.Task.run(Task.scala:56)
But if i use HttpBroadcastFactory, it can run successfully.
I'm wondering whether it is a feature of TorrentBroadcastFactory or a bug?
Mars Gu