Hi Spark community,

I was hoping someone could help me by running the code snippet below in the
Spark shell and seeing whether they get the same buggy behavior I do. Full
details of the bug are in the JIRA issue I filed:
https://issues.apache.org/jira/browse/SPARK-10942.

The issue was closed as "cannot reproduce", but I can't seem to shake it. I
have worked on this for a while, eliminating every variable I can think of,
trying different versions of Spark (1.5.0, 1.5.1, master) and different OSs
(Mac OS X, Debian Linux). My coworkers have tried as well and see the same
behavior. This has me convinced that I can't be the only one in the
community able to reproduce it.

If you have a minute or two, please open a Spark shell and copy/paste the
code below. After about 30 seconds, check the Storage tab of the Spark UI.
If you see some cached RDDs listed there, then the bug has been reproduced.
If not, then there is no bug... and I may be losing my mind. (There is also
a rough programmatic check sketched after the snippet, if that's easier.)

Thanks in advance!

Nick


------------


import org.apache.spark.streaming.{Seconds, StreamingContext}
import scala.collection.mutable

// 1-second batches on top of the shell's existing SparkContext
val ssc = new StreamingContext(sc, Seconds(1))

// 30 single-element RDDs to feed the queue stream, one per batch
val inputRDDs = mutable.Queue.tabulate(30) { i =>
  sc.parallelize(Seq(i))
}

val input = ssc.queueStream(inputRDDs)

val output = input.transform { rdd =>
  if (rdd.isEmpty()) {
    rdd
  } else {
    // Cache an intermediate RDD that is read twice below; these cached
    // RDDs are what show up (and linger) in the Storage tab.
    val rdd2 = rdd.map(identity)
    rdd2.cache()
    rdd2.setName(rdd.first().toString)
    val rdd3 = rdd2.map(identity) ++ rdd2.map(identity)
    rdd3
  }
}

output.print()

ssc.start()
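
// If it's easier than digging through the UI, here is a rough sketch of the
// same check done from the shell: wait a bit, then poll sc.getPersistentRDDs,
// which as far as I know lists the RDDs still registered as cached. The
// 60-second sleep is arbitrary; adjust as needed.
Thread.sleep(60 * 1000)
val stillCached = sc.getPersistentRDDs
stillCached.foreach { case (id, rdd) =>
  println(s"still cached: id=$id name=${rdd.name} level=${rdd.getStorageLevel}")
}
println(s"cached RDDs remaining: ${stillCached.size}")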




