Tathagata Das wrote
> *Answer 1:* Make sure you set the master to "local[n]" with n > 1
> (assuming you are running in local mode). The way Spark Streaming
> works is that it assigns a core to the data receiver, so if you
> run the program with only one core (i.e., with local or local[1]),
foreachRDD is the underlying function behind print and
saveAsTextFiles. If you are curious, here is the actual source:
https://github.com/apache/spark/blob/master/streaming/src/main/scala/org/apache/spark/streaming/dstream/DStream.scala#L759
Yes, since the RDD is used twice, it's probably best to cache it.
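A minimal sketch of why caching pays off when a result is used twice. This is plain Scala with no Spark on the classpath; `expensive()` is a hypothetical stand-in for recomputing an RDD's lineage, and the counter shows how many times it runs:

```scala
object CacheSketch {
  // Hypothetical stand-in for an expensive RDD lineage; the counter
  // records how many times it is recomputed.
  var computations = 0
  def expensive(): Seq[Int] = { computations += 1; (1 to 3).map(_ * 2) }

  def main(args: Array[String]): Unit = {
    // Uncached: each use recomputes, like running two actions on an
    // unpersisted RDD.
    val twice = expensive().sum + expensive().max
    assert(computations == 2)
    // "Cached": materialize once and reuse, like calling rdd.cache()
    // before the two actions.
    val cached = expensive()
    val once = cached.sum + cached.max
    assert(computations == 3) // only one extra computation for both uses
    println(s"twice=$twice once=$once computations=$computations")
  }
}
```

The same reasoning applies inside foreachRDD: if the body triggers two actions on the same RDD, persist it first and unpersist when done.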
Thanks, Tathagata, very helpful. A follow-up question below...
On Wed, Mar 26, 2014 at 2:27 PM, Tathagata Das
wrote:
>
>
> *Answer 3:* You can do something like
> wordCounts.foreachRDD((rdd: RDD[...], time: Time) => {
>   if (rdd.take(1).size == 1) {
>     // There exists at least one element in the RDD, so process it
>   }
> })
Hi Diana,
I'll answer Q3:
You can check whether an RDD is empty in several ways.
Someone here mentioned that using an iterator was safer:
val isEmpty = rdd.mapPartitions(iter => Iterator(!iter.hasNext)).reduce(_ && _)
You can also check with a fold or rdd.count:
rdd.reduce(_ + _) // can't handle empty RDDs: reduce throws an exception on an empty collection
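A minimal sketch of the iterator trick in plain Scala (no Spark needed): each "partition" reports whether it is empty without materializing any data, and the per-partition answers are AND-ed together, mirroring mapPartitions(...).reduce(_ && _). The object and method names are made up for illustration:

```scala
object EmptyCheckSketch {
  // Simulates rdd.mapPartitions(iter => Iterator(!iter.hasNext)).reduce(_ && _)
  // on a plain list of per-partition iterators: the "RDD" is empty only if
  // every partition is empty.
  def isEmpty(partitions: Seq[Iterator[Int]]): Boolean =
    partitions
      .map(iter => !iter.hasNext) // per-partition emptiness, no elements consumed beyond the check
      .reduce(_ && _)             // AND across partitions

  def main(args: Array[String]): Unit = {
    println(isEmpty(Seq(Iterator.empty, Iterator.empty))) // true
    println(isEmpty(Seq(Iterator.empty, Iterator(1, 2)))) // false
  }
}
```

This is why the iterator check is cheaper than rdd.count: count scans every element of every partition, while hasNext (like take(1)) can short-circuit as soon as anything is found.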
*Answer 1:* Make sure you set the master to "local[n]" with n > 1
(assuming you are running in local mode). The way Spark Streaming
works is that it assigns a core to the data receiver, so if you
run the program with only one core (i.e., with local or local[1]),
then it won't have the resources to process the received data.
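A plain-Scala analogy of the starvation above, with no Spark dependency: a long-running "receiver" task pins one slot of a fixed-size thread pool, just as a streaming receiver pins one core; with a single slot (like local[1]) the "processing" task never gets scheduled. All names here are invented for the sketch:

```scala
import java.util.concurrent.{CountDownLatch, Executors, TimeUnit}

object ReceiverStarvationSketch {
  // Returns true if the "processing" task ran within the timeout.
  // `slots` plays the role of n in local[n]: the receiver permanently
  // occupies one slot, so processing can only run when slots > 1.
  def processingRan(slots: Int): Boolean = {
    val pool = Executors.newFixedThreadPool(slots)
    val done = new CountDownLatch(1)
    pool.submit(new Runnable { // "receiver": holds its slot indefinitely
      def run(): Unit =
        try Thread.sleep(60000) catch { case _: InterruptedException => () }
    })
    pool.submit(new Runnable { // "processing": needs a free slot to run
      def run(): Unit = done.countDown()
    })
    val ran = done.await(500, TimeUnit.MILLISECONDS)
    pool.shutdownNow() // interrupts the sleeping "receiver"
    ran
  }

  def main(args: Array[String]): Unit = {
    println(s"local[1]-like pool: processing ran = ${processingRan(1)}") // false
    println(s"local[2]-like pool: processing ran = ${processingRan(2)}") // true
  }
}
```

In real Spark Streaming the fix is the same shape: give the app at least one core more than the number of receivers, e.g. setMaster("local[2]") for a single receiver.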