Average of each RDD in Stream

2014-05-15 Thread Laeeq Ahmed
Hi, I use the following code for calculating average. The problem is that the reduce operation return a DStream here and not a tuple as it normally does without Streaming. So how can we get the sum and the count from the DStream. Can we cast it to tuple? val numbers = ssc.textFileStream(args(

Average of each RDD in Stream

2014-05-14 Thread Laeeq Ahmed
Hi, I use the following code for calculating average. The problem is that the reduce operation return a DStream here and not a tuple as it normally does without Streaming. So how can we get the sum and the count from the DStream. Can we cast it to tuple?     val numbers = ssc.textFileStream(a

Re: Average of each RDD in Stream

2014-05-12 Thread Tathagata Das
Use DStream.foreachRDD to do an operation on the final RDD of every batch. val sumandcount = numbers.map(n => (n.toDouble, 1)).reduce{ (a, b) => (a._1 + b._1, a._2 + b._2) } sumandcount.foreachRDD { rdd => val first: (Double, Int) = rdd.take(1) ; ... } DStream.reduce creates DStream whose RDDs h

Average of each RDD in Stream

2014-05-12 Thread Laeeq Ahmed
Hi, I use the following code for calculating average. The problem is that the reduce operation return a DStream here and not a tuple as it normally does without Streaming. So how can we get the sum and the count from the DStream. Can we cast it to tuple? val numbers = ssc.textFileStream(args(

Re: Average of each RDD in Stream

2014-05-12 Thread Sean Owen
You mean you normally get an RDD, right? A DStream is a sequence of RDDs. It kind of depends on what you are trying to accomplish here? sum/count for each RDD in the stream? On Wed, May 7, 2014 at 6:43 PM, Laeeq Ahmed wrote: > Hi, > > I use the following code for calculating average. The problem