Re: Time based aggregation in Real time Spark Streaming

2014-12-01 Thread pankaj
Hi, suppose I keep a batch size of 3 minutes. Within one batch there can be incoming records with any timestamp, so it is difficult to keep track of when each 3-minute interval starts and ends. I am doing the output operation on the worker nodes in foreachPartition, not on the driver (foreachRDD), so I cannot use any

Re: Time based aggregation in Real time Spark Streaming

2014-12-01 Thread Bahubali Jain
Hi,
You can associate all the messages of a 3-minute interval with a unique key, then group by that key and finally add up the values.
Thanks

On Dec 1, 2014 9:02 PM, "pankaj" wrote:
> Hi,
>
> My incoming message has a timestamp as one field and I have to perform
> aggregation over 3-minute time slices.
>
> Message
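The keying approach suggested above can be sketched as follows: truncate each message's timestamp down to the start of its 3-minute window and use that as the key, so the aggregation never depends on when a batch happens to begin. This is a minimal plain-Scala sketch (ordinary collections standing in for an RDD/DStream; `bucketKey`, the sample data, and the (timestamp, value) layout are illustrative assumptions, not from the thread):

```scala
// Width of the aggregation window: 3 minutes in milliseconds.
val windowMs = 3 * 60 * 1000L

// Truncate an epoch-millis timestamp down to the start of its 3-minute window.
// Every message in the same window gets the same key, regardless of which
// streaming batch it arrived in.
def bucketKey(tsMillis: Long): Long = tsMillis - (tsMillis % windowMs)

// Hypothetical sample data: (timestampMillis, value) pairs.
val messages = Seq((1000L, 5), (100000L, 3), (200000L, 7))

// Group by window key and sum the values; on an RDD this would be a map to
// (bucketKey(ts), value) followed by reduceByKey(_ + _).
val totals = messages
  .groupBy { case (ts, _) => bucketKey(ts) }
  .map { case (key, vs) => key -> vs.map(_._2).sum }
```

Because the key is derived purely from the message's own timestamp, neither the driver nor the workers need to track where a 3-minute interval started; records with any timestamp simply fall into the bucket their key selects.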