For windows that large (1 hour), you will probably also have to
increase the batch interval for efficiency.
TD
On Mon, Dec 29, 2014 at 12:16 AM, Akhil Das wrote:
You can use reduceByKeyAndWindow for that. Here's a pretty clean example
https://github.com/apache/spark/blob/master/examples/src/main/scala/org/apache/spark/examples/streaming/TwitterPopularTags.scala
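Putting Akhil's suggestion and TD's note together, a windowed word count might look like the sketch below. This is a minimal, untested outline using the standard Spark Streaming API (`reduceByKeyAndWindow` on a pair DStream); the input path, checkpoint directory, and batch interval are assumptions you would adjust for your deployment.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Minutes, StreamingContext}

object WindowedWordCount {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("WindowedWordCount")
    // Per TD's note: with a 1-hour window, use a larger batch interval
    // (here 1 minute, which also serves as the slide duration).
    val ssc = new StreamingContext(conf, Minutes(1))
    ssc.checkpoint("/tmp/checkpoint") // windowed ops need checkpointing

    // Hypothetical input directory; any DStream source works the same way.
    val words = ssc.textFileStream("/data/incoming").flatMap(_.split("\\s+"))

    // Counts over the last hour, recomputed every minute.
    val windowedCounts = words.map(w => (w, 1))
      .reduceByKeyAndWindow(_ + _, Minutes(60), Minutes(1))

    // Print the top 10 words in the current window.
    windowedCounts.foreachRDD { rdd =>
      rdd.map { case (w, c) => (c, w) }
        .sortByKey(ascending = false)
        .take(10)
        .foreach(println)
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```

The linked TwitterPopularTags example uses the same pattern, additionally passing an inverse reduce function so Spark can subtract counts leaving the window instead of recomputing it from scratch.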
Thanks
Best Regards
On Mon, Dec 29, 2014 at 1:30 PM, Hoai-Thu Vuong wrote:
Dear Spark users,
I've got a program that streams a folder: when a new file is created in this
folder, I count the words that appear in that document and update the counts (I
used StatefulNetworkWordCount to do it), and it works like a charm. However, I
would like to know the difference of the top 10 words at now a
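For reference, the stateful counting Hoai-Thu describes can be sketched with `updateStateByKey`, the same mechanism StatefulNetworkWordCount uses. This is an illustrative outline, not the exact program from the message; the input path and batch interval are assumptions.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object StatefulFolderWordCount {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("StatefulFolderWordCount")
    val ssc = new StreamingContext(conf, Seconds(10))
    ssc.checkpoint("/tmp/checkpoint") // updateStateByKey requires checkpointing

    // Add each batch's counts for a word to its running total.
    val updateFunc = (newCounts: Seq[Int], state: Option[Int]) =>
      Some(newCounts.sum + state.getOrElse(0))

    // Watch a folder; each new file becomes a batch of lines.
    val words = ssc.textFileStream("/data/incoming").flatMap(_.split("\\s+"))
    val runningTotals = words.map(w => (w, 1)).updateStateByKey[Int](updateFunc)

    runningTotals.print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

Comparing the current top 10 against an earlier top 10 would be a separate step, e.g. keeping a window of counts (as in Akhil's suggestion) alongside the running state.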