Re: counting words (not frequency)

2016-07-22 Thread Sameer Wadkar
It is complicated: 1. If you have a file, you should consider using the DataSet API. It is more complicated to use DataStream with files, as you have to simulate a stream from a file. 2. You need a tokenizer for a map operator unless you have one word per line. 3. The sum operator is fine; it will count
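
The tokenize-then-sum idea in the reply above can be sketched in plain Java. This is only an illustration of the logic, not Flink code: the class WordTotal and its methods tokenize and countWords are hypothetical names, and the split pattern is an assumption about what counts as a word.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of any Flink API.
class WordTotal {

    // Split a line into words on runs of non-word characters,
    // dropping the empty tokens that split() can produce.
    static List<String> tokenize(String line) {
        List<String> words = new ArrayList<>();
        for (String token : line.toLowerCase().split("\\W+")) {
            if (!token.isEmpty()) {
                words.add(token);
            }
        }
        return words;
    }

    // Total number of words across all lines (not a per-word frequency).
    static long countWords(List<String> lines) {
        long total = 0;
        for (String line : lines) {
            total += tokenize(line).size();
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("To be, or not to be", "that is the question");
        System.out.println(countWords(lines));
    }
}
```

In a Flink job the tokenizer would play the role of the flatMap function and the summation would be done by the framework; only the per-line logic carries over from this sketch.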

Re: counting words (not frequency)

2016-07-22 Thread Roshan Naik
Seems a bit convoluted for such a simple problem. I am thinking a custom streaming count() operator will simplify. Wasn't able to find examples for custom Streaming operators. -roshan On 7/21/16, 8:00 PM, "hrajaram" wrote: >Can't you use a KeyedStream, I mean keyBy with the sameKey? something
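
The quoted suggestion (key every record by the same key, then sum) amounts to keeping one running total that is updated per word. A minimal plain-Java sketch of that behaviour follows; it does not use the Flink API, and RunningWordCount, KEY, onWord, and run are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical simulation of "keyBy a constant key, then sum"; not Flink code.
class RunningWordCount {

    // Every word is assigned this same key, so all words share one counter.
    static final String KEY = "all";

    // Per-key state; with a constant key it only ever holds one entry.
    private final Map<String, Long> totals = new HashMap<>();

    // Process one word and return the updated running total.
    long onWord(String word) {
        return totals.merge(KEY, 1L, Long::sum);
    }

    // Feed a sequence of words through the counter, collecting each emitted total.
    static List<Long> run(List<String> words) {
        RunningWordCount counter = new RunningWordCount();
        List<Long> emitted = new ArrayList<>();
        for (String word : words) {
            emitted.add(counter.onWord(word));
        }
        return emitted;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("to", "be", "or", "not", "to", "be")));
    }
}
```

Note that a streaming sum of this kind emits an updated total for every incoming word rather than a single final number, and routing everything to one key means a single counter handles all records.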

Re: counting words (not frequency)

2016-07-21 Thread hrajaram
some other good choices but this is the first thing that quickly came to my mind :-) Hari -- View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/counting-words-not-frequency-tp8099p8100.html Sent from the Apache Flink User Mailing List archive.

counting words (not frequency)

2016-07-21 Thread Roshan Naik
Was trying to write a simple streaming Flink program that counts the total words (not the frequency) in a file. I was thinking along the lines of: counts = text.flatMap(new Tokenizer()).count(); // count() isn't part of the streaming APIs (but is supported for batch) Any suggestions on how to do this