Re: Flink Cluster Load Distribution Question

2016-09-18 Thread Amir Bahmanyari
Thanx Could you elaborate on writing to all partitions and not just one pls? How can I make sure ? I see all partitions consumed in the dashboard and they get listed when my Beam app starts and KafkaIO read operation gets associated to every single partition What else ? Thanks so much again Sent

Re: Simple batch job hangs if run twice

2016-09-18 Thread Aljoscha Krettek
Hmm, this sound like it could be IDE/Windows specific, unfortunately I don't have access to a windows machine. I'll loop in Chesnay how is using windows. Chesnay, do you maybe have an idea what could be the problem? Have you ever encountered this? On Sat, 17 Sep 2016 at 15:30 Yassine MARZOUGUI w

Re: Flink Cluster Load Distribution Question

2016-09-18 Thread Aljoscha Krettek
Hi, good to see that you're making progress! The number of partitions in the Kafka topic should be >= the number of parallel Flink Slots and the parallelism with which you start the program. You also have to make sure to write to all partitions and not just to one. Cheers, Aljoscha On Sun, 18 Sep

Re: Flink Cluster Load Distribution Question

2016-09-18 Thread amir bahmanyari
Hi Aljoscha,Thanks for your kind response.- We are really benchmarking Beam & its Runners and it happened that we started with Flink.therefore, any change we make to the approach must be a Beam code change that automatically affects the underlying runner.- I changed the TextIO() back to KafkaIO(

Serialization problem for Guava's TreeMultimap

2016-09-18 Thread Yukun Guo
Here is the code snippet: windowedStream.fold(TreeMultimap.create(), new FoldFunction, TreeMultimap>() { @Override public TreeMultimap fold(TreeMultimap topKSoFar, Tuple2 itemCount) throws Exception { String item = itemCount.f0; Lo

Re: Flink Cluster Load Distribution Question

2016-09-18 Thread Aljoscha Krettek
This is not related to Flink, but in Beam you can read from a directory containing many files using something like this (from MinimalWordCount.java in Beam): TextIO.Read.from("gs://apache-beam-samples/shakespeare/*") This will read all the files in the directory in parallel. For reading from Kaf