Thanks!
Could you please elaborate on writing to all partitions and not just one?
How can I make sure?
I see all partitions consumed in the dashboard, they get listed when my Beam
app starts, and the KafkaIO read operation gets associated with every single
partition. What else should I check?
Thanks so much again
Hmm, this sounds like it could be IDE/Windows specific; unfortunately I
don't have access to a Windows machine. I'll loop in Chesnay, who is using
Windows.
Chesnay, do you maybe have an idea what could be the problem? Have you ever
encountered this?
On Sat, 17 Sep 2016 at 15:30 Yassine MARZOUGUI
wrote:
Hi,
good to see that you're making progress! The number of partitions in the
Kafka topic should be >= the number of parallel Flink slots and the
parallelism with which you start the program. You also have to make sure to
write to all partitions and not just to one.
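One common way to spread writes over all partitions is to not give every record the same key: with a null key, the Kafka producer's default partitioner distributes records across partitions, and you can also pick the partition yourself. Below is a minimal round-robin sketch of the idea; the partition count is a hard-coded assumption here (in a real producer you would get it from producer.partitionsFor(topic).size() and pass the chosen partition to new ProducerRecord<>(topic, partition, key, value)):

```java
import java.util.HashMap;
import java.util.Map;

public class PartitionSpread {

    // Round-robin partition chooser; counter is a per-producer sequence number.
    static int nextPartition(long counter, int numPartitions) {
        return (int) (counter % numPartitions);
    }

    public static void main(String[] args) {
        int numPartitions = 4; // assumption; query the cluster in real code
        Map<Integer, Integer> perPartition = new HashMap<>();
        for (long i = 0; i < 100; i++) {
            perPartition.merge(nextPartition(i, numPartitions), 1, Integer::sum);
        }
        // With round-robin, every partition receives an equal share of the records
        System.out.println(perPartition);
    }
}
```

Whichever approach you take, you can confirm in the Kafka dashboard that offsets are advancing on every partition, not just on one.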
Cheers,
Aljoscha
On Sun, 18 Sep
Hi Aljoscha,

Thanks for your kind response.
- We are really benchmarking Beam & its runners, and it happened that we
started with Flink. Therefore, any change we make to the approach must be a
Beam code change that automatically affects the underlying runner.
- I changed the TextIO() back to KafkaIO(
Here is the code snippet:
windowedStream.fold(TreeMultimap.create(), new
    FoldFunction<Tuple2<String, Long>, TreeMultimap<Long, String>>() {
  @Override
  public TreeMultimap<Long, String> fold(TreeMultimap<Long, String> topKSoFar,
      Tuple2<String, Long> itemCount) throws Exception {
    String item = itemCount.f0;
    Lo
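For reference, here is a standalone sketch of the top-K accumulation that the fold above appears to perform, using a plain TreeMap<Long, List<String>> instead of Guava's TreeMultimap so it runs without extra dependencies. Since the snippet is cut off, K and the eviction policy are assumptions:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class TopK {
    static final int K = 3; // assumption: the real K is not shown in the snippet

    // Fold step, analogous to the FoldFunction above: add (item, count) to the
    // accumulator, then evict the smallest counts until only K entries remain.
    static TreeMap<Long, List<String>> fold(TreeMap<Long, List<String>> topKSoFar,
                                            String item, long count) {
        topKSoFar.computeIfAbsent(count, c -> new ArrayList<>()).add(item);
        while (size(topKSoFar) > K) {
            Map.Entry<Long, List<String>> smallest = topKSoFar.firstEntry();
            smallest.getValue().remove(0);
            if (smallest.getValue().isEmpty()) {
                topKSoFar.remove(smallest.getKey());
            }
        }
        return topKSoFar;
    }

    static int size(TreeMap<Long, List<String>> m) {
        return m.values().stream().mapToInt(List::size).sum();
    }

    public static void main(String[] args) {
        TreeMap<Long, List<String>> acc = new TreeMap<>();
        fold(acc, "a", 5);
        fold(acc, "b", 2);
        fold(acc, "c", 9);
        fold(acc, "d", 1); // exceeds K, so the smallest count ("d", 1) is evicted
        System.out.println(acc);
    }
}
```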
This is not related to Flink, but in Beam you can read from a directory
containing many files using something like this (from MinimalWordCount.java
in Beam):
TextIO.Read.from("gs://apache-beam-samples/shakespeare/*")
This will read all the files in the directory in parallel.
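Conceptually, that glob expansion amounts to reading every matching file in the directory. A plain-Java sketch of the same idea, independent of Beam (the temp directory and file names are made up for the demo):

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class GlobRead {

    // Expand the glob and read every matching file's lines, in the spirit of
    // TextIO.Read.from("dir/*") (Beam additionally parallelizes the reads).
    static List<String> readAll(Path dir, String glob) throws IOException {
        List<String> lines = new ArrayList<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir, glob)) {
            for (Path f : files) {
                lines.addAll(Files.readAllLines(f));
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("glob-demo");
        Files.write(dir.resolve("a.txt"), List.of("hello"));
        Files.write(dir.resolve("b.txt"), List.of("world"));
        System.out.println(readAll(dir, "*.txt").size());
    }
}
```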
For reading from Kaf