The KafkaSink is the last step in my program, after the 2nd deduplication.
I could not yet track down where the duplicates show up. That's a bit difficult
to find out, but I'm trying.
> On 03.09.2015 at 14:14, Stephan Ewen wrote:
Can you tell us where the KafkaSink comes into play? At what point do the
duplicates come up?
On Thu, Sep 3, 2015 at 2:09 PM, Rico Bergmann wrote:
No. I mean the KafkaSink.
A bit more insight into my program: I read from a Kafka topic with
FlinkKafkaConsumer082, then hash-partition the data, then I do a deduplication
(it does not eliminate all duplicates, though). Then some computation, and
afterwards again a deduplication (group by message in a window).
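Why a windowed deduplication "does not eliminate all duplicates" can be sketched in plain Java (this is not the thread's actual Flink job; the window size, data, and class names are made-up assumptions for illustration): the set of seen messages only lives as long as one window, so a duplicate that arrives in a later window survives.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class WindowDedupSketch {

    // Deduplicate a stream in fixed-size windows; the seen-set is
    // cleared at every window boundary, so only duplicates inside
    // the same window are removed.
    static List<String> dedupPerWindow(List<String> stream, int windowSize) {
        List<String> out = new ArrayList<>();
        Set<String> seenInWindow = new HashSet<>();
        for (int i = 0; i < stream.size(); i++) {
            if (i % windowSize == 0) {
                seenInWindow.clear(); // window boundary: forget history
            }
            if (seenInWindow.add(stream.get(i))) {
                out.add(stream.get(i));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // "a" repeats inside window 1 (dropped) and again in window 2 (kept).
        List<String> stream = List.of("a", "a", "b", "a", "c", "c");
        System.out.println(dedupPerWindow(stream, 3)); // [a, b, a, c]
    }
}
```

The second "a" inside the first window is removed, but the "a" in the second window passes through, which matches the behavior Rico describes.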
Do you mean the KafkaSource?
Which KafkaSource are you using? The 0.9.1 FlinkKafkaConsumer082 or the
KafkaSource?
On Thu, Sep 3, 2015 at 1:10 PM, Rico Bergmann wrote:
Hi!
Testing it with the current 0.10 snapshot is not easily possible atm.
But I deactivated checkpointing in my program and still get duplicates in my
output. So it seems not to come only from the checkpointing feature, right?
Maybe the KafkaSink is responsible for this? (Just my guess.)
Cheers Ri
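The guess that a sink could produce duplicates even without checkpointing is plausible under at-least-once delivery: if an acknowledgement is lost, the sink resends the batch, and records that were in fact already written appear twice. A minimal sketch in plain Java (the flaky broker and all names here are assumptions for illustration, not the KafkaSink code):

```java
import java.util.ArrayList;
import java.util.List;

public class RetryingSinkSketch {

    // Pretend broker: stores everything it receives, but "loses"
    // the acknowledgement for the first attempt.
    static class FlakyBroker {
        final List<String> stored = new ArrayList<>();
        int attempts = 0;

        boolean send(List<String> batch) {
            stored.addAll(batch);  // records are actually written...
            return ++attempts > 1; // ...but the first ack is lost
        }
    }

    // At-least-once sink: resend the whole batch until an ack arrives,
    // which re-emits records that already went through.
    static void sinkWithRetry(FlakyBroker broker, List<String> batch) {
        while (!broker.send(batch)) {
            // no ack -> retry
        }
    }

    public static void main(String[] args) {
        FlakyBroker broker = new FlakyBroker();
        sinkWithRetry(broker, List.of("m1", "m2"));
        System.out.println(broker.stored); // [m1, m2, m1, m2]
    }
}
```

This is why duplicates can appear downstream of a sink regardless of whether checkpointing is enabled.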
Hi Rico,
unfortunately the 0.9 branch still seems to have problems with exactly-once
processing and checkpointed operators. We reworked how the checkpoints are
handled for the 0.10 release, so it should work well there.
Could you maybe try running on the 0.10-SNAPSHOT release and see if the
problem persists?
Hi!
I still have an issue... I was now using 0.9.1 and the new
KafkaConnector. But I still get duplicates in my Flink program. Here's the
relevant part:
final FlinkKafkaConsumer082<String> kafkaSrc =
    new FlinkKafkaConsumer082<>(kafkaTopicIn, new SimpleStringSchema(), properties);
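For reference, FlinkKafkaConsumer082 takes its connection settings from the `properties` object passed in. A minimal sketch of what that configuration typically contains in the 0.9.x line (the host names and group id below are placeholders, not values from this thread):

```properties
# Placeholder values -- adjust to your cluster.
bootstrap.servers=localhost:9092
zookeeper.connect=localhost:2181
group.id=my-flink-consumer
```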