Re: getting duplicate messages from duplicate jobs

2019-01-30 Thread Selvaraj chennappan
I have faced the same problem: https://stackoverflow.com/questions/54286486/two-kafka-consumer-in-same-group-and-one-partition On Wed, Jan 30, 2019 at 6:11 PM Avi Levi wrote: > Ok, if you guys think it should be like that, then so be it. All I am > saying is that it is not standard behaviour from

Re: getting duplicate messages from duplicate jobs

2019-01-30 Thread Avi Levi
Ok, if you guys think it should be like that, then so be it. All I am saying is that it is not standard behaviour for a Kafka consumer, at least according to the documentation. I understand that Flink implements things differently, and all I

Re: getting duplicate messages from duplicate jobs

2019-01-28 Thread Tzu-Li (Gordon) Tai
Hi, Yes, Dawid is correct. The "group.id" setting in Flink's Kafka Consumer is only used for group offset fetching and committing offsets back to Kafka (only for exposure purposes, not used for processing guarantees). The Flink Kafka Consumer uses static partition assignment on the KafkaConsumer
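The distinction Gordon describes can be sketched without a broker: Kafka's group protocol (via `subscribe()`) gives each partition to exactly one member of the group, whereas statically assigned consumers, as in Flink's connector, each take every partition of the topic independently. The sketch below is a pure-Python simulation for illustration; the function names and the round-robin split are assumptions, not Kafka's actual assignor.

```python
# Illustrative simulation only -- no Kafka client involved.

def group_subscribe(consumers, partitions):
    """Kafka-style group assignment: each partition goes to exactly
    one member of the consumer group (round-robin here for simplicity)."""
    assignment = {c: [] for c in consumers}
    for i, p in enumerate(partitions):
        assignment[consumers[i % len(consumers)]].append(p)
    return assignment

def static_assign(jobs, partitions):
    """Static assignment: every independently submitted job takes
    the full partition list, regardless of any shared group.id."""
    return {j: list(partitions) for j in jobs}

partitions = ["topic-0"]  # single partition, as in the linked SO question
print(group_subscribe(["c1", "c2"], partitions))  # {'c1': ['topic-0'], 'c2': []}
print(static_assign(["jobA", "jobB"], partitions))  # both jobs read 'topic-0'
```

With group coordination, the second consumer in the group sits idle; with static assignment, two submitted jobs both read the single partition, which is exactly the duplicated output the thread describes.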

Re: getting duplicate messages from duplicate jobs

2019-01-26 Thread Dawid Wysakowicz
Forgot to cc Gordon :) On 23/01/2019 18:02, Avi Levi wrote: > Hi, > This is quite confusing. > I submitted the same stateless job twice (actually I uploaded it once). > However, when I place a message on Kafka, it seems that both jobs > consume it and publish the same result (we publish the result t

Re: getting duplicate messages from duplicate jobs

2019-01-26 Thread Dawid Wysakowicz
Hi Avi, AFAIK Flink's Kafka consumer uses low-level Kafka APIs and does not participate in Kafka's partition assignment protocol; instead, it discovers all available partitions for a given topic and manages offsets itself, which allows it to provide exactly-once guarantees with regard to Flink's internal s
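The "discovers partitions and manages them itself" part can be sketched as follows: within one job, each discovered partition is mapped deterministically to exactly one parallel subtask. The modulo scheme below is an assumption for illustration, not Flink's exact assigner; the point is that the mapping is computed locally by each job, so two separately submitted jobs each cover all partitions.

```python
# Hedged sketch: deterministic, locally computed partition-to-subtask
# mapping, loosely modeled on how a Flink-like consumer might do it.

def assign_to_subtask(partition: int, parallelism: int, start_index: int = 0) -> int:
    """Map a partition index to the subtask that will read it."""
    return (start_index + partition) % parallelism

# One job with parallelism 2 over 4 partitions: each partition is
# owned by exactly one subtask of that job.
owners = {p: assign_to_subtask(p, parallelism=2) for p in range(4)}
print(owners)  # {0: 0, 1: 1, 2: 0, 3: 1}

# A second, independently submitted job computes the same mapping for
# itself, so its subtasks also collectively read every partition --
# which is why both jobs emit the same messages downstream.
```

Since no broker-side coordinator is consulted, nothing prevents two jobs from overlapping; the shared `group.id` only affects where offsets are fetched from and committed to.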

getting duplicate messages from duplicate jobs

2019-01-23 Thread Avi Levi
Hi, This is quite confusing. I submitted the same stateless job twice (actually I uploaded it once). However, when I place a message on Kafka, it seems that both jobs consume it and publish the same result (we publish the result to another Kafka topic, so I actually see the message duplicated on Kafka).