It's not just the config; you need to change your code. kafka.auto.commit.interval.ms=3000 means that consumers only commit offsets every 3 seconds, so after any failure or rebalance they will reconsume up to 3 seconds of data per partition. That could be many hundreds or thousands of messages.
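As a rough back-of-the-envelope sketch (the per-partition throughput figures below are hypothetical, not from your setup), the worst-case redelivery window works out to:

```python
def max_redelivered(msgs_per_sec: float, commit_interval_ms: int) -> int:
    """Upper bound on messages reconsumed per partition after a crash
    or rebalance, if offsets were last committed commit_interval_ms ago."""
    return int(msgs_per_sec * commit_interval_ms / 1000)

# At a hypothetical 500 msgs/sec per partition with a 3000 ms auto-commit
# interval, up to 1500 messages per partition can be redelivered.
print(max_redelivered(500, 3000))  # 1500
```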
I would recommend you not use auto-commit at all, and instead manually commit offsets immediately after sending each email or batch of emails.

-hans

> On May 24, 2019, at 4:35 AM, ASHOK MACHERLA <iash...@outlook.com> wrote:
>
> Dear Team
>
> First of all, thanks for the reply on this issue.
>
> Right now we are using these configurations on the consumer side:
>
> kafka.max.poll.records=20
> max.push.batch.size=100
> enable.auto.commit=true
> auto.offset.reset=latest
> kafka.auto.commit.interval.ms=3000
> kafka.session.timeout.ms=10000
> kafka.request.timeout.ms=3000
> kafka.heartbeat.interval.ms=3000
> kafka.max.poll.interval.ms=300000
>
> Can you please suggest changes to the above config parameters?
>
> We are using one Kafka topic with 10 partitions and 10 consumers, and we are
> sending lakhs of emails to the customers.
> Are that many partitions and consumers enough?
>
> Or do I have to increase the partitions and consumers?
>
> Please advise us.
>
> In the consumer logs, it shows:
> consumer group is rebalancing before committed because already group is rebalancing
>
> Sent from Outlook.
>
> ________________________________
> From: Vincent Maurin <vincent.maurin...@gmail.com>
> Sent: Friday, May 24, 2019 3:51:23 PM
> To: users@kafka.apache.org
> Subject: Re: Customers are getting same emails for roughly 30-40 times
>
> It also seems you are using an "at least once" strategy (maybe with
> auto-commit, or committing after sending the email).
> Maybe an "at most once" could be a valid business strategy here?
>
> - at least once (you will deliver all the emails, but you could deliver duplicates):
>   consumeMessages
>   sendEmails
>   commitOffsets
>
> - at most once (you will never deliver duplicates, but you might never deliver a given email):
>   consumeMessages
>   commitOffsets
>   sendEmails
>
> Ideally, you would do "exactly once", but that is hard to achieve in the
> Kafka -> external system scenario. The usual strategy here is to use an
> idempotent operation in combination with an "at least once" strategy.
>
> Best,
> Vincent
>
> On Fri, May 24, 2019 at 10:39 AM Liam Clarke <liam.cla...@adscale.co.nz> wrote:
>
>> Consumers will rebalance if you add partitions, add consumers to the group,
>> or if a consumer leaves the group.
>>
>> Consumers will leave the group after not communicating with the server for
>> a period set by session.timeout.ms. This is usually due to an exception in
>> the code polling with the consumer, or message-processing code taking too long.
>>
>> If your consumers are reprocessing messages and thus causing emails to be
>> re-sent, it implies that they weren't able to commit their offsets before
>> failing/timing out.
>>
>> We had a similar issue in a database sink that consumed from Kafka and
>> duplicated data because it took too long, hit the session timeout, and
>> then wasn't able to commit its offsets.
>>
>> So I'd look closely at your consuming code and log every possible source of
>> exceptions.
>>
>> Kind regards,
>>
>> Liam Clarke
>>
>>> On Fri, 24 May 2019, 7:37 pm ASHOK MACHERLA, <iash...@outlook.com> wrote:
>>>
>>> Dear Team Member
>>>
>>> Currently we are using Kafka 0.10.1 and ZooKeeper 3.4.6. In our
>>> project we have to send bulk emails to customers, and for this purpose we
>>> are using a Kafka cluster setup.
>>>
>>> But customers are getting the same emails roughly 30-40 times, which is
>>> very bad. In this situation our consumer group is showing rebalancing.
>>> Might that be the reason?
>>> Currently we are using one topic with 10 partitions and 10 consumers.
>>> I hope we have enough partitions and consumers.
>>> But I don't know exactly how many partitions and consumers are required to
>>> overcome this issue.
>>>
>>> Can you please suggest how to fix this issue?
>>>
>>> Are any changes required on the Kafka side as well as the consumer side?
>>> How do we stop the rebalancing issue?
>>> Please advise us. Thanks
>>>
>>> Sent from Outlook.
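To illustrate Vincent's point about pairing "at least once" delivery with an idempotent operation, here is a minimal broker-free Python sketch. The message IDs, addresses, and in-memory "sent" set are all hypothetical; in production the dedup store would need to be durable (e.g. a database table keyed by message ID) so it survives consumer restarts:

```python
# Simulate at-least-once redelivery: the same batch is processed twice,
# as happens when a rebalance strikes before offsets are committed.
sent_ids = set()        # durable store in real life, not an in-memory set
emails_delivered = []

def send_email_idempotent(msg_id, address):
    """Send an email at most once per message ID.
    A redelivered message is recognised by its ID and skipped,
    so the customer receives the email exactly once."""
    if msg_id in sent_ids:
        return False    # duplicate delivery: do nothing
    sent_ids.add(msg_id)
    emails_delivered.append(address)
    return True

batch = [(1, "a@example.com"), (2, "b@example.com")]
for msg_id, addr in batch:      # first delivery
    send_email_idempotent(msg_id, addr)
for msg_id, addr in batch:      # redelivery after a rebalance
    send_email_idempotent(msg_id, addr)

print(len(emails_delivered))    # 2, not 4
```

The same shape applies with a real consumer: poll, send each email through the idempotent path, then commit offsets manually, so a rebalance can only ever cause harmless re-sends that the dedup check filters out.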