This is actually an expected consequence of using distributed systems. The
Kafka FAQ has a good answer:

https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-HowdoIgetexactly-oncemessagingfromKafka
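
In short: with auto-commit and a consumer group, a message can be redelivered
after a rebalance or a missed commit, so exactly-once is not guaranteed by the
high-level consumer alone. A common workaround is consumer-side
de-duplication keyed on a producer-assigned message ID. Here is a minimal
sketch of that idea (the class name, the ID scheme, and the window size are
all hypothetical, not Kafka settings):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical consumer-side deduplicator: remembers the last `capacity`
// message IDs and reports whether a given ID has been seen before.
// Assumes the producer attaches a unique ID (e.g. a UUID) to each message.
public class Deduplicator {
    private final Map<String, Boolean> seen;

    public Deduplicator(final int capacity) {
        // Access-ordered LinkedHashMap evicts the least-recently-seen ID
        // once the window exceeds `capacity`.
        this.seen = new LinkedHashMap<String, Boolean>(capacity, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Boolean> eldest) {
                return size() > capacity;
            }
        };
    }

    // Returns true if this is the first delivery of the message
    // (process it), false if it is a duplicate (skip it).
    public synchronized boolean firstDelivery(String messageId) {
        return seen.put(messageId, Boolean.TRUE) == null;
    }
}
```

Note this only de-duplicates within one consumer process and within the
window; de-duplicating across the three hosts would need a shared store.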

On Tue, Jun 16, 2015 at 11:06 PM, Kris K <squareksc...@gmail.com> wrote:

> Hi,
>
> While testing message delivery with Kafka, I noticed that a few duplicate
> messages were delivered to consumers in the same consumer group (two
> consumers received the same message a few milliseconds apart). However,
> I do not see any duplication at the producer or broker. One more
> observation: this does not happen when I use only one consumer thread.
>
> I am running 3 brokers (0.8.2.1) with 3 ZooKeeper nodes. The topic has 3
> partitions and a replication factor of 3. For producing, I am using the
> new producer with compression.type=none.
>
> On the consumer end, I have 3 high-level consumers in the same consumer
> group, each running one consumer thread, on three different hosts. Auto
> commit is set to true for the consumers.
>
> The size of each message ranges anywhere between 0.7 KB and 2 MB. The max
> volume for this test is 100 messages/hr.
>
> I looked at the controller log for any sign of a consumer rebalance during
> this time, but did not find any. In the server logs of all the brokers,
> the error java.io.IOException: Connection reset by peer is being written
> almost continuously.
>
> So, is it possible to achieve exactly-once delivery with the current high
> level consumer without needing an extra layer to remove redundancy?
>
> Could you please point me to any settings or logs that would help me tune
> the configuration?
>
> PS: I tried searching for similar discussions, but could not find any. If
> it's already been answered, please provide the link.
>
> Thanks,
> Kris
>



-- 
Adam Shannon | Software Engineer | Banno | Jack Henry
206 6th Ave Suite 1020 | Des Moines, IA 50309 | Cell: 515.867.8337