This was resolved internally - I realised that we had a second consumer running 
against the topic which was committing offsets, because it had neither set a 
group.id nor set enable.auto.commit to false.
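
For anyone who hits the same thing - a minimal sketch of how we’ve now 
configured the second, read-only consumer so it can never commit offsets. The 
topic name and bootstrap address are placeholders, and tpList matches the 
snippets quoted below:

Properties props = new Properties();
props.put("bootstrap.servers", "broker1:9092"); // placeholder address
// Never let this consumer commit offsets, automatically or otherwise
props.put("enable.auto.commit", "false");
props.put("key.deserializer",
    "org.apache.kafka.common.serialization.ByteArrayDeserializer");
props.put("value.deserializer",
    "org.apache.kafka.common.serialization.ByteArrayDeserializer");

KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props);

// assign() rather than subscribe(): no group membership, no rebalances,
// and nothing committed against a shared group.id
List<TopicPartition> tpList =
    Collections.singletonList(new TopicPartition("my-topic", 0));
consumer.assign(tpList);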

Henry

> On 5 Sep 2018, at 08:28, Henry Thacker <he...@bath.edu> wrote:
> 
> Hi,
> 
> In testing, I’ve got a 5-node Kafka cluster with min.insync.replicas set to 
> 4. The cluster is currently running version 0.10.0.1. 
> 
> There’s an application that publishes to a topic - and on restart, it 
> attempts to read the full contents of the topic up to the high watermark 
> before publishing further messages. This particular topic routinely contains 
> between 600,000 and 5 million messages and consists of a single partition, as 
> we rely on precise ordering. To determine the offset of the high watermark, 
> we do something similar to the following:
> 
> // Do not use offset management
> props.put("enable.auto.commit", false);
> props.put("auto.offset.reset", "latest");
> 
> // ...
> 
> consumer.assign(tpList);
> consumer.poll(100);
> 
> // position() is the offset of the next record that would be fetched,
> // so the last existing message should sit one offset behind it
> long maxOffset = consumer.position(tp) - 1;
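> 
> For reference, consumer clients from 0.10.1.0 onwards also expose 
> endOffsets() (added by KIP-79), which asks the broker for the log end offset 
> directly rather than inferring it from the fetch position - a sketch, 
> assuming an upgraded client and the same tpList/tp as above:
> 
> consumer.assign(tpList);
> // endOffsets() returns, per partition, the offset of the next message
> // that would be appended - the high watermark from a reader’s view
> Map<TopicPartition, Long> ends = consumer.endOffsets(tpList);
> long maxOffset = ends.get(tp) - 1;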
> 
> This had been working fine for months, and then all of a sudden we found 
> that maxOffset was returning an offset that could be tens or even thousands 
> of offsets away from the real high watermark. As far as I can tell, the 
> cluster state seems to be OK - there are no offline partitions or 
> out-of-sync replicas when we see this issue. We are also not using 
> transactional messages in Kafka. 
> 
> We found that doing an explicit seekToEnd before checking the position seems 
> to help, e.g.:
> 
> // ...
> 
> consumer.assign(tpList);
> consumer.poll(100);
> 
> // Explicitly reset the fetch position to the current end of the log
> consumer.seekToEnd(tpList);
> long maxOffset = consumer.position(tp) - 1;
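> 
> A side note: seekToEnd() is lazy - it only marks the partitions for reset, 
> and the end offset is actually fetched on the next poll() or position() 
> call - so with manual assignment the initial poll(100) shouldn’t be needed 
> at all. A minimal sketch of the shorter form, assuming the same 
> consumer/tpList/tp:
> 
> consumer.assign(tpList);
> consumer.seekToEnd(tpList);
> // position() triggers the offset lookup and returns the end offset
> long maxOffset = consumer.position(tp) - 1;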
> 
> But I can’t understand why the explicit seekToEnd is necessary when 
> everything seemed to be working fine before. I’m now worried the cluster is 
> in a bad state and that we’re either not capturing its health status 
> correctly or missing some error messages somewhere. 
> 
> Keen to hear any ideas about what this might be, or things I can try.
> 
> Thanks in advance,
> Henry