> On Apr 9, 2025, at 2:02 AM, kirin sairento <sairentoki...@gmail.com> wrote:
> 
> Hello teams
> 
> I have a GitHub repo to reproduce this 
> issue: https://github.com/nenryo/kafka_consumer_issue
> 
> 
> 
> On Mon, Apr 7, 2025 at 10:13 AM kirin sairento <sairentoki...@gmail.com> 
> wrote:
> I suspect this issue is related to log segment boundaries. When dealing with 
> non-contiguous offsets that span across segments—for example:
> - Segment 1 contains offsets [0, 1, 2]
> - Segment 2 contains offsets [4, 6, 7]
> 
> The observed behavior is as follows:
> - seek(3) + poll() returns empty data
> - seek(5) + poll() successfully returns data
> 
> Is this a bug? My use case requires consuming from a specific position (e.g., 
> seek(msg.offset + 1)), but with non-contiguous offsets, the scenario 
> described above may occur.


I am no expert on this subject, but poll() is generally meant to be called in 
a loop. The KafkaConsumer docs also say that in some cases,
e.g. "if the position advances past control records or aborted transactions", 
it will return immediately with no records.

I would expect this means that if the first poll returns no records quickly, 
then subsequent poll calls will return the records.

Just as wait() / notify() must be used in a loop because your thread can wake 
up for "no reason at all" (a spurious wakeup), I would expect that
your poll() loop must handle the case of "no records right now, please try 
again" in certain edge cases.
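
To make that concrete, here is a rough sketch of the retry pattern I have in 
mind. It is plain Java with no Kafka dependency: PollRetry and pollUntilData 
are names I made up for illustration, and the Supplier stands in for a call 
such as consumer.poll(Duration.ofMillis(500)). The canned responses in main() 
simulate "empty first, data second"; that is my assumption about what might 
happen after seek(3), not something I have verified against a broker.

```java
import java.util.ArrayDeque;
import java.util.List;
import java.util.Queue;
import java.util.function.Supplier;

public class PollRetry {

    // Hypothetical helper: call pollOnce until it yields a non-empty batch
    // or maxAttempts is exhausted. With a real consumer, pollOnce would wrap
    // consumer.poll(...) and copy the returned records into a List.
    static <T> List<T> pollUntilData(Supplier<List<T>> pollOnce, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            List<T> batch = pollOnce.get();
            if (batch != null && !batch.isEmpty()) {
                return batch; // got records, stop retrying
            }
            // Empty result: treat it as "no records right now" and try again.
        }
        return List.of(); // still nothing after maxAttempts; caller decides what to do
    }

    public static void main(String[] args) {
        // Simulated poll results: the first call is empty, the second returns
        // the offsets of the next segment from the example above.
        Queue<List<Long>> responses =
                new ArrayDeque<>(List.of(List.<Long>of(), List.of(4L, 6L, 7L)));

        List<Long> offsets = pollUntilData(responses::poll, 5);
        System.out.println(offsets); // prints [4, 6, 7]
    }
}
```

If the helper still comes back empty after a generous number of attempts, 
that would be the "never returned" case and worth reporting as a possible bug.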

If the records are *never* returned, or there is a large latency, then that may 
be a different matter...

Best,
Steven
 
