> On Apr 9, 2025, at 2:02 AM, kirin sairento <sairentoki...@gmail.com> wrote:
>
> Hello teams
>
> I have a GitHub repo to reproduce this issue:
> https://github.com/nenryo/kafka_consumer_issue
>
> On Mon, Apr 7, 2025 at 10:13 AM kirin sairento <sairentoki...@gmail.com> wrote:
> I suspect this issue is related to log segment boundaries. When dealing with
> non-contiguous offsets that span across segments, for example:
> - Segment 1 contains offsets [0, 1, 2]
> - Segment 2 contains offsets [4, 6, 7]
>
> The observed behavior is as follows:
> - seek(3) + poll() returns empty data
> - seek(5) + poll() successfully returns data
>
> Is this a bug? My use case requires consuming from a specific position (e.g.,
> seek(msg.offset + 1)), but with non-contiguous offsets, the scenario
> described above may occur.

I am no expert in this subject, but poll() is usually intended to be called in a loop. The KafkaConsumer docs also say that in some cases, e.g. "if the position advances past control records or aborted transactions", it will return no results immediately. I would expect this means that if the first poll returns no records quickly, then subsequent poll calls will return the records.

Similarly to how using wait() / notify() must be done in a loop because your thread can wake up for "no reason at all", I would expect that your poll() loop must handle the case of "no records right now, please try again" in certain edge cases.

If the records are *never* returned, or there is a large latency, then that may be a different matter...

Best,
Steven
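
P.S. To make the "please try again" idea concrete, here is a minimal sketch of a bounded retry loop. It is not Kafka code and I have not run it against your repo: PollRetry and pollUntilRecords are names I made up for illustration, and pollOnce is a stand-in for whatever wraps the real consumer.poll(...) call.

```java
import java.util.List;
import java.util.function.Supplier;

// Hypothetical helper, not part of the Kafka client API.
final class PollRetry {

    // Calls pollOnce until it yields at least one record, or until
    // maxAttempts calls have been made. Returns an empty list if every
    // attempt came back empty, so the caller can tell "try again" apart
    // from "never returned".
    static <T> List<T> pollUntilRecords(Supplier<List<T>> pollOnce, int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            List<T> records = pollOnce.get();
            if (!records.isEmpty()) {
                return records;
            }
            // Empty result: treat it as "no records right now" and poll again.
        }
        return List.of();
    }
}
```

With a real consumer, I would assume pollOnce wraps something like consumer.poll(Duration.ofMillis(500)) issued after the seek(), checking ConsumerRecords.isEmpty() instead of List.isEmpty(). If the loop exhausts its attempts and still gets nothing after seek(3), that would point to the "different matter" case rather than a transient empty poll.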