Jun Rao created KAFKA-18641:
-------------------------------

             Summary: AsyncKafkaConsumer could lose records with auto offset commit
                 Key: KAFKA-18641
                 URL: https://issues.apache.org/jira/browse/KAFKA-18641
             Project: Kafka
          Issue Type: Bug
          Components: consumer
    Affects Versions: 4.0.0
            Reporter: Jun Rao


In the new AsyncKafkaConsumer, the application thread keeps updating the auto
commit timer through PollEvent. In the consumer network thread, once the timer
expires, it generates an offset commit request with the current offset position
in subscriptions. However, at that point the records before that offset may
have only just been polled from the FetchBuffer and not yet actually processed
by the application. If the application dies immediately, those records may
never be processed, since their offsets could already have been committed.
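
To make the at-least-once expectation concrete, below is a minimal sketch of a
consumer loop that can hit this race. The topic, group id and helper method are
hypothetical and only the public Consumer API is used; the comments mark the
window in which the background auto commit can run ahead of the application's
processing.

{code:java}
// Hypothetical sketch only: topic, group id and config values are made up.
import java.time.Duration;
import java.util.List;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class AutoCommitRaceSketch {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "demo-group");
        // group.protocol=consumer selects the new AsyncKafkaConsumer in 4.0.
        props.put(ConsumerConfig.GROUP_PROTOCOL_CONFIG, "consumer");
        props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "true");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("demo-topic"));
            while (true) {
                // poll() drains records from the FetchBuffer and advances the
                // position in subscriptions for the returned records.
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));

                // If the auto commit timer fires in the background thread at this
                // point, the advanced position is committed even though the loop
                // below has not processed the records yet.
                for (ConsumerRecord<String, String> record : records) {
                    process(record); // a crash here can lose committed-but-unprocessed records
                }
            }
        }
    }

    private static void process(ConsumerRecord<String, String> record) {
        System.out.printf("%s-%d@%d%n", record.topic(), record.partition(), record.offset());
    }
}
{code}

If the process dies inside that window, the committed offset already points
past records the application never handled.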

The ClassicKafkaConsumer doesn't seem to have this problem. In each poll()
call, before fetching new records, it first calls ConsumerCoordinator.poll(),
which generates an OffsetCommitRequest with the current offset position in
subscriptions. Since this is done in the same application thread, it guarantees
that all records returned by the previous poll() have already been processed.
The problem exists in AsyncKafkaConsumer because polling new records and
committing offsets are done in separate threads.
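
For comparison, here is a minimal, self-contained sketch of that ordering. The
class and method names (SimplifiedClassicPoll, maybeAutoCommit,
fetchNewRecords) are hypothetical stand-ins, not the real ClassicKafkaConsumer
internals.

{code:java}
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the single-threaded ordering described above.
public class SimplifiedClassicPoll {

    private long position = 0L;        // next offset to fetch, as tracked in subscriptions
    private long committedOffset = 0L; // last offset included in an OffsetCommitRequest

    // Everything below runs in the one application thread.
    public List<Long> poll() {
        // Step 1: auto commit first. The position only covers records already
        // returned by earlier poll() calls, i.e. records the application has seen.
        maybeAutoCommit();

        // Step 2: only afterwards fetch new records and advance the position.
        return fetchNewRecords();
    }

    private void maybeAutoCommit() {
        committedOffset = position; // stands in for ConsumerCoordinator.poll()'s commit
    }

    private List<Long> fetchNewRecords() {
        List<Long> offsets = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            offsets.add(position++);
        }
        return offsets;
    }
}
{code}

Because the commit and the fetch run in the same thread and in this order, the
committed offset can never get ahead of what the previous poll() returned.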

 

The same problem also exists for async offset commits and for auto offset
commits during rebalance in AsyncKafkaConsumer.


