GitHub user squito opened a pull request: https://github.com/apache/kafka/pull/10

commitOffsets can be passed the offsets to commit

This adds another version of `commitOffsets` that takes the offsets to commit as a parameter.

Without this change, writing correct user code is very hard. Despite Kafka's at-least-once guarantees, most user code doesn't actually have that guarantee, and is almost certainly wrong if it does batch processing. Getting it right requires very careful synchronization between all consumer threads, which is both (1) painful to get right and (2) slow, because all workers must stop during a commit. This small change simplifies a lot of this.

This was discussed extensively on the user mailing list, on the thread "are kafka consumer apps guaranteed to see msgs at least once?"

You can also see an example implementation of a user API that makes use of this to get proper at-least-once guarantees in *user* code, even for batches: https://github.com/quantifind/kafka-utils/pull/1

I'm open to any suggestions on how to add unit tests for this.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/squito/kafka commitOffsets_param

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/10.patch

----
commit cc351474f05618ec3424e98eb33bc36b1abf05a5
Author: Imran Rashid <im...@quantifind.com>
Date:   2013-11-21T20:51:12Z

    allow committing of arbitrary offsets, to facilitate batch processing

commit 81bb36b5652ce3fa208dc7221aa00d69ceb49d7e
Author: Imran Rashid <im...@quantifind.com>
Date:   2013-11-25T01:32:22Z

    add protection against backward commits

----
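
For readers who want a concrete picture of the idea, here is a minimal sketch of committing caller-supplied offsets with a guard against backward commits. It is not the code in this pull request and not Kafka's API: the class `OffsetCommitSketch`, the string partition keys, and the in-memory map are all hypothetical, and "skip any offset lower than the one already committed" is only one plausible reading of the second commit's title.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory stand-in for a consumer's offset store; not Kafka's actual API. */
public class OffsetCommitSketch {
    // Last committed offset per partition, keyed by an illustrative "topic-partition" string.
    private final Map<String, Long> committed = new HashMap<>();

    /**
     * Commits exactly the offsets the caller passes in, rather than whatever
     * the consumer threads have currently reached. An offset that is not
     * greater than the one already committed for that partition is skipped,
     * so a stale batch cannot move the committed position backward.
     *
     * @return the number of partitions whose committed offset advanced
     */
    public synchronized int commitOffsets(Map<String, Long> offsetsToCommit) {
        int advanced = 0;
        for (Map.Entry<String, Long> entry : offsetsToCommit.entrySet()) {
            Long previous = committed.get(entry.getKey());
            if (previous == null || entry.getValue() > previous) {
                committed.put(entry.getKey(), entry.getValue());
                advanced++;
            }
        }
        return advanced;
    }

    /** Returns the committed offset for a partition, or null if none was committed. */
    public synchronized Long committedOffset(String topicPartition) {
        return committed.get(topicPartition);
    }

    public static void main(String[] args) {
        OffsetCommitSketch store = new OffsetCommitSketch();

        // A worker finishes a batch and commits only the offsets it fully processed.
        Map<String, Long> batch = new HashMap<>();
        batch.put("events-0", 120L);
        batch.put("events-1", 75L);
        store.commitOffsets(batch);

        // A slower worker later tries to commit an older position; it is ignored.
        Map<String, Long> stale = new HashMap<>();
        stale.put("events-0", 100L);
        store.commitOffsets(stale);

        System.out.println(store.committedOffset("events-0")); // prints 120
    }
}
```

Because each worker commits only what it has finished processing, the other workers do not need to pause during a commit, which is the synchronization cost the description above refers to.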