[ https://issues.apache.org/jira/browse/KAFKA-519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Manikumar resolved KAFKA-519.
-----------------------------
    Resolution: Auto Closed

Closing inactive issue. The old consumer is no longer supported.

> Allow committing the state of a single KafkaStream
> --------------------------------------------------
>
>                 Key: KAFKA-519
>                 URL: https://issues.apache.org/jira/browse/KAFKA-519
>             Project: Kafka
>          Issue Type: Improvement
>    Affects Versions: 0.7, 0.7.1
>            Reporter: Esko Suomi
>            Priority: Minor
>
> We currently consume multiple topics through ZooKeeper by first acquiring a
> ConsumerConnector and then fetching message streams for the wanted topics.
> Once the messages have been consumed, the current consuming state is
> committed with ConsumerConnector#commitOffsets().
> This scheme has a flaw when the consuming application acts as a data-piping
> proxy rather than as a final consuming sink. In our case we read data from
> Kafka, repackage it, and only then move it to persistent storage. The
> repackaging step is relatively long running (usually a few minutes, but it
> may span several hours), and topic throughputs are highly asymmetric: of our
> roughly 20 topics, one receives about 80% of the total throughput. As an
> unwanted side effect of all this, committing the offset whenever the
> per-topic persistence step completes also commits the offsets of every other
> topic, which may eventually manifest as data loss if the consuming
> application, or the machine it runs on, crashes.
> So, while this data loss can be alleviated to some extent with, for example,
> local temp storage, it would be cleaner if KafkaStream itself allowed
> partition-level offset committing.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
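For context, below is a minimal sketch of the consumption scheme the issue describes, written against the 0.8-era high-level consumer Java API (the 0.7 API affected here used slightly different property names, but the shape is the same). The topic names, group id, and ZooKeeper address are placeholders, and the repackaging/persistence step is elided:

{code:java}
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;

import kafka.consumer.Consumer;
import kafka.consumer.ConsumerConfig;
import kafka.consumer.ConsumerIterator;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector;
import kafka.message.MessageAndMetadata;

public class RepackagingProxy {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("zookeeper.connect", "localhost:2181"); // placeholder
        props.put("group.id", "repackaging-proxy");       // placeholder
        props.put("auto.commit.enable", "false");         // commit only after persistence

        ConsumerConnector connector =
                Consumer.createJavaConsumerConnector(new ConsumerConfig(props));

        // One stream per topic; the reporter's setup has roughly 20 topics.
        Map<String, Integer> topicCounts = new HashMap<String, Integer>();
        topicCounts.put("low-volume-topic", 1);  // placeholder names
        topicCounts.put("high-volume-topic", 1);

        Map<String, List<KafkaStream<byte[], byte[]>>> streams =
                connector.createMessageStreams(topicCounts);

        ConsumerIterator<byte[], byte[]> it =
                streams.get("low-volume-topic").get(0).iterator();
        while (it.hasNext()) {
            MessageAndMetadata<byte[], byte[]> msg = it.next();
            // ... repackage msg and move it to persistent storage ...

            // The flaw the issue describes: commitOffsets() commits the
            // consumed position of EVERY stream owned by this connector,
            // including the high-volume topic whose repackaging may still
            // be in flight, so a crash after this call can lose that data.
            connector.commitOffsets();
        }
    }
}
{code}

The improvement requested was a commit scoped to a single KafkaStream (or partition), so that each topic's offsets could be committed independently once its own persistence step had finished.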