> On Jun 10, 2015, at 1:26 PM, Todd Palino <tpal...@gmail.com> wrote:
> 
> For us, group ID is a configuration parameter of the application. So we
> store it in configuration files (generally on disk) and maintain it there
> through our configuration and deployment infrastructure. As you pointed
> out, hard coding the group ID into the application is not usually a good
> pattern.
> 
> If you want to reset, you have a couple of choices. One is that you can just
> switch group names and start fresh. Another is that you can shut down the
> consumer and delete the existing consumer group, then restart. You could
> also stop, edit the offsets to set them to something specific (if you need
> to roll back to a specific point, for example), and restart.
> 

Thanks Todd. That helps. The "on disk" storage doesn't work well if you are 
running consumers on ephemeral nodes like EC2 machines, but in that case, I 
guess you would save the group ID in some other data store ("on disk, but 
elsewhere") associated with your "application cluster" rather than any one node 
of the cluster.
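
To make that concrete, here is a minimal sketch of what I have in mind: resolve the group ID at startup from the environment first, then fall back to a properties file. Everything named here is my own assumption for illustration (the KAFKA_GROUP_ID variable, the helper names, the file location); only the "group.id" key comes from the consumer config.

```python
import os


def load_properties(path):
    """Parse a simple key=value file, skipping blank lines and '#' comments."""
    props = {}
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            props[key.strip()] = value.strip()
    return props


def resolve_group_id(env=None, config_path=None):
    """Return the consumer group ID for this deployment.

    Lookup order (an assumption, not a Kafka convention):
      1. the hypothetical KAFKA_GROUP_ID environment variable
      2. the "group.id" key in the properties file at config_path
    Raises ValueError rather than silently inventing a group ID.
    """
    env = os.environ if env is None else env
    group_id = env.get("KAFKA_GROUP_ID")
    if group_id:
        return group_id
    if config_path and os.path.exists(config_path):
        group_id = load_properties(config_path).get("group.id")
        if group_id:
            return group_id
    raise ValueError("no consumer group ID configured")
```

On an ephemeral node the environment variable (or the file) would be injected by whatever provisions the instance, so a test instance and a production instance can run the same code with different group IDs.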

I often hear about people saving their offsets using the consumer, and 
monitoring offsets for lag. I don't hear much about people deleting or 
changing/setting offsets by other means. How is it usually done? Are there 
tools to change the offsets, or do people go into ZooKeeper to change them 
directly? Or, for broker-stored offsets, do they use the Kafka APIs?
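
For the ZooKeeper case, my understanding is that the high-level consumer keeps each committed offset in a znode under /consumers/<group>/offsets/<topic>/<partition>, so "editing offsets" would mean writing to paths like these while the consumer is stopped. A minimal sketch that only builds the path (the helper name is mine, and it does not talk to ZooKeeper at all):

```python
def offset_znode(group_id, topic, partition):
    """Build the znode path where the ZooKeeper-based high-level consumer
    is understood to store the committed offset for one partition."""
    for name, value in (("group_id", group_id), ("topic", topic)):
        # A '/' inside a name would silently change the path depth.
        if not value or "/" in value:
            raise ValueError(f"invalid {name}: {value!r}")
    if partition < 0:
        raise ValueError(f"invalid partition: {partition}")
    return f"/consumers/{group_id}/offsets/{topic}/{partition}"
```

For example, offset_znode("my-app-prod", "events", 3) gives /consumers/my-app-prod/offsets/events/3.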

-James

> -Todd
> 
> 
> On Wed, Jun 10, 2015 at 1:20 PM, James Cheng <jch...@tivo.com> wrote:
> 
>> Hi,
>> 
>> How are people specifying/persisting/resetting the consumer group
>> identifier ("group.id") when using the high-level consumer?
>> 
>> I understand how it works. I specify some string and all consumers that
>> use that same string will help consume a topic. The partitions will be
>> distributed amongst them for consumption. And when they save their offsets,
>> the offsets will be saved according to the consumer group. That all makes
>> sense to me.
>> 
>> What I don't understand is the best way to set and persist them, and reset
>> them if needed. For example, do I simply hardcode the string in my code? If
>> so, then all deployed instances will have the same value (that's good). If
>> I want to bring up a test instance of that code, or a new installation,
>> though, then it will also share the load (that's bad).
>> 
>> If I pass in a value to my instances, that lets me have different test and
>> production instances of the same code (that's good), but then I have to
>> persist my consumer group id somewhere outside of the process (on disk, in
>> ZooKeeper, etc.). Which then means I need some way to manage *that*
>> identifier (that's... just how it is?).
>> 
>> What if I decide that I want my app to start over? In the case of
>> log-compacted streams, I want to throw away any processing I did and start
>> "from the beginning". Do I change my consumer group, which effectively resets
>> everything? Or do I delete my saved offsets, and then resume with the same
>> consumer group? The latter is functionally equivalent to the former.
>> 
>> Thanks,
>> -James
>> 
>> 
