On Thu, Apr 5, 2018 at 3:28 PM, Matthias J. Sax <matth...@confluent.io> wrote:
> Ewen, > > I cannot completely follow your argument. Can you elaborate a little > bit? After reading you mail, I am not sure if you prefer config > inheritance or not? And if, to what extend? > So we've had issues with config complexity in a couple of places. On the one hand I think having the ability to specify configs separately for different clients is important. In that regard namespacing the different configs is important. This is particularly important when you can have conflicting configs. This comes up frequently with interceptors because the producer and consumer use the same config name, but require classes that implement different interfaces. If you have both a producer and consumer, you need to have this namespacing. On the other hand, in the most common case where the user wants the same configs for everything (i.e. just pass the bootstrap brokers, or share common authentication configs), the namespacing becomes a pain. We have this issue in connect for worker, producer and consumer settings, where if you want to connect to a secured cluster you duplicate the info. So because of that I like the *idea* of inheritance, but all the proposals I've seen so far basically just say "use base configuration and add overrides". See notes below about why I don't think this is a good idea. > > > > In that mode if you override *anything*, you > >> should specify *everything*. > > Can you give an example for this? > Security is the most obvious example of this. If you're overriding any security config, you clearly have a different security setup than the parent and inheriting stuff is just confusing and requires digging through logs to see the "effective" config if something isn't working. One way to keep this simpler to understand is that you can use the same config from a base config entirely or, if you specify a single config with the special prefix then you *only* use configs. There might be cases where you duplicate the same settings, but the all or nothing approach makes it easy to understand what is happening and you still only pay the cost of larger configs if you need the advanced functionality of customizing different clients. The unfortunate side effect of this is that some things, e.g. bootstrap brokers, probably don't make sense to customize under the different types of clients but you'd still need to duplicate those settings. This is why I was saying I haven't really seen a clean solution to this. > > > > But if it was just, e.g. group.id, client.id, > >> and offset.reset that needed adjustment for a restoreConsumer, that > would > >> be the default and everything else is inherited. > > Don't understand this part. > > > > I think > >> inheritance in these configuration setups needs to be approached very > >> carefully. > > Agreed. I think that our approach is carefully designed though. > > > For follow up on Guozhang's argument, I agree with him, that for most > users the existing prefix would be good enough and the three new > prefixes are for advanced users. > This is maybe where the biggest gap in the proposal is. It just says: > For some use cases, it's required to set different configs for different consumers. Why? what are those use cases? What case isn't handled by the configs we have today? It seems weird that different consumers within the same app would need different client-side settings. -Ewen > > > > -Matthias > > On 4/5/18 11:37 AM, Guozhang Wang wrote: > > Ewen, thanks for your thoughts. > > > > Just to clarify the current KIP does not propose for four new prefixes, > but > > three plus the existing one. So it is > > > > "consumer" > > "main.consumer" > > "restore.consumer" > > "global.consumer" > > > > > > If we design the configs for the first time, I'd be in favor of only > > keeping the last three prefixes. But as of now we have a compatibility > > issue to consider. So the question is if it is worthwhile to break the > > compatibility and enforce users to make code changes. My rationale is > that: > > > > 1) for normal users they would not bother overriding configs for > different > > types of consumers, where "consumer" prefix is good enough for them; and > > today they probably have already made those overrides via "consumer". > > > > 2) for advanced users they would need some additional overrides for > > different types of consumers, and they would go ahead and learn about the > > other three prefixes and set them there. > > > > > > I agree that four prefixes would be more confusing, but if we think use > > case 1)'s popularity is much larger than use case 2), which by the way we > > can still debate on, then I'd argue it's better to not force normal user > > groups from 1) to make code changes to make advanced users from 2) less > > confused about the hierarchy. > > > > > > Guozhang > > > > > > > > > > > > > > On Wed, Apr 4, 2018 at 11:23 PM, Ewen Cheslack-Postava < > e...@confluent.io> > > wrote: > > > >> I think this model is more confusing than it needs to be. > >> > >> We end up with 4 prefixes despite only have 3 types of consumers. We > have > >> prefixes for: "base", "main", "global", and "restore". However, we only > >> instantiate consumers of type "main", "global", and "restore". > >> > >> Until now, we've only had two types of configs mapping to two types of > >> consumers, despite internally having some common shared configs as a > >> baseline to bootstrap the two "public" ones (see > >> StreamsConfig.getCommonConsumerConfigs). Do we want to complicate this > to > >> 4 > >> levels of "public" configs when there are only 3 types of concrete > configs > >> we instantiate? > >> > >> More generally, I worry that we're optimizing too much to avoid > copy/paste > >> in less common cases to the point that we would confuse users with yet > more > >> concepts before they can even write their configuration. What if we > took an > >> (perhaps modified) all or nothing approach to inheriting from the the > >> "main" consumer properties? In that mode if you override *anything*, you > >> should specify *everything*. But if it was just, e.g. group.id, > client.id, > >> and offset.reset that needed adjustment for a restoreConsumer, that > would > >> be the default and everything else is inherited. Same deal for a clearly > >> specified set of configs for global consumers that required override. > >> > >> I feel like I'm also concurrently seeing the opposite side of this > problem > >> in Connect where we namespaced and didn't proactively implement > >> inheritance; and while people find the config duplication annoying (and > >> confusing!), we inevitably find cases where they need it. I think > >> inheritance in these configuration setups needs to be approached very > >> carefully. Admittedly, some of the challenges in Connect don't appear > here > >> (e.g. conflicts in producer/consumer config naming, since this is a > >> Consumer-only KIP), but similar problems arise. > >> > >> -Ewen > >> > >> On Wed, Apr 4, 2018 at 10:56 PM, Boyang Chen <bche...@outlook.com> > wrote: > >> > >>> Thanks Guozhang! I already updated the pull request and KIP to > deprecate > >>> getConsumerConfigs() function. Do you think we could move to a voting > >> stage > >>> now? > >>> > >>> > >>> ________________________________ > >>> From: Guozhang Wang <wangg...@gmail.com> > >>> Sent: Thursday, April 5, 2018 9:52 AM > >>> To: dev@kafka.apache.org > >>> Subject: Re: [DISCUSS] KIP-276: Add StreamsConfig prefix for different > >>> consumers > >>> > >>> I agree that renaming the method in this case may not worth it. Let's > >> keep > >>> the existing function names. > >>> > >>> On Wed, Apr 4, 2018 at 6:06 PM, Matthias J. Sax <matth...@confluent.io > > > >>> wrote: > >>> > >>>> Thanks for updating the KIP. > >>>> > >>>> One more comment. Even if we don't expect users to call > >>>> `StreamsConfig#getConsumerConfigs()` it is still public API. Thus, we > >>>> cannot just rename the method. > >>>> > >>>> I think we have two options: either we keep the current name or we > >>>> deprecate the method and introduce `getMainConsumerConfigs()` in > >>>> parallel. The deprecated method would just call the new method. > >>>> > >>>> I am not sure if we gain a lot renaming the method, thus I have a > >> slight > >>>> preference in just keeping the method name as is. (But I am also ok to > >>>> rename it, if people prefer this) > >>>> > >>>> -Matthias > >>>> > >>>> > >>>> On 4/3/18 1:59 PM, Bill Bejeck wrote: > >>>>> Boyang, > >>>>> > >>>>> Thanks for making changes to the KIP, I'm +1 on the updated version. > >>>>> > >>>>> -Bill > >>>>> > >>>>> On Tue, Apr 3, 2018 at 1:14 AM, Boyang Chen <bche...@outlook.com> > >>> wrote: > >>>>> > >>>>>> Hey friends, > >>>>>> > >>>>>> > >>>>>> both KIP<https://cwiki.apache.org/confluence/display/KAFKA/KIP- > >>>>>> 276+Add+StreamsConfig+prefix+for+different+consumers> and pull > >>> request< > >>>>>> https://github.com/apache/kafka/pull/4805> are updated. Feel free > >> to > >>>> take > >>>>>> another look. > >>>>>> > >>>>>> > >>>>>> > >>>>>> Thanks for your valuable feedback! > >>>>>> > >>>>>> > >>>>>> Best, > >>>>>> > >>>>>> Boyang > >>>>>> > >>>>>> ________________________________ > >>>>>> From: Boyang Chen <bche...@outlook.com> > >>>>>> Sent: Tuesday, April 3, 2018 11:39 AM > >>>>>> To: dev@kafka.apache.org > >>>>>> Subject: Re: [DISCUSS] KIP-276: Add StreamsConfig prefix for > >> different > >>>>>> consumers > >>>>>> > >>>>>> Thanks Matthias, Ted and Guozhang for the inputs. I shall address > >> them > >>>> in > >>>>>> next round. > >>>>>> > >>>>>> > >>>>>> ________________________________ > >>>>>> From: Matthias J. Sax <matth...@confluent.io> > >>>>>> Sent: Tuesday, April 3, 2018 4:43 AM > >>>>>> To: dev@kafka.apache.org > >>>>>> Subject: Re: [DISCUSS] KIP-276: Add StreamsConfig prefix for > >> different > >>>>>> consumers > >>>>>> > >>>>>> Yes, your examples make sense to me. That was the idea behind the > >>>> proposal. > >>>>>> > >>>>>> > >>>>>> -Matthias > >>>>>> > >>>>>> On 4/2/18 11:25 AM, Guozhang Wang wrote: > >>>>>>> @Matthias > >>>>>>> > >>>>>>> That's a good question: I think adding another config for the main > >>>>>> consumer > >>>>>>> makes good tradeoffs between compatibility and new user > >> convenience. > >>>> Just > >>>>>>> to clarify, from user's pov on upgrade: > >>>>>>> > >>>>>>> 1) I'm already overriding some consumer configs, and now I want to > >>>>>> override > >>>>>>> these values differently for restore consumers, I'd add one new > >> line > >>>> for > >>>>>>> the restore consumer prefix. > >>>>>>> > >>>>>>> 2) I'm already overriding some consumer configs, and now I want to > >>> NOT > >>>>>>> overriding them for restore consumers, I'd change my override from > >>>>>>> `consumer.X` to `main.consumer.X`. > >>>>>>> > >>>>>>> 3) I'm new and have not any consumer overrides, and now if I want > >> to > >>>>>>> override some, I'd use `main.consumer`, `restore.consumer` for > >>> specific > >>>>>>> consumer types, and ONLY consider `consumer` for the ones that I > >> want > >>>> to > >>>>>>> apply universally. > >>>>>>> > >>>>>>> 4) I'm already overriding some consumer configs and I'm happy with > >>>> what I > >>>>>>> get, I do not change anything. > >>>>>>> > >>>>>>> > >>>>>>> Guozhang > >>>>>>> > >>>>>>> On Mon, Apr 2, 2018 at 11:10 AM, Ted Yu <yuzhih...@gmail.com> > >> wrote: > >>>>>>> > >>>>>>>> bq. to introduce one more prefix `main.consumer.` > >>>>>>>> > >>>>>>>> Makes sense. > >>>>>>>> > >>>>>>>> On Mon, Apr 2, 2018 at 11:06 AM, Matthias J. Sax < > >>>> matth...@confluent.io > >>>>>>> > >>>>>>>> wrote: > >>>>>>>> > >>>>>>>>> Boyang, > >>>>>>>>> > >>>>>>>>> thanks a lot for the KIP. > >>>>>>>>> > >>>>>>>>> Couple of questions: > >>>>>>>>> > >>>>>>>>>> (MODIFIED) public Map<String, Object> getRestoreConsumerConfigs( > >>>> final > >>>>>>>>> String clientId); > >>>>>>>>> > >>>>>>>>> You mean that the implementation/semantics of this method > >> changes, > >>>> but > >>>>>>>>> not the API itself? Might be good to add this as "comment style" > >>>>>> instead > >>>>>>>>> of "(MODIFIED)" prefix. > >>>>>>>>> > >>>>>>>>>> By rewriting the getRestoreConsumerConfigs() function and adding > >>> the > >>>>>>>>> getGlobalConsumerConfigs() function, if one user uses > >>>>>>>>> restoreConsumerPrefix() or globalConsumerPrefix() when adding new > >>>>>>>>> configurations, the configs shall overwrite base consumer config. > >>> If > >>>>>> not > >>>>>>>>> specified, restore consumer and global consumer shall share the > >>> same > >>>>>>>> config > >>>>>>>>> with base consumer. > >>>>>>>>> > >>>>>>>>> While this does make sense for backward compatibility, I am > >> wonder > >>> if > >>>>>> it > >>>>>>>>> makes the config "inheritance logic" (ie, hierarchy) too complex? > >>> We > >>>>>>>>> basically introduce a second level of overwrites. It might be > >>> simpler > >>>>>> to > >>>>>>>>> not introduce this hierarchy with the cost to break backward > >>>>>>>> compatibility. > >>>>>>>>> > >>>>>>>>> For example, config `request.timeout.ms`: > >>>>>>>>> > >>>>>>>>> User sets `request.timeout.ms=<user-value>` > >>>>>>>>> To change it for the main consumer, user also sets > >>>>>>>>> `consumer.request.timeout.ms=<consumer-value>` > >>>>>>>>> > >>>>>>>>> If user only wants to change the config for main consumer, but > >> not > >>>> for > >>>>>>>>> global or restore consumer, user needs to add two more configs: > >>>>>>>>> > >>>>>>>>> `restore.consumer.request.timeout.ms=<user-value>` > >>>>>>>>> and > >>>>>>>>> `global.consumer.request.timeout.ms=<user-value>` > >>>>>>>>> > >>>>>>>>> to reset both back to the default config. IMHO, this is not an > >>>> optimal > >>>>>>>>> user experience. Thus, it might be worth to change the semantics > >>> for > >>>>>>>>> `consumer.` prefix to only apply those configs to the main > >>> consumer. > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> Not sure what other think what the better solution is (I am not > >>> sure > >>>> by > >>>>>>>>> myself to be honest---just wanted to point it out and discuss the > >>>>>>>>> pros/cons for both). > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> Another though would be, to introduce one more prefix > >>>> `main.consumer.` > >>>>>>>>> -- using this, the existing `consumer.` prefix would apply to all > >>>>>>>>> consumers (keeping it's current semantics) while we have > >> overwrites > >>>> for > >>>>>>>>> all three consumers -- this allow to directly set `main.consumer` > >>>>>>>>> instead of `consumer` avoiding the weird pattern from my example > >>>> above > >>>>>>>>> and preserves backward compatibility. Ie, if we introduce an > >>>> hierarchy > >>>>>>>>> of overwrite, a "full" hierarchy might be better than a "partial" > >>>>>>>>> hierarchy. > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> Looking forward to your thoughts. > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> -Matthias > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> On 4/1/18 5:55 PM, Guozhang Wang wrote: > >>>>>>>>>> Thanks for the KIP Boyang, the KIP looks good to me. > >>>>>>>>>> > >>>>>>>>>> For config values, we use underscore for keeping a single word; > >>> for > >>>>>>>>> config > >>>>>>>>>> keys, though, we do not use underscores or dashes. I'd suggest > >>> using > >>>>>>>> dots > >>>>>>>>>> to be consistent with others. > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> Otherwise I'm +1 on the KIP. > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> Guozhang > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> On Sun, Apr 1, 2018 at 10:56 AM, Ted Yu <yuzhih...@gmail.com> > >>>> wrote: > >>>>>>>>>> > >>>>>>>>>>> Looks good overall. > >>>>>>>>>>> > >>>>>>>>>>> public static final String RESTORE_CONSUMER_PREFIX = > >>>>>>>>> "restore-consumer."; > >>>>>>>>>>> > >>>>>>>>>>> For other constants in StreamsConfig, underscore is used > >> instead > >>> of > >>>>>>>>> dash. > >>>>>>>>>>> > >>>>>>>>>>> Cheers > >>>>>>>>>>> > >>>>>>>>>>> On Sun, Apr 1, 2018 at 9:38 AM, Boyang Chen < > >> bche...@outlook.com > >>>> > >>>>>>>>> wrote: > >>>>>>>>>>> > >>>>>>>>>>>> Hey friends, > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> I would like to start a discussion thread for KIP 276: > >>>>>>>>>>>> > >>>>>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP- > >>>>>>>>>>>> 276+Add+StreamsConfig+prefix+for+different+consumers > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> And pull request is here: > >>>>>>>>>>>> > >>>>>>>>>>>> https://github.com/apache/kafka/pull/4805 > >>>>>>>>>>>> > >>>>>>>>>>>> [https://avatars3.githubusercontent.com/u/5845561?s=400&v=4 > >>>>>>>> ]<https:// > >>>>>>>>>>>> github.com/apache/kafka/pull/4805> > >>>>>>>>>>>> > >>>>>>>>>>>> KAFKA-6657: Add StreamsConfig prefix for different consumers > >> by > >>>>>>>>> abbccdda > >>>>>>>>>>> · > >>>>>>>>>>>> Pull Request #4805 · apache/kafka<https://github. > >>>>>>>>>>>> com/apache/kafka/pull/4805> > >>>>>>>>>>>> github.com > >>>>>>>>>>>> This pull request is for jira 6657. The KIP proposal is here > >>> Added > >>>>>>>> unit > >>>>>>>>>>>> tests for new getGlobalConsumerConfigs API and make sure > >>> existing > >>>>>>>>> restore > >>>>>>>>>>>> consumer tests are passing. > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> Thanks, > >>>>>>>>>>>> > >>>>>>>>>>>> Boyang > >>>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>> > >>>>>>> > >>>>>>> > >>>>>>> > >>>>>> > >>>>>> > >>>>> > >>>> > >>>> > >>> > >>> > >>> -- > >>> -- Guozhang > >>> > >> > > > > > > > >