One thought I want to bring up about switching optimization on/off: While for the initial release, a boolean flag seems to be sufficient, I could imagine that we apply different and potentially upgrade-incompatible optimizations in future releases. Thus, to me it would make sense to use a String type, to indicate what optimizations are possible based on the release. For example, in next release we accept `null` for disabled and "2.0". If there are any incompatible changes, people can stay with "2.0" optimizations level when upgrading to "2.1" while new applications can use "2.1" optimization level. Old applications would need to do an offline upgrade to get "2.1" optimizations.
I agree with Bill, that switching individual optimizations on/off is too fine grained and hard to maintain. However, for compatibility, it might make sense, to have certain "levels of optimizations" (based on the release) that allow users to stay with on an older level for upgrade purpose. Using the release numbers to encode those "levels" is easy to understand for users and should minimize the mental effort to get the config right. Let me know what you think about this. -Matthias On 5/8/18 3:08 PM, Ted Yu wrote: > Bill:That makes sense. > Using boolean should suffice. > -------- Original message --------From: Bill Bejeck <bbej...@gmail.com> Date: > 5/8/18 2:48 PM (GMT-08:00) To: dev@kafka.apache.org Subject: Re: [DISCUSS] > KIP-295: Add Streams Configuration Allowing for Optional Topology > Optimization > Thanks for the comments Guozhang and Ted. > > Guozhang: > > 1) I'll update the KIP in the "Compatibility, Deprecation and Migration > Plan" with the expected impact of turning on optimization. But at this > point, I have not identified a migration plan that doesn't involve having > to stop all instances and restart. > > 2) Setting the type to String was just so we could have the default of > null, indicating run no optimizations. As for partially enabling > optimizations, I'm not sure I had that in mind, at least at this point. > To me having the topology optimized should be an "all or nothing" > proposition. For now, I'll change the type to boolean (with a default > value of false) to better reflect the intent of the configuration. > > Ted, thanks again for the comments. > > The intent of the new configuration, as I mentioned above, is whether to > turn optimization on or off in aggregate. The main reason for having the > configuration is for backward compatibility. As optimization may result in > a slightly different topology from the original one, we need to allow users > to turn it off until they are ready for migrating to the new topology. > > I don't think having to select each optimization is a feasible solution. I > say this for few reasons: > > 1. Maintenance will be an issue. While the initial target is only a > small number of optimizations, but as the number grows, having to maintain > the number of options in the code will difficult at best. > 2. Users probably don't want to reason about which combination of > optimizations to have. Thinking about which optimizations to turn on or > off raises the complexity of deploying an application. > 3. When using a relational database or other another framework that may > optimize queries, as far as I know, it's not a choice of which > optimizations to have, they are performed automatically. > > Does this make sense? > > > On Mon, May 7, 2018 at 7:49 PM, Ted Yu <yuzhih...@gmail.com> wrote: > >> There are 4 subtasks for KAFKA-6034 >> >> If each optimization can be switched on/off, there should be 4 enums for >> the switch. >> >> On Mon, May 7, 2018 at 4:39 PM, Guozhang Wang <wangg...@gmail.com> wrote: >> >>> Hello Bill, >>> >>> Thanks for the KIP. My comments are the following: >>> >>> 1) We should state clearly what are the expected impact in >> "Compatibility, >>> Deprecation, and Migration Plan" when optimization is turned on. >>> >>> 2) I'm wondering why we define this config as a String typed, rather than >>> boolean typed? Or are you expecting to extend it to allow partially >>> enabling part of the optimizations in the future? >>> >>> >>> Hi Ted, >>> >>> The cover JIRA for topology optimization can be found here: >>> https://issues.apache.org/jira/browse/KAFKA-6034 >>> >>> >>> Guozhang >>> >>> >>> On Mon, May 7, 2018 at 11:04 AM, Ted Yu <yuzhih...@gmail.com> wrote: >>> >>>> Which JIRA is for the Topology Optimization itself ? >>>> >>>> Thanks >>>> >>>> On Mon, May 7, 2018 at 10:26 AM, Bill Bejeck <bbej...@gmail.com> >> wrote: >>>> >>>>> All, >>>>> I'd like to start a discussion about adding a configuration parameter >>>>> allowing for the forthcoming topology optimization to be optional via >>>>> configuration. >>>>> >>>>> >>>>> The KIP can be found here: >>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP- >>>> 295%3A+Add+Streams+ >>>>> Configuration+Allowing+for+Optional+Topology+Optimization >>>>> >>>>> Thanks, >>>>> Bill >>>>> >>>> >>> >>> >>> >>> -- >>> -- Guozhang >>> >>
signature.asc
Description: OpenPGP digital signature