If I understand correctly, this KIP tries to solve the problem of offsets.topic.replication.factor not being enforced, particularly in the context of "when you have clients or tooling running as the cluster is getting setup". Assuming this problem was observed in production, and not only under test conditions, would it make sense to introduce an additional property: the minimum number of alive brokers required before the offsets topic is allowed to be created?
Currently offsets.topic.replication.factor is used for that purpose, so with offsets.topic.replication.factor set to 3 it is enough to have just 3 brokers up for the offsets topic to be created. All replicas of all partitions of this topic (50 by default) would then be spread over just those 3 brokers, while the cluster might eventually be much larger and would benefit from a wider spread of leadership for the consumer offsets topic partitions. One can achieve a wider spread later, manually, but the narrow spread first has to be detected, and then the provided CLI scripts have to be used to change the replica assignment. IMO it would be better if the desired spread could be configured, even if only indirectly, by configuring the minimum number of alive brokers. If not overridden in server.properties, this new property could default to offsets.topic.replication.factor.

I've been bitten by offsets.topic.replication.factor not being enforced, but only in integration tests. It was almost unpredictable when the offsets topic was ready and the test cluster initialized, so I got lots of false failures and unstable tests. Eventually I found ways to fully initialize the test cluster and got to predictable, deterministic test behavior.

If others have also observed this problem only in their tests, then I don't like the change proposed in the KIP of setting offsets.topic.replication.factor to 1 by default. I understand the backward compatibility goals, but I can imagine late-discovered production issues as a consequence of this change, and I wouldn't like to trade a higher probability of production issues for testing convenience. The current Kafka documentation has a nice note about offsets.topic.replication.factor and related behavior. A note about the new default would have to be a warning in bold and red in the docs, and every broker should log a proper warning if offsets.topic.replication.factor is left at the newly proposed default of 1.
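To make the spread argument above concrete, here is a rough Python sketch. It is only an illustration under stated assumptions: the `min_alive_brokers` parameter stands in for the property suggested above (no such broker setting exists today), and `assign_replicas` is a simplified round-robin placement, not Kafka's actual assignment algorithm.

```python
from collections import Counter


def can_create_offsets_topic(alive_brokers, replication_factor, min_alive_brokers=None):
    """Hypothetical gate: allow creation only once enough brokers are alive.

    min_alive_brokers is an assumed setting; when it is not given it falls
    back to the replication factor, as suggested in the text above.
    """
    if min_alive_brokers is None:
        min_alive_brokers = replication_factor
    required = max(min_alive_brokers, replication_factor)
    return len(alive_brokers) >= required


def assign_replicas(alive_brokers, num_partitions=50, replication_factor=3):
    """Simplified round-robin placement (an assumption, not Kafka's algorithm).

    The first broker in each replica list is treated as the partition leader.
    """
    n = len(alive_brokers)
    return {
        p: [alive_brokers[(p + r) % n] for r in range(replication_factor)]
        for p in range(num_partitions)
    }


def leaders_per_broker(assignment):
    """Count how many partitions each broker leads."""
    return Counter(replicas[0] for replicas in assignment.values())


if __name__ == "__main__":
    # Topic created as soon as 3 brokers are up: all leadership sits on 3 brokers.
    print(leaders_per_broker(assign_replicas([1, 2, 3])))
    # Topic created only once 10 brokers are up: leadership is spread over all 10.
    print(leaders_per_broker(assign_replicas(list(range(1, 11)))))
```

With 3 brokers, the 50 partitions end up with 17, 17 and 16 leaders on those brokers; waiting for 10 brokers gives 5 leaders per broker without any later manual reassignment.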
Kind regards,
Stevo Slavic.

On Thu, Jan 26, 2017 at 8:43 AM, James Cheng <wushuja...@gmail.com> wrote:

>
> > On Jan 25, 2017, at 9:26 PM, Joel Koshy <jjkosh...@gmail.com> wrote:
> >
> > already voted, but one thing worth considering (since this KIP speaks of
> > *enforcement*) is desired behavior if the topic already exists and the
> > config != existing RF.
> >
>
> Yeah, I'm curious about this too.
>
> -James
>
> > On Wed, Jan 25, 2017 at 4:30 PM, Dong Lin <lindon...@gmail.com> wrote:
> >
> >> +1
> >>
> >> On Wed, Jan 25, 2017 at 4:22 PM, Ismael Juma <ism...@juma.me.uk> wrote:
> >>
> >>> An important question is if this needs to wait for a major release or not.
> >>>
> >>> Ismael
> >>>
> >>> On Thu, Jan 26, 2017 at 12:19 AM, Ismael Juma <ism...@juma.me.uk> wrote:
> >>>
> >>>> +1 from me too.
> >>>>
> >>>> Ismael
> >>>>
> >>>> On Thu, Jan 26, 2017 at 12:07 AM, Ewen Cheslack-Postava <e...@confluent.io> wrote:
> >>>>
> >>>>> +1
> >>>>>
> >>>>> Since this is an unusual one, I think it's worth pointing out that the KIP
> >>>>> notes it is really a bug fix, but since it has compatibility implications
> >>>>> the KIP was worth it. It was a sort of intentional bug, but confusing and
> >>>>> dangerous.
> >>>>>
> >>>>> Seems important to fix this ASAP since people are hitting this in practice
> >>>>> and would have to go out of their way to set up monitoring to catch the
> >>>>> issue.
> >>>>>
> >>>>> -Ewen
> >>>>>
> >>>>> On Wed, Jan 25, 2017 at 4:02 PM, Jason Gustafson <ja...@confluent.io> wrote:
> >>>>>
> >>>>>> +1 from me. The current behavior seems both surprising and dangerous.
> >>>>>>
> >>>>>> -Jason
> >>>>>>
> >>>>>> On Wed, Jan 25, 2017 at 3:58 PM, Onur Karaman <onurkaraman.apa...@gmail.com> wrote:
> >>>>>>
> >>>>>>> Hey everyone.
> >>>>>>>
> >>>>>>> I made a bug-fix KIP-115 to enforce offsets.topic.replication.factor:
> >>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-115%3A+Enforce+offsets.topic.replication.factor
> >>>>>>>
> >>>>>>> Comments are welcome.
> >>>>>>>
> >>>>>>> - Onur