If I understand correctly, this KIP is trying to solve the problem of
offsets.topic.replication.factor not being enforced, particularly in the
context of "when you have clients or tooling running as the cluster is
getting setup". Assuming this problem was observed in production, and not
only under test conditions, would it make sense to introduce an additional
property: the minimum number of alive brokers required before the offsets
topic is allowed to be created?

Currently offsets.topic.replication.factor is used for that purpose, so
with offsets.topic.replication.factor set to 3 it is enough to have just 3
brokers up for the offsets topic to be created. All replicas of all (by
default 50) partitions of this topic would then be spread over just those 3
brokers, while the cluster might eventually grow much larger and would
benefit from a wider spread of leadership for the consumer offsets topic
partitions.
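
To make the concern concrete, here is a rough sketch of how leadership
would be concentrated. It assumes a simplified round-robin placement
(partition p led by broker p % aliveBrokers), which is only an illustration
and not Kafka's actual replica assignment algorithm. The class and method
names are made up for this example.

```java
import java.util.Map;
import java.util.TreeMap;

// Illustration only: simplified round-robin leader placement,
// NOT Kafka's real assignment logic.
final class OffsetsSpreadSketch {

    // Counts how many partition leaders each alive broker would get
    // if partition p were led by broker (p % aliveBrokers).
    static Map<Integer, Integer> leadersPerBroker(int partitions, int aliveBrokers) {
        Map<Integer, Integer> counts = new TreeMap<>();
        for (int p = 0; p < partitions; p++) {
            counts.merge(p % aliveBrokers, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Topic created while only 3 brokers are up.
        System.out.println(leadersPerBroker(50, 3));   // {0=17, 1=17, 2=16}
        // Topic created once 10 brokers are up.
        System.out.println(leadersPerBroker(50, 10));  // 5 leaders on each of 10 brokers
    }
}
```

With only 3 brokers alive at creation time, every other broker that joins
later holds no offsets topic partitions until someone reassigns them.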

One can achieve a wider spread later, manually. But the narrow spread would
first have to be detected, and then the provided CLI tools/scripts used to
change the replica assignment. IMO it would be better if it were possible
to configure the desired spread, even if only indirectly by configuring the
minimum number of alive brokers. If not overridden in server.properties,
this new property could default to offsets.topic.replication.factor.
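
A minimal sketch of the check I have in mind follows. The property name
offsets.topic.min.alive.brokers is hypothetical (it does not exist in
Kafka), as are the class and method names; the fallback of 3 is the
currently documented default for offsets.topic.replication.factor.

```java
import java.util.Properties;

// Sketch of the proposed gate; not existing broker code.
final class MinAliveBrokersSketch {
    static final String RF = "offsets.topic.replication.factor";
    // Hypothetical property name, used here only for illustration.
    static final String MIN_ALIVE = "offsets.topic.min.alive.brokers";

    // The new property falls back to the replication factor when unset.
    static int minAliveBrokers(Properties props) {
        String rf = props.getProperty(RF, "3");
        return Integer.parseInt(props.getProperty(MIN_ALIVE, rf));
    }

    // The offsets topic may be created only once enough brokers are alive.
    static boolean mayCreateOffsetsTopic(Properties props, int aliveBrokers) {
        return aliveBrokers >= minAliveBrokers(props);
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty(RF, "3");
        System.out.println(mayCreateOffsetsTopic(props, 3)); // true: falls back to RF

        props.setProperty(MIN_ALIVE, "10");
        System.out.println(mayCreateOffsetsTopic(props, 3)); // false: wait for more brokers
    }
}
```

Defaulting to the replication factor keeps today's creation threshold for
anyone who does not set the new property.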

I've been bitten by the problem of offsets.topic.replication.factor not
being enforced, but only in integration tests. It was almost unpredictable
when the offsets topic was ready and the test cluster initialized, so I
would get lots of false failures and unstable tests. Eventually I found
ways to fully initialize the test cluster and got predictable,
deterministic test behavior. If others have also observed this problem only
in their tests, then I don't like the change proposed in the KIP, setting
offsets.topic.replication.factor to 1 by default. I understand the backward
compatibility goals behind it, but I can imagine production issues being
discovered late as a consequence of this change. So I wouldn't want to
trade a higher probability of production issues for testing convenience.

The current Kafka documentation has a nice note about
offsets.topic.replication.factor and the related behavior. A note about the
new default would have to be a warning in bold and red in the docs, and
every broker should log a proper warning if
offsets.topic.replication.factor is left at the newly proposed default
of 1.
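
Roughly, the broker-side check could look like the sketch below. The class
name, method name and message wording are all made up for illustration, and
the default is passed in as a parameter rather than assumed; a real broker
would route the message through its own logging.

```java
import java.util.Optional;
import java.util.Properties;

// Sketch of a startup check; not existing broker code.
final class OffsetsRfWarningSketch {
    static final String RF = "offsets.topic.replication.factor";

    // Returns a warning message when the effective replication factor is 1,
    // whether set explicitly or inherited from the given default.
    static Optional<String> startupWarning(Properties props, int defaultRf) {
        int rf = Integer.parseInt(props.getProperty(RF, String.valueOf(defaultRf)));
        if (rf == 1) {
            return Optional.of(RF + " is 1: the offsets topic will have no "
                    + "redundancy; raise it before running in production");
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Nothing configured and a default of 1: the warning is produced.
        startupWarning(new Properties(), 1).ifPresent(System.err::println);
    }
}
```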

Kind regards,
Stevo Slavic.

On Thu, Jan 26, 2017 at 8:43 AM, James Cheng <wushuja...@gmail.com> wrote:

>
> > On Jan 25, 2017, at 9:26 PM, Joel Koshy <jjkosh...@gmail.com> wrote:
> >
> > already voted, but one thing worth considering (since this KIP speaks of
> > *enforcement*) is desired behavior if the topic already exists and the
> > config != existing RF.
> >
>
> Yeah, I'm curious about this too.
>
> -James
>
> > On Wed, Jan 25, 2017 at 4:30 PM, Dong Lin <lindon...@gmail.com> wrote:
> >
> >> +1
> >>
> >> On Wed, Jan 25, 2017 at 4:22 PM, Ismael Juma <ism...@juma.me.uk> wrote:
> >>
> >>> An important question is if this needs to wait for a major release or
> >> not.
> >>>
> >>> Ismael
> >>>
> >>> On Thu, Jan 26, 2017 at 12:19 AM, Ismael Juma <ism...@juma.me.uk>
> wrote:
> >>>
> >>>> +1 from me too.
> >>>>
> >>>> Ismael
> >>>>
> >>>> On Thu, Jan 26, 2017 at 12:07 AM, Ewen Cheslack-Postava <
> >>> e...@confluent.io
> >>>>> wrote:
> >>>>
> >>>>> +1
> >>>>>
> >>>>> Since this is an unusual one, I think it's worth pointing out that
> the
> >>> KIP
> >>>>> notes it is really a bug fix, but since it has compatibility
> >>> implications
> >>>>> the KIP was worth it. It was a sort of intentional bug, but confusing
> >>> and
> >>>>> dangerous.
> >>>>>
> >>>>> Seems important to fix this ASAP since people are hitting this in
> >>> practice
> >>>>> and would have to go out of their way to set up monitoring to catch
> >> the
> >>>>> issue.
> >>>>>
> >>>>> -Ewen
> >>>>>
> >>>>> On Wed, Jan 25, 2017 at 4:02 PM, Jason Gustafson <ja...@confluent.io
> >
> >>>>> wrote:
> >>>>>
> >>>>>> +1 from me. The current behavior seems both surprising and
> >> dangerous.
> >>>>>>
> >>>>>> -Jason
> >>>>>>
> >>>>>> On Wed, Jan 25, 2017 at 3:58 PM, Onur Karaman <
> >>>>>> onurkaraman.apa...@gmail.com>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> Hey everyone.
> >>>>>>>
> >>>>>>> I made a bug-fix KIP-115 to enforce offsets.topic.replication.factor:
> >>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-115%3A+Enforce+offsets.topic.replication.factor
> >>>>>>>
> >>>>>>> Comments are welcome.
> >>>>>>>
> >>>>>>> - Onur
> >>>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>>
> >>>
> >>
>
>
