Re: [DISCUSS] KIP-78: Cluster Id

Ismael Juma Fri, 02 Sep 2016 20:30:07 -0700

Hi Dong,

Thanks for your feedback. Comments inline.

On Thu, Sep 1, 2016 at 7:51 PM, Dong Lin <lindon...@gmail.com> wrote:
>
> I share the view with Harsha and would like to understand how the current
> approach of randomly generating cluster.id compares with the approach of
> manually specifying it in meta.properties.
>

Harsha's suggestion in the thread was to store the generated id in
meta.properties, not to manually specify it via meta.properties.

>
> I think one big advantage of defining it manually in zookeeper is that we
> can easily tell which cluster it is by simply looking at the sensor name,
> which makes it more useful to the auditing or monitoring use-case that this
> KIP intends to address.

If you really want to customise the name, it is possible with the current
proposal: save the appropriate znode in ZooKeeper before a broker
auto-generates it. We don't encourage that because once you have a
meaningful name, there's a good chance that you may want to change it in
the future. And things break down at that point. That's why we prefer
having a generated, unique and immutable id complemented by a changeable
human readable name. As described in the KIP, we think the latter can be
achieved more generally via resource tags (which will be a separate KIP).
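
To make the "save the znode before a broker auto-generates it" point concrete, here is a minimal sketch of the get-or-create behaviour. It uses an in-memory stand-in rather than a real ZooKeeper client. The names FakeZooKeeper and get_or_create_cluster_id are hypothetical, and the /cluster/id path, the JSON payload and the base64-encoded UUID format are assumptions based on my reading of the KIP, not a definitive implementation.

```python
import base64
import json
import uuid

# Assumed znode path and payload layout; check the KIP for the exact format.
CLUSTER_ID_PATH = "/cluster/id"


class FakeZooKeeper:
    """Hypothetical in-memory stand-in for ZooKeeper (create-if-absent and read only)."""

    def __init__(self):
        self._nodes = {}

    def create(self, path, data):
        # Mimics a "node exists" failure when the path is already present.
        if path in self._nodes:
            raise KeyError(path)
        self._nodes[path] = data

    def get(self, path):
        return self._nodes.get(path)


def generate_cluster_id():
    # Random UUID, URL-safe base64 without padding (22 characters).
    raw = uuid.uuid4().bytes
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def get_or_create_cluster_id(zk):
    # If an operator saved the znode first, that value is used as-is.
    existing = zk.get(CLUSTER_ID_PATH)
    if existing is not None:
        return json.loads(existing)["id"]
    candidate = generate_cluster_id()
    try:
        zk.create(CLUSTER_ID_PATH, json.dumps({"version": "1", "id": candidate}))
        return candidate
    except KeyError:
        # Another broker created the znode first; adopt its id.
        return json.loads(zk.get(CLUSTER_ID_PATH))["id"]
```

In this sketch, whoever creates the znode first wins: a pre-populated value is never overwritten, which is also why renaming later is the part that breaks down.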

> On the other hand, if you can only tell whether two
> sensors are measuring the same cluster or not. Also note that even this
> goal is not easily guaranteed, because you need an external mechanism to
> manually re-generate znode with the old cluster.id if znode is deleted or
> if the same cluster (w.r.t purpose) is changed to use a different
> zookeeper.
>

If we assume that znodes can be deleted at random, the cluster id is
probably the least of one's worries. And yes, when moving to a
different ZooKeeper while wanting to retain the cluster id, you would have
to set the znode manually. This doesn't seem too onerous compared to the
other work you will have to do for this scenario.

> I read your reply to Harsha but still I don't fully understand your concern
> with that approach. I think the broker can simply register group.id in that
> znode if it is not specified yet, in the same way that this KIP proposes to
> do it, right? Can you please elaborate more about your concern with this
> approach?
>

It's a bit difficult to answer this comment because it seems like the
intent of your suggestion is different than Harsha's.

I am not necessarily opposed to storing the cluster id in meta.properties
(note that we have one meta.properties per log.dir), but I think there are
a number of things that need to be discussed and I don't think we need to
block KIP-78 while that takes place. Delivering features incrementally is a
good thing in my opinion (KIP-31/32, KIP-33 and KIP-79 are good recent
examples).

Ismael

P.S. For what it's worth, the following version of the KIP includes an
incomplete description (it assumes a single meta.properties, but there
could be many) of what the broker would have to do if we wanted to save to
meta.properties and potentially restore the znode from it. The state space
becomes a lot more complex, increasing potential for bugs (we had a few for
generated broker ids). In contrast, the current proposal is very simple and
doesn't prevent us from introducing the additional functionality later.

https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65868433
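
As a rough illustration of why the state space grows once meta.properties is involved, the sketch below enumerates the cases a broker would have to handle at startup with one znode and one meta.properties per log.dir. The reconcile function and its outcome strings are hypothetical and are not taken from the KIP; treat it as one possible reading rather than the proposed design.

```python
from typing import List, Optional


def reconcile(znode_id: Optional[str], dir_ids: List[Optional[str]]) -> str:
    """Return the action for a given combination of stored cluster ids.

    znode_id: id found in ZooKeeper, or None if the znode is missing.
    dir_ids:  id found in each log.dir's meta.properties, None where absent.
    """
    stored = {i for i in dir_ids if i is not None}
    if len(stored) > 1:
        return "fail: log dirs disagree"
    if znode_id is None and not stored:
        return "generate id, write znode and all log dirs"
    if znode_id is None:
        return "restore znode from log dirs"
    if stored and stored != {znode_id}:
        return "fail: znode and log dirs disagree"
    if None in dir_ids or not stored:
        return "write id to log dirs missing it"
    return "nothing to do"
```

Even this simplified version has six outcomes, compared with the two in the current proposal (znode exists, or it does not).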
