Hi Dong,

Thanks for your feedback. Comments inline.

On Thu, Sep 1, 2016 at 7:51 PM, Dong Lin <lindon...@gmail.com> wrote:

> I share the view with Harsha and would like to understand how the current
> approach of randomly generating cluster.id compares with the approach of
> manually specifying it in meta.properties.

Harsha's suggestion in the thread was to store the generated id in meta.properties, not to manually specify it via meta.properties.

> I think one big advantage of defining it manually in zookeeper is that we
> can easily tell which cluster it is by simply looking at the sensor name,
> which makes it more useful to the auditing or monitoring use-case that this
> KIP intends to address.

If you really want to customise the name, it is possible with the current proposal: save the appropriate znode in ZooKeeper before a broker auto-generates it. We don't encourage that because once you have a meaningful name, there's a good chance that you may want to change it in the future. And things break down at that point. That's why we prefer having a generated, unique and immutable id complemented by a changeable human readable name. As described in the KIP, we think the latter can be achieved more generally via resource tags (which will be a separate KIP).

> On the other hand, if you can only tell whether two
> sensors are measuring the same cluster or not. Also note that even this
> goal is not easily guaranteed, because you need an external mechanism to
> manually re-generate znode with the old cluster.id if znode is deleted or
> if the same cluster (w.r.t purpose) is changed to use a different
> zookeeper.

If we assume that znodes can be deleted at random, the cluster id is probably the least of one's worries. And yes, when moving to a different ZooKeeper while wanting to retain the cluster id, you would have to set the znode manually. This doesn't seem too onerous compared to the other work you will have to do for this scenario.

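As a rough illustration of the "save the znode before a broker auto-generates it" point and of restoring the id after a ZooKeeper move, here is a minimal sketch of create-if-absent semantics. It is not the broker's actual code: a plain dict stands in for ZooKeeper, and the path `/cluster/id`, the id format (a URL-safe base64-encoded UUID), and both function names are assumptions made for this example.

```python
import base64
import uuid

# Assumed znode path, used here for illustration only.
CLUSTER_ID_PATH = "/cluster/id"


def generate_cluster_id() -> str:
    """Return a random id (assumed format: URL-safe base64 of a UUID, padding stripped)."""
    raw = uuid.uuid4().bytes
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def get_or_create_cluster_id(store: dict) -> str:
    """Generate and store an id only if none exists, then return the stored id.

    `store` is a dict standing in for ZooKeeper. A real broker would need an
    atomic create that tolerates another broker winning the race.
    """
    if CLUSTER_ID_PATH not in store:
        store[CLUSTER_ID_PATH] = generate_cluster_id()
    return store[CLUSTER_ID_PATH]


# Fresh ensemble: the first broker generates the id, later brokers reuse it.
fresh = {}
first_id = get_or_create_cluster_id(fresh)
assert get_or_create_cluster_id(fresh) == first_id

# Pre-populated znode (custom name, or an old id carried over to a new
# ZooKeeper): the existing value is kept and nothing is generated.
seeded = {CLUSTER_ID_PATH: "id-carried-over"}
assert get_or_create_cluster_id(seeded) == "id-carried-over"
```

Under these assumptions, whoever writes the znode first determines the id, which is why setting it manually before the brokers start is enough to retain an existing id.
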
> I read your reply to Harsha but still I don't fully understand your concern
> with that approach. I think the broker can simply register group.id in that
> znode if it is not specified yet, in the same way that this KIP proposes to
> do it, right? Can you please elaborate more about your concern with this
> approach?

It's a bit difficult to answer this comment because it seems like the intent of your suggestion is different from Harsha's. I am not necessarily opposed to storing the cluster id in meta.properties (note that we have one meta.properties per log.dir), but I think there are a number of things that need to be discussed and I don't think we need to block KIP-78 while that takes place. Delivering features incrementally is a good thing in my opinion (KIP-31/32, KIP-33 and KIP-79 are good recent examples).

Ismael

P.S. For what it's worth, the following version of the KIP includes an incomplete description (it assumes a single meta.properties, but there could be many) of what the broker would have to do if we wanted to save to meta.properties and potentially restore the znode from it. The state space becomes a lot more complex, increasing the potential for bugs (we had a few for generated broker ids). In contrast, the current proposal is very simple and doesn't prevent us from introducing the additional functionality later.

https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65868433