What Todd said :) (I think my ops background is showing...)
On Mon, May 11, 2015 at 10:17 PM, Todd Palino <tpal...@gmail.com> wrote:
> I understand your point here, Jay, but I disagree that we can't have two configuration systems. We have two different types of configuration information. We have configuration that relates to the service itself (the Kafka broker), and we have configuration that relates to the content within the service (topics). I would put the client configuration (quotas) in with the second part, as it is dynamic information. I just don't see a good argument for effectively degrading the configuration for the service for the sake of keeping it paired with the configuration of dynamic resources.
>
> -Todd
>
> On Mon, May 11, 2015 at 11:33 AM, Jay Kreps <jay.kr...@gmail.com> wrote:
> > I totally agree that ZK is not in-and-of-itself a configuration management solution and it would be better if we could just keep all our config in files. Anyone who has followed the various config discussions over the past few years knows I'm the biggest proponent of immutable file-driven config.
> >
> > The analogy to "normal unix services" isn't actually quite right, though. The problem Kafka has is that a number of the configurable entities it manages are added dynamically--topics, clients, consumer groups, etc. What this actually resembles is not a unix service like HTTPD but a database, and databases typically do manage config dynamically for exactly the same reason.
> >
> > The last few emails are arguing that files > ZK as a config solution. I agree with this, but that isn't really the question, right? The reality is that we need to be able to configure dynamically created entities, and we won't get a satisfactory solution to that using files (e.g. rsync is not an acceptable topic creation mechanism). What we are discussing is having a single config mechanism or multiple. If we have multiple, you need to solve the whole config lifecycle problem for both--management, audit, rollback, etc.
> >
> > Gwen, you were saying we couldn't get rid of the configuration file; not sure if I understand. Is that because we need to give the URL for ZK? Wouldn't the same argument work to say that we can't use configuration files because we have to specify the file path? I think we can just give the server the same --zookeeper argument we use everywhere else, right?
> >
> > -Jay
> >
> > On Sun, May 10, 2015 at 11:28 AM, Todd Palino <tpal...@gmail.com> wrote:
> > > I've been watching this discussion for a while, and I have to jump in and side with Gwen here. I see no benefit to putting the configs into Zookeeper entirely, and a lot of downside. The two biggest problems I have with this are:
> > >
> > > 1) Configuration management. OK, so you can write glue for Chef to put configs into Zookeeper. You also need to write glue for Puppet. And Cfengine. And everything else out there. Files are an industry-standard practice, they're how just about everyone handles it, and there are reasons for that, not just "it's the way it's always been done".
> > >
> > > 2) Auditing. Configuration files can easily be managed in a source repository system which tracks what changes were made and who made them. It also easily allows for rolling back to a previous version. Zookeeper does not.
> > >
> > > I see absolutely nothing wrong with putting the quota (client) configs and the topic config overrides in Zookeeper, and keeping everything else exactly where it is, in the configuration file. To handle configurations for the broker that can be changed at runtime without a restart, you can use the industry-standard practice of catching SIGHUP and rereading the configuration file at that point.
> > >
> > > -Todd
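A minimal sketch (not part of the thread) of the hybrid layout Todd describes: broker settings stay in the file, per-topic overrides live in ZooKeeper, and the effective value is the override if one exists. The znode path and class name are illustrative, not Kafka's actual layout; only the plain ZooKeeper client API is used.

    import java.io.FileInputStream;
    import java.io.IOException;
    import java.util.Properties;
    import org.apache.zookeeper.KeeperException;
    import org.apache.zookeeper.ZooKeeper;

    public class TopicConfigResolver {
        private final Properties fileDefaults = new Properties();
        private final ZooKeeper zk;

        public TopicConfigResolver(String configPath, ZooKeeper zk) throws IOException {
            try (FileInputStream in = new FileInputStream(configPath)) {
                fileDefaults.load(in);   // static broker/service config from disk
            }
            this.zk = zk;
        }

        // File value unless an override exists at a hypothetical /config/topics/<topic>/<key> znode.
        public String effectiveValue(String topic, String key)
                throws KeeperException, InterruptedException {
            String path = "/config/topics/" + topic + "/" + key;   // illustrative path
            if (zk.exists(path, false) == null) {
                return fileDefaults.getProperty(key);
            }
            return new String(zk.getData(path, false, null));
        }
    }

Under this split, the file remains the source of truth for anything the service needs before it can reach ZooKeeper, which also sidesteps the chicken-and-egg concern Jun raises later in the thread.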
> > > On Sun, May 10, 2015 at 4:00 AM, Gwen Shapira <gshap...@cloudera.com> wrote:
> > > > I am still not clear about the benefits of managing configuration in ZooKeeper vs. keeping the local file and adding a "refresh" mechanism (signal, protocol, zookeeper, or other).
> > > >
> > > > Benefits of staying with a configuration file:
> > > > 1. In line with pretty much any Linux service that exists, so admins have a lot of related experience.
> > > > 2. Much smaller change to our code-base, so easier to patch, review and test. Lower risk overall.
> > > >
> > > > Can you walk me through the benefits of using Zookeeper? Especially since it looks like we can't get rid of the file entirely?
> > > >
> > > > Gwen
> > > >
> > > > On Thu, May 7, 2015 at 3:33 AM, Jun Rao <j...@confluent.io> wrote:
> > > > > One of the Chef users confirmed that Chef integration could still work if all configs are moved to ZK. My rough understanding of how Chef works is that a user first registers a service host with a Chef server. After that, a Chef client will be run on the service host. The user can then push config changes intended for a service/host to the Chef server. The server is then responsible for pushing the changes to Chef clients. Chef clients support pluggable logic. For example, they can generate a config file that the Kafka broker will take. If we move all configs to ZK, we can customize the Chef client to use our config CLI to make the config changes in Kafka. In this model, one probably doesn't need to register every broker in Chef for the config push. Not sure if Puppet works in a similar way.
> > > > >
> > > > > Also, for storing the configs, we probably can't store the broker/global level configs in Kafka itself (e.g. in a special topic). The reason is that in order to start a broker, we likely need to make some broker-level config changes (e.g., the default log.dir may not be present, the default port may not be available, etc.). If we need a broker to be up to make those changes, we get into a chicken-and-egg problem.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Jun
> > > > >
> > > > > On Tue, May 5, 2015 at 4:14 PM, Gwen Shapira <gshap...@cloudera.com> wrote:
> > > > > > Sorry I missed the call today :)
> > > > > >
> > > > > > I think an additional requirement would be: make sure that traditional deployment tools (Puppet, Chef, etc.) are still capable of managing Kafka configuration.
> > > > > >
> > > > > > For this reason, I'd like the configuration refresh to be pretty close to what most Linux services are doing to force a reload of configuration. AFAIK, this involves handling the HUP signal in the main thread to reload configuration. Then packaging scripts can add something nice like "service kafka reload".
> > > > > >
> > > > > > (See the Apache web server: https://github.com/apache/httpd/blob/trunk/build/rpm/httpd.init#L101 )
> > > > > >
> > > > > > Gwen
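A minimal sketch (not part of the thread) of the SIGHUP-driven reload Todd and Gwen describe, assuming the broker config is a plain properties file. It uses the JVM-specific, unsupported sun.misc.Signal API; the class name and layout are illustrative only.

    import sun.misc.Signal;
    import java.io.FileInputStream;
    import java.io.IOException;
    import java.util.Properties;
    import java.util.concurrent.atomic.AtomicReference;

    public class ReloadableConfig {
        private final String path;
        private final AtomicReference<Properties> current = new AtomicReference<>();

        public ReloadableConfig(String path) throws IOException {
            this.path = path;
            current.set(load());
            // On SIGHUP (e.g. "kill -HUP <pid>" sent by an init script), reread the file.
            Signal.handle(new Signal("HUP"), sig -> {
                try {
                    current.set(load());
                } catch (IOException e) {
                    // Keep the previous config if the new file can't be read.
                }
            });
        }

        private Properties load() throws IOException {
            Properties p = new Properties();
            try (FileInputStream in = new FileInputStream(path)) {
                p.load(in);
            }
            return p;
        }

        public String get(String key) {
            return current.get().getProperty(key);
        }
    }

A "service kafka reload" wrapper would then just signal the broker's PID, in the spirit of the httpd init-script example Gwen links to.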
> > > > > > On Tue, May 5, 2015 at 8:54 AM, Joel Koshy <jjkosh...@gmail.com> wrote:
> > > > > > > Good discussion. Since we will be talking about this at 11am, I wanted to organize these comments into requirements to see if we are all on the same page.
> > > > > > >
> > > > > > > REQUIREMENT 1: Needs to accept dynamic config changes. This needs to be general enough to work for all configs that we envision may need to accept changes at runtime, e.g., log (topic), broker, client (quotas), etc. Possible options include:
> > > > > > > - ZooKeeper watcher
> > > > > > > - Kafka topic
> > > > > > > - Direct RPC to controller (or config coordinator)
> > > > > > >
> > > > > > > The current KIP is really focused on REQUIREMENT 1, and I think that is reasonable as long as we don't come up with something that requires significant re-engineering to support the other requirements.
> > > > > > >
> > > > > > > REQUIREMENT 2: Provide consistency of configs across brokers (modulo per-broker overrides), or at least be able to verify consistency. What this effectively means is that config changes must be seen by all brokers eventually, and we should be able to easily compare the full config of each broker.
> > > > > > >
> > > > > > > REQUIREMENT 3: Central config store. Needs to work with plain file-based configs and other systems (e.g., Puppet). Ideally, should not bring in other dependencies (e.g., a DB). Possible options:
> > > > > > > - ZooKeeper
> > > > > > > - Kafka topic
> > > > > > > - other? E.g. making it pluggable?
> > > > > > >
> > > > > > > Any other requirements?
> > > > > > >
> > > > > > > Thanks,
> > > > > > >
> > > > > > > Joel
> > > > > > >
> > > > > > > On Tue, May 05, 2015 at 01:38:09AM +0000, Aditya Auradkar wrote:
> > > > > > > > Hey Neha,
> > > > > > > >
> > > > > > > > Thanks for the feedback.
> > > > > > > >
> > > > > > > > 1. In my earlier exchange with Jay, I mentioned the broker writing all its configs to ZK (while respecting the overrides). Then ZK can be used to view all configs.
> > > > > > > >
> > > > > > > > 2. Need to think about this a bit more. Perhaps we can discuss this during the hangout tomorrow?
> > > > > > > >
> > > > > > > > 3 & 4) I viewed these config changes as mainly administrative operations. In that case, it may be reasonable to assume that the ZK port is available for communication from the machine these commands are run on. Having a ConfigChangeRequest (or similar) is nice to have, but having a new API and sending requests to the controller also changes how we do topic-based configuration currently. I was hoping to keep this KIP as minimal as possible and provide a means to represent and modify client and broker configs in a central place. Are there any concerns if we tackle these things in a later KIP?
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Aditya
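A minimal sketch (not part of the thread) of the "ZooKeeper watcher" option from REQUIREMENT 1, which is also roughly what Aditya's point 1 implies: a broker component re-reads a config znode whenever it changes. The znode path and class name are made up for this example; ZooKeeper watches are one-shot, so the watch is re-registered on every read.

    import org.apache.zookeeper.KeeperException;
    import org.apache.zookeeper.WatchedEvent;
    import org.apache.zookeeper.Watcher;
    import org.apache.zookeeper.ZooKeeper;

    public class ZkConfigWatcher implements Watcher {
        private static final String PATH = "/config/brokers/0";   // hypothetical znode
        private final ZooKeeper zk;
        private volatile String currentValue;

        public ZkConfigWatcher(ZooKeeper zk) throws KeeperException, InterruptedException {
            this.zk = zk;
            refresh();
        }

        // Read the znode and re-register this object as the watcher for the next change.
        private void refresh() throws KeeperException, InterruptedException {
            currentValue = new String(zk.getData(PATH, this, null));
        }

        @Override
        public void process(WatchedEvent event) {
            if (event.getType() == Event.EventType.NodeDataChanged) {
                try {
                    refresh();   // pick up the new config; a real broker would then apply it
                } catch (KeeperException | InterruptedException e) {
                    // A real implementation would log and retry here.
                }
            }
        }

        public String value() {
            return currentValue;
        }
    }

Because every broker watches the same znodes, this approach gives the eventual propagation that REQUIREMENT 2 asks for, though verifying that all brokers have converged still needs separate tooling.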
> > > > > > > > ________________________________________
> > > > > > > > From: Neha Narkhede [n...@confluent.io]
> > > > > > > > Sent: Sunday, May 03, 2015 9:48 AM
> > > > > > > > To: dev@kafka.apache.org
> > > > > > > > Subject: Re: [DISCUSS] KIP-21 Configuration Management
> > > > > > > >
> > > > > > > > Thanks for starting this discussion, Aditya. A few questions/comments:
> > > > > > > >
> > > > > > > > 1. If you change the default values like it's mentioned in the KIP, do you also overwrite the local config file as part of updating the default value? If not, where does the admin look to find the default values, ZK or the local Kafka config file? What if a config value is different in both places?
> > > > > > > >
> > > > > > > > 2. I share Gwen's concern around making sure that popular config management tools continue to work with this change. Would love to see how each of those would work with the proposal in the KIP. I don't know enough about each of the tools, but it seems like in some of them you have to define some sort of class with parameter names as config names. How will such tools find out about the config values? In Puppet, if this means that each Puppet agent has to read it from ZK, the ZK port has to be open to pretty much every machine in the DC. This is a bummer and a very confusing requirement. Not sure if this is really a problem or not (each of those tools might behave differently), but it is something worth paying attention to.
> > > > > > > >
> > > > > > > > 3. The wrapper tools that let users read/change configs should not depend on ZK, for the reason mentioned above. It's a pain to assume that the ZK port is open from any machine that needs to run this tool. Ideally what users want is a REST API to the brokers to change or read the config (a la Elasticsearch), but in the absence of the REST API, we should think about whether we can write the tool such that it just requires talking to the Kafka broker port. This will require a config RPC.
> > > > > > > >
> > > > > > > > 4. Not sure if the KIP is the right place to discuss the design of propagating the config changes to the brokers, but have you thought about just letting the controller oversee the config changes and propagate them via RPC to the brokers? That way, there is an easier way to express config changes that require every broker to apply them before the change can be considered complete. Maybe this is not required, but it is hard to say if we don't discuss the full set of configs that need to be dynamic.
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Neha
> > > > > > > > On Fri, May 1, 2015 at 12:53 PM, Jay Kreps <jay.kr...@gmail.com> wrote:
> > > > > > > > > Hey Aditya,
> > > > > > > > >
> > > > > > > > > This is great! A couple of comments:
> > > > > > > > >
> > > > > > > > > 1. Leaving the file config in place is definitely the least disturbance. But let's really think about getting rid of the files and just having one config mechanism. There is always a tendency to make everything pluggable, which so often just leads to two mediocre solutions. Can we do the exercise of trying to consider fully getting rid of file config and seeing what goes wrong?
> > > > > > > > >
> > > > > > > > > 2. Do we need to model defaults? The current approach is that if you have a global config x it is overridden for a topic xyz by /topics/xyz/x, and I think this could be extended to /brokers/0/x. I think this is simpler. We need to specify the precedence for these overrides, e.g. if you override at the broker and topic level, I think the topic level takes precedence.
> > > > > > > > >
> > > > > > > > > 3. I recommend we have the producer and consumer config just be an override under client.id. The override is by client id, and we can have separate properties for controlling quotas for producers and consumers.
> > > > > > > > >
> > > > > > > > > 4. Some configs can be changed just by updating the reference; others may require some action. An example of this is that if you want to disable log compaction (assuming we wanted to make that dynamic), we need to call shutdown() on the cleaner. I think it may be required to register a listener callback that gets called when the config changes.
> > > > > > > > >
> > > > > > > > > 5. For handling the reference, can you explain your plan a bit? Currently we have an immutable KafkaConfig object with a bunch of vals. That, or individual values in there, get injected all over the code base. I was thinking something like this:
> > > > > > > > > a. We retain the KafkaConfig object as an immutable object, just as today.
> > > > > > > > > b. It is no longer legit to grab values out of that config if they are changeable.
> > > > > > > > > c. Instead of making KafkaConfig itself mutable, we make KafkaConfiguration, which has a single volatile reference to the current KafkaConfig. KafkaConfiguration is what gets passed into various components. So to access a config you do something like config.instance.myValue. When the config changes, the config manager updates this reference.
> > > > > > > > > d. The KafkaConfiguration is the thing that allows doing the configuration.onChange("my.config", callback)
> > > > > > > > >
> > > > > > > > > -Jay
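A rough Java rendering (not from the thread) of the KafkaConfiguration idea in 5c and 5d above: an immutable config snapshot behind a single volatile reference, plus per-key change callbacks. Everything beyond the names Jay uses is illustrative, including the placeholder KafkaConfig type.

    import java.util.List;
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.CopyOnWriteArrayList;
    import java.util.function.Consumer;

    public class KafkaConfiguration {
        /** Stand-in for the existing immutable KafkaConfig (a bag of final values). */
        public static class KafkaConfig { }

        // The single volatile reference from 5c: readers see a consistent immutable snapshot.
        private volatile KafkaConfig instance;
        private final Map<String, List<Consumer<KafkaConfig>>> listeners = new ConcurrentHashMap<>();

        public KafkaConfiguration(KafkaConfig initial) {
            this.instance = initial;
        }

        // e.g. config.instance() is Jay's "config.instance.myValue" access path
        public KafkaConfig instance() {
            return instance;
        }

        // 5d: e.g. configuration.onChange("log.cleaner.enable", cfg -> cleaner.shutdown())
        public void onChange(String key, Consumer<KafkaConfig> callback) {
            listeners.computeIfAbsent(key, k -> new CopyOnWriteArrayList<>()).add(callback);
        }

        // Called by the config manager when it observes a change (e.g. via a ZK watch).
        public void update(KafkaConfig updated, Iterable<String> changedKeys) {
            this.instance = updated;                       // swap the immutable snapshot
            for (String key : changedKeys) {
                for (Consumer<KafkaConfig> cb : listeners.getOrDefault(key, List.of())) {
                    cb.accept(updated);                    // notify components that must act (Jay's point 4)
                }
            }
        }
    }

The volatile swap means components never see a half-updated config, while the callbacks cover the cases in point 4 where updating the reference alone isn't enough.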
> > > > > > > > > On Tue, Apr 28, 2015 at 3:57 PM, Aditya Auradkar <aaurad...@linkedin.com.invalid> wrote:
> > > > > > > > > > Hey everyone,
> > > > > > > > > >
> > > > > > > > > > Wrote up a KIP to update topic, client and broker configs dynamically via Zookeeper.
> > > > > > > > > >
> > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-21+-+Dynamic+Configuration
> > > > > > > > > >
> > > > > > > > > > Please read and provide feedback.
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > > Aditya
> > > > > > > > > >
> > > > > > > > > > PS: I've intentionally kept this discussion separate from KIP-5 since I'm not sure if that is actively being worked on and I wanted to start with a clean slate.
> > > > > > > >
> > > > > > > > --
> > > > > > > > Thanks,
> > > > > > > > Neha
> > > > > > >
> > > > > > > --
> > > > > > > Joel