inline

On Thu, Jan 22, 2015 at 11:59 PM, Jay Kreps <jay.kr...@gmail.com> wrote:

> Hey Joe,
>
> This is great. A few comments on KIP-4
>
> 1. This is much needed functionality, but there are a lot of them so let's
> really think these protocols through. We really want to end up with a set
> of well thought-out, orthogonal apis. For this reason I think it is really
> important to think through the end state even if that includes APIs we
> won't implement in the first phase.
>

ok

>
> 2. Let's please please please wait until we have switched the server over
> to the new java protocol definitions. If we add umpteen more ad hoc scala
> objects that is just generating more work for the conversion we know we
> have to do.
>

ok :)

>
> 3. This proposal introduces a new type of optional parameter. This is
> inconsistent with everything else in the protocol where we use -1 or some
> other marker value. You could argue either way but let's stick with that
> for consistency. For clients that implemented the protocol in a better way
> than our scala code these basic primitives are hard to change.
>

yes, less confusing, ok.

>
> 4. ClusterMetadata: This seems to duplicate TopicMetadataRequest which has
> brokers, topics, and partitions. I think we should rename that request
> ClusterMetadataRequest (or just MetadataRequest) and include the id of the
> controller. Or are there other things we could add here?
>

We could add broker version to it.

>
> 5. We have a tendency to try to make a lot of requests that can only go to
> particular nodes. This adds a lot of burden for client implementations (it
> sounds easy but each discovery can fail in many parts so it ends up being a
> full state machine to do right). I think we should consider making admin
> commands and ideally as many of the other apis as possible available on all
> brokers and just redirect to the controller on the broker side. Perhaps
> there would be a general way to encapsulate this re-routing behavior.
>

If we do that then we should also preserve what we have and do both.
The client can then decide "do I want to go to any broker and proxy" or
just "go to controller and run admin task". Lots of folks have seen
controllers come under distress because of their producers/consumers.
There is a ticket too for controller elect and re-elect
https://issues.apache.org/jira/browse/KAFKA-1778 so you can force it to a
broker that has 0 load.

>
> 6. We should probably normalize the key value pairs used for configs rather
> than embedding a new formatting. So two strings rather than one with an
> internal equals sign.
>

ok

>
> 7. Is the postcondition of these APIs that the command has begun or that
> the command has been completed? It is a lot more usable if the command has
> been completed so you know that if you create a topic and then publish to
> it you won't get an exception about there being no such topic.
>

We should define that more. There needs to be some more state there, yes.
We should try to cover https://issues.apache.org/jira/browse/KAFKA-1125
within what we come up with.

>
> 8. Describe topic and list topics duplicate a lot of stuff in the metadata
> request. Is there a reason to give back topics marked for deletion? I feel
> like if we just make the post-condition of the delete command be that the
> topic is deleted that will get rid of the need for this right? And it will
> be much more intuitive.
>

I will go back and look through it.

>
> 9. Should we consider batching these requests? We have generally tried to
> allow multiple operations to be batched. My suspicion is that without this
> we will get a lot of code that does something like
>    for(topic: adminClient.listTopics())
>       adminClient.describeTopic(topic)
> this code will work great when you test on 5 topics but not do as well if
> you have 50k.
>

So => the input is a list of topics (or none for all), and the controller
returns the entire response as one batch (which could be routed through
another broker)? We could introduce a Batch keyword to explicitly show the
usage of it.
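
To make the batching point in item 9 concrete, here is a minimal Java sketch. Everything in it is hypothetical: FakeAdminClient, describeTopic, describeTopics and the roundTrips counter are illustration-only names, not part of Kafka or of the KIP-4 proposal. It assumes, per the reply above, that an empty topic list means "all topics", and it only counts simulated round trips to show why the per-topic loop scales poorly.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these types exist in Kafka.
class BatchDescribeSketch {

    /** Stand-in for whatever a describe call would return for one topic. */
    static final class TopicDescription {
        final String topic;
        final int partitions;

        TopicDescription(String topic, int partitions) {
            this.topic = topic;
            this.partitions = partitions;
        }
    }

    /** In-memory fake admin client that counts simulated round trips. */
    static final class FakeAdminClient {
        private final Map<String, Integer> partitionsByTopic;
        int roundTrips = 0;

        FakeAdminClient(Map<String, Integer> partitionsByTopic) {
            this.partitionsByTopic = new LinkedHashMap<>(partitionsByTopic);
        }

        /** One round trip that returns every topic name. */
        List<String> listTopics() {
            roundTrips++;
            return new ArrayList<>(partitionsByTopic.keySet());
        }

        /** Unbatched form: one round trip per topic. */
        TopicDescription describeTopic(String topic) {
            roundTrips++;
            return lookup(topic);
        }

        /** Batched form: one round trip; an empty list means "all topics". */
        List<TopicDescription> describeTopics(List<String> topics) {
            roundTrips++;
            Collection<String> wanted =
                    topics.isEmpty() ? partitionsByTopic.keySet() : topics;
            List<TopicDescription> result = new ArrayList<>();
            for (String topic : wanted) {
                result.add(lookup(topic));
            }
            return result;
        }

        private TopicDescription lookup(String topic) {
            Integer partitions = partitionsByTopic.get(topic);
            if (partitions == null) {
                throw new IllegalArgumentException("unknown topic: " + topic);
            }
            return new TopicDescription(topic, partitions);
        }
    }

    /** Builds n made-up topics named topic-0 .. topic-(n-1), one partition each. */
    static Map<String, Integer> sampleTopics(int n) {
        Map<String, Integer> topics = new LinkedHashMap<>();
        for (int i = 0; i < n; i++) {
            topics.put("topic-" + i, 1);
        }
        return topics;
    }

    public static void main(String[] args) {
        // Per-topic loop: one list call plus one describe call per topic.
        FakeAdminClient looping = new FakeAdminClient(sampleTopics(5));
        for (String topic : looping.listTopics()) {
            looping.describeTopic(topic);
        }
        System.out.println("loop round trips:  " + looping.roundTrips);

        // Batched call: a single request regardless of topic count.
        FakeAdminClient batched = new FakeAdminClient(sampleTopics(5));
        batched.describeTopics(new ArrayList<>());
        System.out.println("batch round trips: " + batched.roundTrips);
    }
}
```

In this toy model the loop costs one round trip per topic plus one for the listing, while the batched call stays at one however many topics exist, which is the 5-topics-versus-50k concern raised above.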

>
> 10. I think we should also discuss how we want to expose a programmatic JVM
> client api for these operations. Currently people rely on AdminUtils which
> is totally sketchy. I think we probably need another client under clients/
> that exposes administrative functionality. We will need this just to
> properly test the new apis, I suspect. We should figure out that API.
>

We were talking about that here
https://issues.apache.org/jira/browse/KAFKA-1774 and wrote it in java
https://reviews.apache.org/r/29301/diff/7/?page=4#75 so we could do
something like that, sure.

>
> 11. The other information that would be really useful to get would be
> information about partitions--how much data is in the partition, what are
> the segment offsets, what is the log-end offset (i.e. last offset), what is
> the compaction point, etc. I think that done right this would be the
> successor to the very awkward OffsetRequest we have today.
>

yes!

>
> -Jay
>
> On Wed, Jan 21, 2015 at 10:27 PM, Joe Stein <joe.st...@stealth.ly> wrote:
>
> > Hi, created a KIP
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-4+-+Command+line+and+centralized+administrative+operations
> >
> > JIRA https://issues.apache.org/jira/browse/KAFKA-1694
> >
> > /*******************************************
> >  Joe Stein
> >  Founder, Principal Consultant
> >  Big Data Open Source Security LLC
> >  http://www.stealth.ly
> >  Twitter: @allthingshadoop <http://www.twitter.com/allthingshadoop>
> > ********************************************/
> >
>