Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations

Joe Stein Thu, 22 Jan 2015 22:11:54 -0800

inline

On Thu, Jan 22, 2015 at 11:59 PM, Jay Kreps <jay.kr...@gmail.com> wrote:


> Hey Joe,
>
> This is great. A few comments on KIP-4
>
> 1. This is much needed functionality, but there are a lot of these, so let's
> really think these protocols through. We really want to end up with a set
> of well thought-out, orthogonal APIs. For this reason I think it is really
> important to think through the end state even if that includes APIs we
> won't implement in the first phase.
>

ok


>
> 2. Let's please please please wait until we have switched the server over
> to the new Java protocol definitions. If we add umpteen more ad hoc Scala
> objects, that just generates more work for the conversion we know we
> have to do.
>

ok :)


>
> 3. This proposal introduces a new type of optional parameter. This is
> inconsistent with everything else in the protocol where we use -1 or some
> other marker value. You could argue either way but let's stick with that
> for consistency. For clients that implemented the protocol in a better way
> than our scala code these basic primitives are hard to change.
>

yes, less confusing, ok.
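
To make the marker-value convention concrete, here is a minimal sketch of an optional int field written with -1 as the "not set" marker. The class name SentinelField and its methods are hypothetical and exist only for this example; nothing here is taken from the KIP or from Kafka's code.

```java
import java.nio.ByteBuffer;
import java.util.OptionalInt;

// Hypothetical helper, not part of Kafka: an optional int field is put on the
// wire as a plain int32, with -1 reserved to mean "not set".
final class SentinelField {
    static final int NOT_SET = -1;

    // Writes the value, or the -1 marker when the field is absent.
    static void write(ByteBuffer buf, OptionalInt value) {
        if (value.isPresent() && value.getAsInt() < 0) {
            throw new IllegalArgumentException("negative values are reserved for the marker");
        }
        buf.putInt(value.orElse(NOT_SET));
    }

    // Reads one int32 and maps the -1 marker back to "absent".
    static OptionalInt read(ByteBuffer buf) {
        int raw = buf.getInt();
        return raw == NOT_SET ? OptionalInt.empty() : OptionalInt.of(raw);
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(8);
        write(buf, OptionalInt.of(3));
        write(buf, OptionalInt.empty());
        buf.flip();
        System.out.println(read(buf));
        System.out.println(read(buf));
    }
}
```

The field stays a fixed-width primitive, so clients that already parse int32 need no new type.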


>
> 4. ClusterMetadata: This seems to duplicate TopicMetadataRequest which has
> brokers, topics, and partitions. I think we should rename that request
> ClusterMetadataRequest (or just MetadataRequest) and include the id of the
> controller. Or are there other things we could add here?
>

We could add broker version to it.


>
> 5. We have a tendency to try to make a lot of requests that can only go to
> particular nodes. This adds a lot of burden for client implementations (it
> sounds easy but each discovery can fail in many parts so it ends up being a
> full state machine to do right). I think we should consider making admin
> commands and ideally as many of the other apis as possible available on all
> brokers and just redirect to the controller on the broker side. Perhaps
> there would be a general way to encapsulate this re-routing behavior.
>

If we do that then we should also preserve what we have and do both. The
client can then decide "do I want to go to any broker and proxy" or just
"go to controller and run admin task". Lots of folks have seen controllers
come under distress because of their producers/consumers. There is also a
ticket for controller election and re-election,
https://issues.apache.org/jira/browse/KAFKA-1778, so you can force the
controller onto a broker that has 0 load.
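
A rough sketch of the "any broker proxies to the controller" path, just to make the idea concrete. Everything in it (the Broker class, handleAdmin, the in-memory cluster map) is made up for illustration and is not Kafka code. A client that prefers the direct path would call the controller itself and skip the forward.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of broker-side re-routing; no networking, no failure handling.
final class AdminRoutingSketch {
    static final class Broker {
        final int id;
        final int controllerId;
        final Map<Integer, Broker> cluster;

        Broker(int id, int controllerId, Map<Integer, Broker> cluster) {
            this.id = id;
            this.controllerId = controllerId;
            this.cluster = cluster;
            cluster.put(id, this); // register so peers can forward to this broker
        }

        // Any broker accepts the command; only the controller executes it.
        String handleAdmin(String command) {
            if (id == controllerId) {
                return "controller " + id + " ran " + command;
            }
            return cluster.get(controllerId).handleAdmin(command);
        }
    }

    public static void main(String[] args) {
        Map<Integer, Broker> cluster = new HashMap<>();
        Broker b0 = new Broker(0, 1, cluster);
        Broker b1 = new Broker(1, 1, cluster);
        System.out.println(b0.handleAdmin("create-topic")); // proxied through broker 0
        System.out.println(b1.handleAdmin("create-topic")); // sent straight to the controller
    }
}
```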


>
> 6. We should probably normalize the key value pairs used for configs rather
> than embedding a new formatting. So two strings rather than one with an
> internal equals sign.
>

ok
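
For reference, a minimal sketch of what "two strings rather than one" could look like on the wire, assuming Kafka-style length-prefixed strings (int16 length, then UTF-8 bytes) and an int32 entry count. The class ConfigPairs and its method names are hypothetical, not from the KIP.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical codec: each config entry is two separate strings, so a value
// that itself contains '=' needs no escaping or splitting rules.
final class ConfigPairs {
    static void writeString(ByteBuffer buf, String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        buf.putShort((short) bytes.length); // int16 length prefix
        buf.put(bytes);
    }

    static String readString(ByteBuffer buf) {
        byte[] bytes = new byte[buf.getShort()];
        buf.get(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    static void writeConfigs(ByteBuffer buf, Map<String, String> configs) {
        buf.putInt(configs.size()); // int32 entry count
        for (Map.Entry<String, String> e : configs.entrySet()) {
            writeString(buf, e.getKey());
            writeString(buf, e.getValue());
        }
    }

    static Map<String, String> readConfigs(ByteBuffer buf) {
        int count = buf.getInt();
        Map<String, String> out = new LinkedHashMap<>();
        for (int i = 0; i < count; i++) {
            String key = readString(buf);
            String value = readString(buf);
            out.put(key, value);
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> configs = new LinkedHashMap<>();
        configs.put("retention.ms", "86400000");
        configs.put("some.filter", "a=b"); // made-up key; value contains '='
        ByteBuffer buf = ByteBuffer.allocate(256);
        writeConfigs(buf, configs);
        buf.flip();
        System.out.println(readConfigs(buf));
    }
}
```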


>
> 7. Is the postcondition of these APIs that the command has begun or that
> the command has been completed? It is a lot more usable if the command has
> been completed so you know that if you create a topic and then publish to
> it you won't get an exception about there being no such topic.
>

We should define that more. There needs to be some more state there, yes.

We should try to cover https://issues.apache.org/jira/browse/KAFKA-1125
within what we come up with.


>
> 8. Describe topic and list topics duplicate a lot of stuff in the metadata
> request. Is there a reason to give back topics marked for deletion? I feel
> like if we just make the post-condition of the delete command be that the
> topic is deleted that will get rid of the need for this right? And it will
> be much more intuitive.
>

I will go back and look through it.


>
> 9. Should we consider batching these requests? We have generally tried to
> allow multiple operations to be batched. My suspicion is that without this
> we will get a lot of code that does something like:
>    for (String topic : adminClient.listTopics())
>        adminClient.describeTopic(topic);
> This code will work great when you test on 5 topics but not do as well if
> you have 50k.
>

So the input is a list of topics (or none for all), and the controller
(possibly routed through another broker) returns a single batch response
covering all of them? We could introduce a Batch keyword to make that usage
explicit.
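
A sketch of that batch shape, purely as an assumption about how it might look: one call takes a list of topics, an empty list means "all", and the whole answer comes back in one round trip. FakeController and describeTopics are invented names backed by an in-memory map, not a proposed API.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class BatchDescribeSketch {
    // Hypothetical in-memory stand-in for the controller; counts round trips.
    static final class FakeController {
        private final Map<String, Integer> partitionsByTopic = new TreeMap<>();
        int requests = 0;

        FakeController(Map<String, Integer> topics) {
            partitionsByTopic.putAll(topics);
        }

        // One request describes many topics; an empty list means "all topics".
        // Unknown topics are simply left out of the response in this sketch.
        Map<String, Integer> describeTopics(List<String> topics) {
            requests++;
            if (topics.isEmpty()) {
                return new TreeMap<>(partitionsByTopic);
            }
            Map<String, Integer> out = new TreeMap<>();
            for (String topic : topics) {
                Integer partitions = partitionsByTopic.get(topic);
                if (partitions != null) {
                    out.put(topic, partitions);
                }
            }
            return out;
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> topics = new TreeMap<>();
        topics.put("orders", 8);
        topics.put("clicks", 32);
        FakeController controller = new FakeController(topics);

        Map<String, Integer> all = controller.describeTopics(Collections.emptyList());
        System.out.println(all + " in " + controller.requests + " request(s)");

        Map<String, Integer> some = controller.describeTopics(Arrays.asList("orders"));
        System.out.println(some + " after " + controller.requests + " request(s)");
    }
}
```

The request count stays at one per call regardless of how many topics exist, which is the point of batching versus the per-topic loop.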


> 10. I think we should also discuss how we want to expose a programmatic JVM
> client api for these operations. Currently people rely on AdminUtils which
> is totally sketchy. I think we probably need another client under clients/
> that exposes administrative functionality. We will need this just to
> properly test the new apis, I suspect. We should figure out that API.
>

We were talking about that here
https://issues.apache.org/jira/browse/KAFKA-1774 and wrote it in Java
https://reviews.apache.org/r/29301/diff/7/?page=4#75 so we could do
something like that, sure.


>
> 11. The other information that would be really useful to get would be
> information about partitions--how much data is in the partition, what are
> the segment offsets, what is the log-end offset (i.e. last offset), what is
> the compaction point, etc. I think that done right this would be the
> successor to the very awkward OffsetRequest we have today.
>

yes!


>
> -Jay
>
> On Wed, Jan 21, 2015 at 10:27 PM, Joe Stein <joe.st...@stealth.ly> wrote:
>
> > Hi, created a KIP
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-4+-+Command+line+and+centralized+administrative+operations
> >
> > JIRA https://issues.apache.org/jira/browse/KAFKA-1694
> >
> > /*******************************************
> >  Joe Stein
> >  Founder, Principal Consultant
> >  Big Data Open Source Security LLC
> >  http://www.stealth.ly
> >  Twitter: @allthingshadoop <http://www.twitter.com/allthingshadoop>
> > ********************************************/
> >
>
