Yes, to verify if a partition reassignment completes or not, we just need to make sure AR == RAR. So, we don't need ISR for this. It's probably still useful to know ISR for monitoring in general though.
Thanks, Jun On Mon, Apr 27, 2015 at 4:15 AM, Andrii Biletskyi < andrii.bilets...@stealth.ly> wrote: > Okay, I had some doubts in terms of reassign-partitions case. I was > not sure whether we need ISR to check post condition of partition > reassignment. But I think we can rely on assigned replicas - the workflow > in reassignPartitions is the following: > 1. Update AR in ZK with OAR + RAR. > ... > 10. Update AR in ZK with RAR. > 11. Update the /admin/reassign_partitions path in ZK to remove this > partition. > 12. After electing leader, the replicas and isr information changes. So > resend the update metadata request to every broker. > > In other words AR becomes RAR right before removing partitions from the > admin path. I think we can consider (with a little approximation) > reassignment > completed if AR == RAR. > > If it's okay, I will remove ISR and add topic config in one change as > discussed > earlier. > > Thanks, > Andrii Biletskyi > > > On Mon, Apr 27, 2015 at 1:50 AM, Jun Rao <j...@confluent.io> wrote: > > > Andrii, > > > > Another thing. We decided not to add the lag info in TMR. To be > consistent, > > we probably also want to remove ISR from TMR since only the leader knows > > it. We can punt on adding any new request from getting ISR. ISR is mostly > > useful for monitoring. Currently, one can determine if a replica is in > ISR > > from the lag metrics (a replica is in ISR if its lag is <=0). > > > > Thanks, > > > > Jun > > > > On Sun, Apr 26, 2015 at 4:31 PM, Andrii Biletskyi < > > andrii.bilets...@stealth.ly> wrote: > > > > > Jun, > > > > > > I like your approach to AlterTopicReques semantics! Sounds like > > > we linearize all request fields to ReplicaAssignment - I will > definitely > > > try this out to ensure there are no other pitfalls. > > > > > > With regards to multiple instructions in one batch per topic. For me > > > this sounds reasonable too. We discussed last time that it's pretty > > > strange we give users schema that supports batching and at the > > > same time introduce restrictions to the way batching can be used > > > (in this case - only one instruction per topic). But now, when we give > > > users everything they need to avoid such misleading use cases (if > > > we implement the previous item - user will be able to specify/change > > > all fields in one instruction) - it might be a good justification to > > > prohibit > > > serving such requests. > > > > > > Any objections? > > > > > > Thanks, > > > Andrii BIletskyi > > > > > > > > > > > > On Sun, Apr 26, 2015 at 11:00 PM, Jun Rao <j...@confluent.io> wrote: > > > > > > > Andrii, > > > > > > > > Thanks for the update. > > > > > > > > For your second point, I agree that if a single AlterTopicRequest can > > > make > > > > multiple changes, there is no need to support the same topic included > > > more > > > > than once in the request. > > > > > > > > Now about the semantics in your first question. I was thinking that > we > > > can > > > > do the following. > > > > a. If ReplicaAssignment is specified, we expect that this will > specify > > > the > > > > replica assignment for all partitions in the topic. For now, we can > > have > > > > the constraint that there could be more partitions than existing > ones, > > > but > > > > can't be less. In this case, both partitions and replicas are > ignored. > > > Then > > > > for each partition, we do one of the followings. > > > > a1. If the partition doesn't exist, add the partition with the > replica > > > > assignment directly to the topic path in ZK. > > > > a2. If the partition exists and the new replica assignment is not the > > > same > > > > as the existing one, include it in the reassign partition json. If > the > > > json > > > > is not empty, write it to the reassignment path in ZK to trigger > > > partition > > > > reassignment. > > > > b. Otherwise, if replicas is specified, generate new > ReplicaAssignment > > > for > > > > existing partitions. If partitions is specified (assuming it's > larger), > > > > generate ReplicaAssignment for the new partitions as well. Then go > back > > > to > > > > step a to make a decision. > > > > c. Otherwise, if only partitions is specified, add assignments of > > > existing > > > > partitions to ReplicaAssignment. Generate assignments to the new > > > partitions > > > > and add them to ReplicaAssignment. Then go back to step a to make a > > > > decision. > > > > > > > > Thanks, > > > > > > > > Jun > > > > > > > > > > > > > > > > On Sat, Apr 25, 2015 at 7:21 AM, Andrii Biletskyi < > > > > andrii.bilets...@stealth.ly> wrote: > > > > > > > > > Guys, > > > > > > > > > > Can we come to some agreement in terms of the second item from > > > > > the email above? This blocks me from updating and uploading the > > > > > patch. Also the new schedule for the weekly calls doesn't work very > > > > > well for me - it's 1 am in my timezone :) - so I'd rather we > confirm > > > > > everything that is possible by email. > > > > > > > > > > Thanks, > > > > > Andrii Biletskyi > > > > > > > > > > On Wed, Apr 22, 2015 at 5:50 PM, Andrii Biletskyi < > > > > > andrii.bilets...@stealth.ly> wrote: > > > > > > > > > > > As said above, I spent some time thinking about AlterTopicRequest > > > > > > semantics and batching. > > > > > > > > > > > > Firstly, about AlterTopicRequest. Our goal here is to see whether > > we > > > > > > can suggest some simple semantics and at the same time let users > > > > > > change different things in one instruction (hereinafter > > instruction - > > > > is > > > > > > one of the entries in batch request). > > > > > > We can resolve arguments according to this schema: > > > > > > 1) If ReplicaAsignment is specified: > > > > > > it's a reassign partitions request > > > > > > 2) If either Partitions or ReplicationFactor is specified: > > > > > > a) If Partitions specified - this is increase partitions case > > > > > > b) If ReplicationFactor is specified - this means we need to > > > > > > automatically > > > > > > regenerate replica assignment and treat it as reassign > > partitions > > > > > > request > > > > > > Note: this algorithm is a bit inconsistent with the > > > CreateTopicRequest > > > > - > > > > > > with > > > > > > ReplicaAssignment specified there user can implicitly define > > > Partitions > > > > > > and > > > > > > ReplicationFactor, in AlterTopicRequest those are completely > > > different > > > > > > things, > > > > > > i.e. you can't include new partitions to the ReplicaAssignment to > > > > > > implicitly ask > > > > > > controller to increase partitions - controller will simply return > > > > > > InvalidReplicaAssignment, > > > > > > because you included unknown partitions. > > > > > > > > > > > > Secondly, multiple instructions for one topic in batch request. I > > > have > > > > a > > > > > > feeling > > > > > > it becomes a really big mess now, so suggestions are highly > > > appreciated > > > > > > here! > > > > > > Our goal is to consider whether we can let users add multiple > > > > > instructions > > > > > > for one topic in one batch but at the same time make it > transparent > > > > > enough > > > > > > so > > > > > > we can support blocking on request completion, for that we need > to > > > > > analyze > > > > > > from the request what is the final expected state of the topic. > > > > > > And the latter one seems to me a tough issue. > > > > > > Consider the following AlterTopicRequest: > > > > > > [1) topic1: change ReplicationFactor from 2 to 3, > > > > > > 2) topic1: change ReplicaAssignment (taking into account RF is 3 > > > now), > > > > > > 3) topic2: change ReplicaAssignment (just to include multiple > > > topics) > > > > > > 4) topic1: change ReplicationFactor from 3 to 1, > > > > > > 5) topic1: change ReplicaAssignment again (taking into account > RF > > > is 1 > > > > > > now) > > > > > > ] > > > > > > As we discussed earlier, controller will handle it as alter-topic > > > > command > > > > > > and > > > > > > reassign-partitions. First of all, it will scan all > > ReplicaAssignment > > > > and > > > > > > assembly > > > > > > those to one json to create admin path /reassign_partitions once > > > > needed. > > > > > > Now, user would expect we execute instruction sequentially, but > we > > > > can't > > > > > > do it > > > > > > because only one reassign-partitions procedure can be in > progress - > > > > when > > > > > > should we trigger reassign-partition - after 1) or after 4)? And > > what > > > > > > about topic2 - > > > > > > we will break the order, but it was supposed we execute > > instructions > > > > > > sequentially. > > > > > > Overall, the logic seems to be very sophisticated, which is a bad > > > sign. > > > > > > Conceptually, > > > > > > I think the root problem is that we imply there is an order in > > > > sequential > > > > > > instructions, > > > > > > but instructions themselves are asynchronous, so really you can't > > > > > > guarantee any order. > > > > > > I'm thinking about such solutions now: > > > > > > 1) Prohibit multiple instructions (this seems reasonable if we > let > > > > users > > > > > > change multiple > > > > > > things in scope of now instructions - see the first item) > > > > > > 2) Break apart again AlterTopic and ReassignPartitions - if the > > > > > > reassignment case is > > > > > > the only problem here, which I'm not sure about. > > > > > > > > > > > > Thoughts? > > > > > > > > > > > > Thanks, > > > > > > Andrii Biletskyi > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Wed, Apr 22, 2015 at 2:59 AM, Andrii Biletskyi < > > > > > > andrii.bilets...@stealth.ly> wrote: > > > > > > > > > > > >> Guys, > > > > > >> > > > > > >> Thank you for your time. A short summary of our discussion. > > > > > >> Answering previous items: > > > > > >> > > > > > >> 1. 2. I will double check existing error codes to align the list > > of > > > > > >> errors that needs to be added. > > > > > >> > > > > > >> 3. We agreed to think again about the batch requests semantics. > > > > > >> The main concern is that users would expect we allow executing > > > > > >> multiple instructions for one topic in one batch. > > > > > >> I will start implementation and check whether there are any > > > > impediments > > > > > >> to handle it this way. > > > > > >> > > > > > >> The same for AlterTopicRequest - I will try to make request > > > semantics > > > > > >> as easy as possible and allow users change different things at > one > > > > > >> time - e.g. change nr of partitions and replicas in one > > instruction. > > > > > >> > > > > > >> 4. We agreed not to add to TMR lag information. > > > > > >> > > > > > >> 5. We discussed preferred replica command and it was pointed out > > > > > >> that generally users shouldn't call this command manually now > > since > > > > > >> this is automatically handled by the cluster. > > > > > >> If there are no objections (especially from devops people) I > will > > > > remove > > > > > >> respective request. > > > > > >> > > > > > >> 6. As discussed AdminClient API is a phase 2 and will go after > > Wire > > > > > >> Protocol extensions. It will be finalized as java-doc after I > > > complete > > > > > >> patch for phase 1 - Wire Protocol + server-side code handling > > > > requests. > > > > > >> > > > > > >> Thanks, > > > > > >> Andrii Biletskyi > > > > > >> > > > > > >> On Wed, Apr 22, 2015 at 12:36 AM, Jay Kreps < > jay.kr...@gmail.com> > > > > > wrote: > > > > > >> > > > > > >>> Hey Andrii, thanks for all the hard work on this, it has come a > > > long > > > > > way. > > > > > >>> > > > > > >>> A couple questions and comments on this. > > > > > >>> > > > > > >>> For the errors, can we do the following: > > > > > >>> 1. Remove IllegalArgument from the name, we haven't used that > > > > > convention > > > > > >>> for other errors. > > > > > >>> 2. Normalize this list with the existing errors. For example, > > > > elsewhere > > > > > >>> when you give an invalid topic name we give back an > > > > > InvalidTopicException > > > > > >>> but this is proposing a new error for that. It would be good > that > > > > these > > > > > >>> kinds of errors are handled the same way across all requests in > > the > > > > > >>> protocol. > > > > > >>> > > > > > >>> Other comments: > > > > > >>> 3. I don't understand MultipleInstructionsForOneTopic > > > > > >>> and MultipleTopicInstructionsInOneBatch and the description is > > > quite > > > > > >>> vague. > > > > > >>> There is some implicit assumption in this proposal about how > > > batching > > > > > >>> will > > > > > >>> be done that doesn't seem to be explained. > > > > > >>> 4. I think adding replica lag to the metadata request is out of > > > place > > > > > and > > > > > >>> should not be in the metadata request. Two reasons: a. This is > > > > > something > > > > > >>> that can only be answered by the leader for that partition. So > > > > > querying N > > > > > >>> partitions fundamentally mean querying N brokers (roughly). > This > > is > > > > > >>> different from the other properties which are "shared > knowledge". > > > > > >>> b. This is a monitoring property not a configuration/metadata > > > > > property. I > > > > > >>> recommend we remove this here and in the future add an API that > > > gets > > > > > all > > > > > >>> the monitoring stats from the server including lag. Adding all > > > these > > > > to > > > > > >>> the > > > > > >>> metadata request won't make sense, right? > > > > > >>> 5. This includes a special request for preferred replica leader > > > > > >>> election. I > > > > > >>> feel that we should not expose an API for this because the user > > > > should > > > > > >>> not > > > > > >>> be in the business of managing leaders. We have gotten this > > feature > > > > to > > > > > >>> the > > > > > >>> point where preferred leadership election is enabled > > > automatically. I > > > > > >>> think > > > > > >>> we should go further in that direction and do whatever work is > > > > required > > > > > >>> to > > > > > >>> make this the only option rather than trying to institute > public > > > apis > > > > > for > > > > > >>> manually controlling it. > > > > > >>> 6. The API changes we discussed for the java api still aren't > > > > reflected > > > > > >>> in > > > > > >>> the proposal. > > > > > >>> > > > > > >>> -Jay > > > > > >>> > > > > > >>> On Tue, Apr 21, 2015 at 7:47 AM, Andrii Biletskyi < > > > > > >>> andrii.bilets...@stealth.ly> wrote: > > > > > >>> > > > > > >>> > Hi all, > > > > > >>> > > > > > > >>> > I've updated KIP-4 page to include all previously discussed > > items > > > > > such > > > > > >>> as: > > > > > >>> > new error codes, merged alter-topic and reassign-partitions > > > > requests, > > > > > >>> added > > > > > >>> > TMR_V1. > > > > > >>> > > > > > > >>> > It'd be great if we concentrate on the Errors+Wire Protocol > > > schema > > > > > and > > > > > >>> > discuss > > > > > >>> > any remaining issues today, since first patch will include > only > > > > > >>> server-side > > > > > >>> > implementation. > > > > > >>> > > > > > > >>> > Thanks, > > > > > >>> > Andrii Biletskyi > > > > > >>> > > > > > > >>> > > > > > > >>> > On Tue, Apr 21, 2015 at 9:46 AM, Andrii Biletskyi < > > > > > >>> > andrii.bilets...@stealth.ly> wrote: > > > > > >>> > > > > > > >>> > > 1. Yes, this will be much easier. Okay, let's add it. > > > > > >>> > > > > > > > >>> > > 2, Okay. This will differ a little bit from the way > currently > > > > > >>> > > kafka-topics.sh handles alter-topic command, but I think > it's > > > > > >>> > > a reasonable restriction. > > > > > >>> > > > > > > > >>> > > I'll update KIP acordingly to our weekly call. > > > > > >>> > > > > > > > >>> > > Thanks, > > > > > >>> > > Andrii Biletskyi > > > > > >>> > > > > > > > >>> > > On Mon, Apr 20, 2015 at 10:56 PM, Jun Rao < > j...@confluent.io> > > > > > wrote: > > > > > >>> > > > > > > > >>> > >> 1. Yes, lag is probably only going to be useful for the > > admin > > > > > >>> client. > > > > > >>> > >> However, so is isr. It seems to me that we should get lag > > and > > > > isr > > > > > >>> from > > > > > >>> > the > > > > > >>> > >> same request. I was thinking that we can just extend TMR > by > > > > > changing > > > > > >>> > >> replicas from an array of int to an array of (int, lag) > > pairs. > > > > Is > > > > > >>> that > > > > > >>> > too > > > > > >>> > >> complicated? > > > > > >>> > >> > > > > > >>> > >> 3. I was thinking that we just don't allow the cli to > change > > > > more > > > > > >>> than > > > > > >>> > one > > > > > >>> > >> thing at a time. So, you will get an error if you want to > > > change > > > > > >>> both > > > > > >>> > >> partitions and configs. > > > > > >>> > >> > > > > > >>> > >> Thanks, > > > > > >>> > >> > > > > > >>> > >> Jun > > > > > >>> > >> > > > > > >>> > >> On Sun, Apr 19, 2015 at 8:22 AM, Andrii Biletskyi < > > > > > >>> > >> andrii.bilets...@stealth.ly> wrote: > > > > > >>> > >> > > > > > >>> > >> > Jun, > > > > > >>> > >> > > > > > > >>> > >> > 1. Yes, seems we can add lag info to the TMR. But before > > > that > > > > I > > > > > >>> wonder > > > > > >>> > >> > whether there are other reasons we need this info except > > for > > > > > >>> reassign > > > > > >>> > >> > partition command? As we discussed earlier the problem > > with > > > > poor > > > > > >>> > >> > monitoring capabilities for reassign-partitions (as > > > currently > > > > we > > > > > >>> only > > > > > >>> > >> > inform > > > > > >>> > >> > users Completed/In Progress per partition) may require > > > > separate > > > > > >>> > >> solution. > > > > > >>> > >> > We were thinking about separate Wire protocol request. > > And I > > > > > >>> actually > > > > > >>> > >> like > > > > > >>> > >> > your > > > > > >>> > >> > idea about adding some sort of BrokerMetadataRequest for > > > these > > > > > >>> > purposes. > > > > > >>> > >> > I actually think we can cover some other items (like > > > > > >>> rack-awareness) > > > > > >>> > but > > > > > >>> > >> > for > > > > > >>> > >> > me it deserves a separate KIP really. > > > > > >>> > >> > Also, adding Replica->Lag map per partition will make > > > > > >>> > >> TopicMetadataResponse > > > > > >>> > >> > very sophisticated: > > > > > >>> > >> > Map[TopicName, Map[PartitionId, Map[ReplicaId, Lag]]. > > > > > >>> > >> > Maybe we need to leave it for a moment and propose new > > > request > > > > > >>> rather > > > > > >>> > >> than > > > > > >>> > >> > making a new step towards one "monster" request. > > > > > >>> > >> > > > > > > >>> > >> > 2. Yes, error per topic. > > > > > >>> > >> > The only question is whether we should execute at least > > the > > > > very > > > > > >>> first > > > > > >>> > >> > alter topic > > > > > >>> > >> > command from the "duplicated" topic set or return error > > for > > > > all > > > > > >>> ... I > > > > > >>> > >> think > > > > > >>> > >> > the more > > > > > >>> > >> > predictable and reasonable option for clients would be > > > > returning > > > > > >>> > errors > > > > > >>> > >> for > > > > > >>> > >> > all > > > > > >>> > >> > duplicated topics. > > > > > >>> > >> > > > > > > >>> > >> > 3. Hm, yes. Actually we also have "change topic config" > > > there. > > > > > >>> But it > > > > > >>> > is > > > > > >>> > >> > not > > > > > >>> > >> > related to such "replication" commands as increase > > replicas > > > or > > > > > >>> change > > > > > >>> > >> > replica > > > > > >>> > >> > assignment. > > > > > >>> > >> > This will make CLI implementation a bit strange: if user > > > > > specifies > > > > > >>> > >> increase > > > > > >>> > >> > partitions and change topic config in one line - taking > > into > > > > > >>> account > > > > > >>> > 2. > > > > > >>> > >> we > > > > > >>> > >> > will have > > > > > >>> > >> > to create two separate alter topic requests, which were > > > > designed > > > > > >>> as > > > > > >>> > >> batch > > > > > >>> > >> > requests :), > > > > > >>> > >> > but probably we can live with it. > > > > > >>> > >> > Okay, I will think about a separate error code to cover > > such > > > > > >>> cases. > > > > > >>> > >> > > > > > > >>> > >> > 4. We will need InvalidArgumentTopic (e.g. contains > > > prohibited > > > > > >>> chars), > > > > > >>> > >> > IAPartitions, IAReplicas, IAReplicaAssignment, > > > > > >>> IATopicConfiguration. > > > > > >>> > >> > A server side implementation will be a little bit messy > > > (like > > > > > >>> dozens > > > > > >>> > "if > > > > > >>> > >> > this then this > > > > > >>> > >> > error code") but maybe we should think about clients at > > the > > > > > first > > > > > >>> > place > > > > > >>> > >> > here. > > > > > >>> > >> > > > > > > >>> > >> > Thanks, > > > > > >>> > >> > Andrii Biletskyi > > > > > >>> > >> > > > > > > >>> > >> > On Fri, Apr 17, 2015 at 1:46 AM, Jun Rao < > > j...@confluent.io> > > > > > >>> wrote: > > > > > >>> > >> > > > > > > >>> > >> > > 1. For the lags, we can add a new field "lags" per > > > > partition. > > > > > It > > > > > >>> > will > > > > > >>> > >> > > return for each replica that's not in isr, the replica > > id > > > > and > > > > > >>> the > > > > > >>> > lag > > > > > >>> > >> in > > > > > >>> > >> > > messages. Also, if TMR is sent to a non-leader, the > > > response > > > > > can > > > > > >>> > just > > > > > >>> > >> > > include an empty array for isr and lags. > > > > > >>> > >> > > > > > > > >>> > >> > > 2. So, we will just return a topic level error for the > > > > > >>> duplicated > > > > > >>> > >> topics, > > > > > >>> > >> > > right? > > > > > >>> > >> > > > > > > > >>> > >> > > 3. Yes, it's true that today, one can specify both > > > > partitions > > > > > >>> and > > > > > >>> > >> > > replicaAssignment in the TopicCommand. However, > > partitions > > > > is > > > > > >>> > actually > > > > > >>> > >> > > ignored. So, it will be clearer if we don't allow > users > > to > > > > do > > > > > >>> this. > > > > > >>> > >> > > > > > > > >>> > >> > > 4. How many specific error codes like > InvalidPartitions > > > and > > > > > >>> > >> > InvalidReplicas > > > > > >>> > >> > > are needed? If it's not that many, giving out more > > > specific > > > > > >>> error > > > > > >>> > >> will be > > > > > >>> > >> > > useful for non-java clients. > > > > > >>> > >> > > > > > > > >>> > >> > > Thanks, > > > > > >>> > >> > > > > > > > >>> > >> > > Jun > > > > > >>> > >> > > > > > > > >>> > >> > > > > > > > >>> > >> > > On Wed, Apr 15, 2015 at 10:23 AM, Andrii Biletskyi < > > > > > >>> > >> > > andrii.bilets...@stealth.ly> wrote: > > > > > >>> > >> > > > > > > > >>> > >> > > > Guys, > > > > > >>> > >> > > > > > > > > >>> > >> > > > Thanks for the discussion! > > > > > >>> > >> > > > > > > > > >>> > >> > > > Summary: > > > > > >>> > >> > > > > > > > > >>> > >> > > > 1. Q: How KAFKA-1367 (isr is inconsistent in > brokers' > > > > > metadata > > > > > >>> > >> cache) > > > > > >>> > >> > can > > > > > >>> > >> > > > affect implementation? > > > > > >>> > >> > > > A: We can fix this issue for the leading broker > - > > > > > >>> > ReplicaManager > > > > > >>> > >> > (or > > > > > >>> > >> > > > Partition) > > > > > >>> > >> > > > component should have accurate isr list, > then > > > with > > > > > >>> leading > > > > > >>> > >> > broker > > > > > >>> > >> > > > having correct > > > > > >>> > >> > > > info, to do a describe-topic we will need to > > > > define > > > > > >>> > leading > > > > > >>> > >> > > brokers > > > > > >>> > >> > > > for partitions > > > > > >>> > >> > > > and ask those for a correct isr list. > > > > > >>> > >> > > > Also, we should consider adding lag > > information > > > to > > > > > >>> TMR for > > > > > >>> > >> each > > > > > >>> > >> > > > follower for > > > > > >>> > >> > > > partition reassignment, as Jun suggested > > above. > > > > > >>> > >> > > > > > > > > >>> > >> > > > 2. Q: What if user adds different alter commands for > > the > > > > > same > > > > > >>> > topic > > > > > >>> > >> in > > > > > >>> > >> > > > scope > > > > > >>> > >> > > > of one batch request? > > > > > >>> > >> > > > A: Because of the async nature of > > AlterTopicRequest > > > it > > > > > >>> will be > > > > > >>> > >> very > > > > > >>> > >> > > > hard then > > > > > >>> > >> > > > to "assemble" the expected (in terms of > > checking > > > > > >>> whether > > > > > >>> > >> > request > > > > > >>> > >> > > is > > > > > >>> > >> > > > complete) > > > > > >>> > >> > > > result if we let users do this. Also it will > > be > > > > very > > > > > >>> > >> confusing. > > > > > >>> > >> > > It > > > > > >>> > >> > > > was proposed not to > > > > > >>> > >> > > > let users do this (probably add new Error > for > > > such > > > > > >>> cases). > > > > > >>> > >> > > > > > > > > >>> > >> > > > 3. Q: AlterTopicRequest semantics: now when we > merged > > > > > >>> AlterTopic > > > > > >>> > and > > > > > >>> > >> > > > ReassingPartitons in which order AlterTopic > > > fields > > > > > >>> should > > > > > >>> > be > > > > > >>> > >> > > > resolved? > > > > > >>> > >> > > > A: This item is not clear. There was a proposal > to > > > let > > > > > >>> user > > > > > >>> > >> change > > > > > >>> > >> > > only > > > > > >>> > >> > > > one thing at a time, e.g. specify either new > > > > > >>> Replicas, or > > > > > >>> > >> > > > ReplicaAssignment. > > > > > >>> > >> > > > This can be a simple solution, but it's a > very > > > > > strict > > > > > >>> > rule. > > > > > >>> > >> > E.g. > > > > > >>> > >> > > > currently with > > > > > >>> > >> > > > TopicCommand user can increase nr of > > partitions > > > > and > > > > > >>> define > > > > > >>> > >> > > replica > > > > > >>> > >> > > > assignment > > > > > >>> > >> > > > for newly added partitions. Taking into > > account > > > > item > > > > > >>> 2. > > > > > >>> > this > > > > > >>> > >> > will > > > > > >>> > >> > > > be even harder > > > > > >>> > >> > > > for user to achieve this. > > > > > >>> > >> > > > > > > > > >>> > >> > > > 4. Q: Do we need such accurate errors returned from > > the > > > > > >>> server: > > > > > >>> > >> > > > InvalidArgumentPartitions, > > > > > >>> > >> > > > InvalidArgumentReplicas etc. > > > > > >>> > >> > > > A: I started implementation to add proposed > error > > > > codes > > > > > >>> and > > > > > >>> > now > > > > > >>> > >> I > > > > > >>> > >> > > think > > > > > >>> > >> > > > probably > > > > > >>> > >> > > > InvalidArgumentError should be sufficient. > We > > > can > > > > do > > > > > >>> > simple > > > > > >>> > >> > > > validations on > > > > > >>> > >> > > > the client side (e.g. AdminClient can ensure > > nr > > > of > > > > > >>> > >> partitions > > > > > >>> > >> > > > argument is positive), > > > > > >>> > >> > > > others - which can be covered only on server > > > > > (probably > > > > > >>> > >> invalid > > > > > >>> > >> > > > topic config, > > > > > >>> > >> > > > replica assignment includes dead broker > etc) - > > > > will > > > > > be > > > > > >>> > done > > > > > >>> > >> on > > > > > >>> > >> > > > server, and in case > > > > > >>> > >> > > > of invalid argument we will return > > > > > >>> InvalidArgumentError > > > > > >>> > >> without > > > > > >>> > >> > > > specifying the > > > > > >>> > >> > > > concrete field. > > > > > >>> > >> > > > > > > > > >>> > >> > > > It'd be great if we could cover these remaining > > issues, > > > > > looks > > > > > >>> like > > > > > >>> > >> they > > > > > >>> > >> > > are > > > > > >>> > >> > > > minor, > > > > > >>> > >> > > > at least related to specific messages, not the > overall > > > > > >>> protocol. > > > > > >>> > - I > > > > > >>> > >> > > think > > > > > >>> > >> > > > with that I can > > > > > >>> > >> > > > update confluence page and update patch to reflect > all > > > > > >>> discussed > > > > > >>> > >> items. > > > > > >>> > >> > > > This patch > > > > > >>> > >> > > > will probably include Wire protocol messages and > > > > server-side > > > > > >>> code > > > > > >>> > to > > > > > >>> > >> > > handle > > > > > >>> > >> > > > new > > > > > >>> > >> > > > requests. AdminClient and cli-tool implementation > can > > be > > > > the > > > > > >>> next > > > > > >>> > >> step. > > > > > >>> > >> > > > > > > > > >>> > >> > > > Thanks, > > > > > >>> > >> > > > Andrii Biletskyi > > > > > >>> > >> > > > > > > > > >>> > >> > > > On Wed, Apr 15, 2015 at 7:26 PM, Jun Rao < > > > > j...@confluent.io> > > > > > >>> > wrote: > > > > > >>> > >> > > > > > > > > >>> > >> > > > > Andrii, > > > > > >>> > >> > > > > > > > > > >>> > >> > > > > 500. I think what you suggested also sounds > > > reasonable. > > > > > >>> Since > > > > > >>> > ISR > > > > > >>> > >> is > > > > > >>> > >> > > only > > > > > >>> > >> > > > > maintained accurately at the leader, TMR can > return > > > ISR > > > > if > > > > > >>> the > > > > > >>> > >> broker > > > > > >>> > >> > > is > > > > > >>> > >> > > > > the leader of a partition. Otherwise, we can > return > > an > > > > > empty > > > > > >>> > ISR. > > > > > >>> > >> For > > > > > >>> > >> > > > > partition reassignment, it would be useful to know > > the > > > > lag > > > > > >>> of > > > > > >>> > each > > > > > >>> > >> > > > > follower. Again, the leader knows this info. We > can > > > > > probably > > > > > >>> > >> include > > > > > >>> > >> > > that > > > > > >>> > >> > > > > info in TMR as well. > > > > > >>> > >> > > > > > > > > > >>> > >> > > > > 300. I think it's probably reasonable to restrict > > > > > >>> > >> AlterTopicRequest > > > > > >>> > >> > to > > > > > >>> > >> > > > > change only one thing at a time, i.e., either > > > > partitions, > > > > > >>> > >> replicas, > > > > > >>> > >> > > > replica > > > > > >>> > >> > > > > assignment or config. > > > > > >>> > >> > > > > > > > > > >>> > >> > > > > Thanks, > > > > > >>> > >> > > > > > > > > > >>> > >> > > > > Jun > > > > > >>> > >> > > > > > > > > > >>> > >> > > > > On Mon, Apr 13, 2015 at 10:56 AM, Andrii > Biletskyi < > > > > > >>> > >> > > > > andrii.bilets...@stealth.ly> wrote: > > > > > >>> > >> > > > > > > > > > >>> > >> > > > > > Jun, > > > > > >>> > >> > > > > > > > > > > >>> > >> > > > > > 404. Great, thanks! > > > > > >>> > >> > > > > > > > > > > >>> > >> > > > > > 500. If I understand correctly KAFKA-1367 says > ISR > > > > part > > > > > >>> of TMR > > > > > >>> > >> may > > > > > >>> > >> > > > > > be inconsistent. If so, then I believe all admin > > > > > commands > > > > > >>> but > > > > > >>> > >> > > > > describeTopic > > > > > >>> > >> > > > > > are not affected. Let me emphasize that it's > about > > > > > >>> AdminClient > > > > > >>> > >> > > > > operations, > > > > > >>> > >> > > > > > not about Wire Protocol requests. What I mean: > > > > > >>> > >> > > > > > To verify AdminClient.createTopic we will need > > > > > >>> (consistent) > > > > > >>> > >> > 'topics' > > > > > >>> > >> > > > set > > > > > >>> > >> > > > > > from TMR (we don't need isr) > > > > > >>> > >> > > > > > To verify alterTopic - again, probably 'topics' > > and > > > > > >>> 'assigned > > > > > >>> > >> > > > replicas' + > > > > > >>> > >> > > > > > configs > > > > > >>> > >> > > > > > To verify deleteTopic - only 'topics' > > > > > >>> > >> > > > > > To verify preferredReplica - 'leader', 'assigned > > > > > replicas' > > > > > >>> > >> > > > > > To verify reassignPartitions - 'assigned > > replicas' ? > > > > > (I'm > > > > > >>> not > > > > > >>> > >> sure > > > > > >>> > >> > > > about > > > > > >>> > >> > > > > > this one) > > > > > >>> > >> > > > > > If everything above is correct, then > > > > > >>> AdminClient.describeTopic > > > > > >>> > >> is > > > > > >>> > >> > the > > > > > >>> > >> > > > > only > > > > > >>> > >> > > > > > command under risk. We can actually workaround > it > > - > > > > find > > > > > >>> out > > > > > >>> > the > > > > > >>> > >> > > leader > > > > > >>> > >> > > > > > broker > > > > > >>> > >> > > > > > and ask TMR that leading broker to get > up-to-date > > > isr > > > > > >>> list. > > > > > >>> > >> > > > > > Bottom line: looks like 1367 is a separate > issue, > > > and > > > > is > > > > > >>> not a > > > > > >>> > >> > > blocker > > > > > >>> > >> > > > > for > > > > > >>> > >> > > > > > this > > > > > >>> > >> > > > > > KIP. I'm a bit concerned about adding new > requests > > > as > > > > a > > > > > >>> > >> must-have > > > > > >>> > >> > > part > > > > > >>> > >> > > > > > of this KIP when we don't know what we want to > > > include > > > > > to > > > > > >>> > those > > > > > >>> > >> > > > requests. > > > > > >>> > >> > > > > > > > > > > >>> > >> > > > > > Also, I'd like to write down the new > > > AlterTopicRequest > > > > > >>> > semantics > > > > > >>> > >> > (if > > > > > >>> > >> > > we > > > > > >>> > >> > > > > > decide > > > > > >>> > >> > > > > > to include replicas there and merge it with > > > > > >>> > >> > > ReassignPartitionsRequest) > > > > > >>> > >> > > > > > 300. AlterTopicRequest => [TopicName Partitions > > > > Replicas > > > > > >>> > >> > > > > ReplicaAssignment > > > > > >>> > >> > > > > > [AddedConfigEntry] [DeletedConfig]] > > > > > >>> > >> > > > > > The fields are resolved in this sequence: > > > > > >>> > >> > > > > > 1. Either partition or replicas is defined: > > > > > >>> > >> > > > > > ---1.1. ReplicaAssignment is not defined - > > generate > > > > > >>> automatic > > > > > >>> > >> > replica > > > > > >>> > >> > > > > > assignment > > > > > >>> > >> > > > > > for newly added partitions or for > > replicas > > > > > >>> parameter > > > > > >>> > >> > > > increased > > > > > >>> > >> > > > > > ---1.2. ReplicaAssignment is defined - increase > > > topic > > > > > >>> > >> partitions if > > > > > >>> > >> > > > > > 'partitions' defined, > > > > > >>> > >> > > > > > reassign partitions according to > > > > > >>> ReplicaAssignment > > > > > >>> > >> > > > > > 2. Neither partition nor replicas is defined: > > > > > >>> > >> > > > > > ---2.1. ReplicaAssignment is defined - it's a > > > reassign > > > > > >>> > replicas > > > > > >>> > >> > > request > > > > > >>> > >> > > > > > ---2.2. ReplicaAssignment is not defined - just > > > change > > > > > >>> topic > > > > > >>> > >> > configs > > > > > >>> > >> > > > > > 3. Config fields are handled always and > > > independently > > > > > from > > > > > >>> > >> > > > > > partitions+replicas/replicaAssingment > > > > > >>> > >> > > > > > A bit sophisticated, but should cover all cases. > > > > Another > > > > > >>> > option > > > > > >>> > >> - > > > > > >>> > >> > we > > > > > >>> > >> > > > can > > > > > >>> > >> > > > > > say you can define either partitions+replicas or > > > > > >>> > >> replicaAssignment. > > > > > >>> > >> > > > > > > > > > > >>> > >> > > > > > 300.5. There is also a new question related to > > > > > >>> > >> AlterTopicRequest - > > > > > >>> > >> > > > should > > > > > >>> > >> > > > > > we > > > > > >>> > >> > > > > > allow users multiple alter-topic instructions > for > > > one > > > > > >>> topic in > > > > > >>> > >> one > > > > > >>> > >> > > > batch? > > > > > >>> > >> > > > > > I think if we go this way, user will expect we > > > > optimize > > > > > >>> and > > > > > >>> > >> group > > > > > >>> > >> > > > > requests > > > > > >>> > >> > > > > > for one topic, but it will add a lot of burden, > > > > > especially > > > > > >>> > taken > > > > > >>> > >> > into > > > > > >>> > >> > > > > > account > > > > > >>> > >> > > > > > async semantics of the AlterTopicRequest. I'd > > rather > > > > > >>> return > > > > > >>> > some > > > > > >>> > >> > > error > > > > > >>> > >> > > > > > code, > > > > > >>> > >> > > > > > or ignore all but first. Thoughts? > > > > > >>> > >> > > > > > > > > > > >>> > >> > > > > > Thanks, > > > > > >>> > >> > > > > > Andrii Biletskyi > > > > > >>> > >> > > > > > > > > > > >>> > >> > > > > > On Mon, Apr 13, 2015 at 6:34 AM, Jun Rao < > > > > > >>> j...@confluent.io> > > > > > >>> > >> wrote: > > > > > >>> > >> > > > > > > > > > > >>> > >> > > > > > > Andrii, > > > > > >>> > >> > > > > > > > > > > > >>> > >> > > > > > > 404. Jay and I chatted a bit. We agreed to > leave > > > > > >>> > >> > createTopicRequest > > > > > >>> > >> > > > as > > > > > >>> > >> > > > > > > async for now. > > > > > >>> > >> > > > > > > > > > > > >>> > >> > > > > > > There is another thing. > > > > > >>> > >> > > > > > > > > > > > >>> > >> > > > > > > 500. Currently, we have this issue > (KAFKA-1367) > > > that > > > > > >>> the ISR > > > > > >>> > >> in > > > > > >>> > >> > the > > > > > >>> > >> > > > > > > metadata cache can be out of sync. The reason > is > > > > that > > > > > >>> ISR is > > > > > >>> > >> > really > > > > > >>> > >> > > > > > > maintained at the leader. We can potentially > > add a > > > > new > > > > > >>> > >> > > > > BrokerMetaRequest, > > > > > >>> > >> > > > > > > which will return useful stats specific to a > > > broker. > > > > > >>> Such > > > > > >>> > >> stats > > > > > >>> > >> > can > > > > > >>> > >> > > > > > include > > > > > >>> > >> > > > > > > (1) for each partition whose leader is on this > > > > broker, > > > > > >>> the > > > > > >>> > ISR > > > > > >>> > >> > and > > > > > >>> > >> > > > the > > > > > >>> > >> > > > > > lag > > > > > >>> > >> > > > > > > (in messages) for each of the followers, (2) > > space > > > > > used > > > > > >>> per > > > > > >>> > >> > > > partition, > > > > > >>> > >> > > > > > (3) > > > > > >>> > >> > > > > > > remaining space per log dir (not sure how easy > > it > > > is > > > > > to > > > > > >>> get > > > > > >>> > >> this > > > > > >>> > >> > > > info). > > > > > >>> > >> > > > > > If > > > > > >>> > >> > > > > > > we have this new request, we can probably > remove > > > the > > > > > ISR > > > > > >>> > part > > > > > >>> > >> > from > > > > > >>> > >> > > > TMR > > > > > >>> > >> > > > > > v1. > > > > > >>> > >> > > > > > > Currently, the producer/consumer client don't > > > really > > > > > >>> care > > > > > >>> > >> about > > > > > >>> > >> > > ISR. > > > > > >>> > >> > > > > The > > > > > >>> > >> > > > > > > admin client will then issue BrokerMetaRequest > > to > > > > find > > > > > >>> out > > > > > >>> > ISR > > > > > >>> > >> > and > > > > > >>> > >> > > > > other > > > > > >>> > >> > > > > > > stats. > > > > > >>> > >> > > > > > > > > > > > >>> > >> > > > > > > Thanks, > > > > > >>> > >> > > > > > > > > > > > >>> > >> > > > > > > Jun > > > > > >>> > >> > > > > > > > > > > > >>> > >> > > > > > > > > > > > >>> > >> > > > > > > On Tue, Apr 7, 2015 at 12:10 PM, Andrii > > Biletskyi > > > < > > > > > >>> > >> > > > > > > andrii.bilets...@stealth.ly> wrote: > > > > > >>> > >> > > > > > > > > > > > >>> > >> > > > > > > > Hi all, > > > > > >>> > >> > > > > > > > > > > > > >>> > >> > > > > > > > A summary of our discussion: > > > > > >>> > >> > > > > > > > > > > > > >>> > >> > > > > > > > 201. Q: Cluster updates in backward > compatible > > > > way. > > > > > >>> > >> > > > > > > > A: Add topicConfigs map property and > > > > change > > > > > >>> > >> > constructor, > > > > > >>> > >> > > > this > > > > > >>> > >> > > > > > > > shouldn't break Consumer/Producer since TMR > is > > > > used > > > > > in > > > > > >>> > >> > > > NetworkClient, > > > > > >>> > >> > > > > > > > not directly by Consumer/Producer. > > > > > >>> > >> > > > > > > > > > > > > >>> > >> > > > > > > > 300. Q: Can we merge AlterTopic and > > > > > ReassignPartitions > > > > > >>> > >> > requests? > > > > > >>> > >> > > > > > > > A: It looks like in terms of Wire > > > Protocol > > > > > >>> > partition > > > > > >>> > >> > > > > > reassignment > > > > > >>> > >> > > > > > > > can > > > > > >>> > >> > > > > > > > be just an application of AlterTopicRequest. > > On > > > > the > > > > > >>> > >> AdminClient > > > > > >>> > >> > > > side > > > > > >>> > >> > > > > we > > > > > >>> > >> > > > > > > can > > > > > >>> > >> > > > > > > > split this into two separate methods, if > > needed. > > > > > >>> > >> > > > > > > > > > > > > >>> > >> > > > > > > > Some additional items that were added today: > > > > > >>> > >> > > > > > > > 400. Q: Do we need ListTopicsRequest, we can > > use > > > > TMR > > > > > >>> for > > > > > >>> > >> this > > > > > >>> > >> > > > > purpose. > > > > > >>> > >> > > > > > > > A: The answer depends on whether we > > can > > > > > >>> leverage > > > > > >>> > >> > > ListTopics > > > > > >>> > >> > > > > in > > > > > >>> > >> > > > > > > > consumer/producer, because the only benefit > of > > > the > > > > > >>> > >> ListTopics > > > > > >>> > >> > is > > > > > >>> > >> > > > > > > > performance > > > > > >>> > >> > > > > > > > optimization, otherwise it doesn't worth it. > > > > > >>> > >> > > > > > > > > > > > > >>> > >> > > > > > > > 401. Q: AdminClient.topicExists - do we need > > it? > > > > > >>> > >> > > > > > > > A: AdminClient.listTopics should be > > > > > >>> sufficient. > > > > > >>> > >> > > > > > > > > > > > > >>> > >> > > > > > > > 402. Review AdminClient API and use separate > > > > objects > > > > > >>> > >> instead of > > > > > >>> > >> > > > > > > collections > > > > > >>> > >> > > > > > > > for methods arguments / return results (e.g. > > > > > >>> > >> preferredReplica > > > > > >>> > >> > > > accepts > > > > > >>> > >> > > > > > > > Map<String, List<Int>> > > > > > >>> > >> > > > > > > > might be better to add separate java object) > > > > > >>> > >> > > > > > > > > > > > > >>> > >> > > > > > > > 403. Error number in KIP-4 (100x). Currently > > > there > > > > > >>> are no > > > > > >>> > >> > > dedicated > > > > > >>> > >> > > > > > > ranges > > > > > >>> > >> > > > > > > > for errors, we will probably continue doing > it > > > > this > > > > > >>> way. > > > > > >>> > >> > > > > > > > > > > > > >>> > >> > > > > > > > 404. There were some concerns again about > the > > > > > >>> asynchronous > > > > > >>> > >> > > > semantics > > > > > >>> > >> > > > > > > > of the admin requests. Jun and Jay to agree > > > > > >>> separately how > > > > > >>> > >> we > > > > > >>> > >> > > want > > > > > >>> > >> > > > > > > > to handle it. > > > > > >>> > >> > > > > > > > > > > > > >>> > >> > > > > > > > Please add / correct me if I missed > something. > > > > > >>> > >> > > > > > > > > > > > > >>> > >> > > > > > > > Thanks, > > > > > >>> > >> > > > > > > > Andrii Biletskyi > > > > > >>> > >> > > > > > > > > > > > > >>> > >> > > > > > > > On Tue, Apr 7, 2015 at 4:11 PM, Andrii > > > Biletskyi < > > > > > >>> > >> > > > > > > > andrii.bilets...@stealth.ly> wrote: > > > > > >>> > >> > > > > > > > > > > > > >>> > >> > > > > > > > > Hi all, > > > > > >>> > >> > > > > > > > > > > > > > >>> > >> > > > > > > > > I wasn't able to send email to our thread > > (it > > > > says > > > > > >>> we > > > > > >>> > >> > exceeded > > > > > >>> > >> > > > > > message > > > > > >>> > >> > > > > > > > > size limit :)). > > > > > >>> > >> > > > > > > > > So I'm starting the new one. > > > > > >>> > >> > > > > > > > > > > > > > >>> > >> > > > > > > > > > > > > > >>> > >> > > > > > > > > Jun, > > > > > >>> > >> > > > > > > > > > > > > > >>> > >> > > > > > > > > Thanks again for the review. Answering > your > > > > > >>> comments: > > > > > >>> > >> > > > > > > > > > > > > > >>> > >> > > > > > > > > 201. I'm not sure I understand how can we > > > evolve > > > > > >>> Cluster > > > > > >>> > >> in > > > > > >>> > >> > > > > backward > > > > > >>> > >> > > > > > > > > compatible way. In my understanding topic > > > > configs > > > > > >>> are > > > > > >>> > not > > > > > >>> > >> > > > returned > > > > > >>> > >> > > > > > > > > currently - > > > > > >>> > >> > > > > > > > > in TMR_V0. Thus we need to add new > property > > in > > > > > >>> Cluster - > > > > > >>> > >> smth > > > > > >>> > >> > > > like > > > > > >>> > >> > > > > > > > > private final Map<String, > List<ConfigEntry>> > > > > > >>> > topicConfigs; > > > > > >>> > >> > > > > > > > > Which affects Cluster constructor, which > is > > > used > > > > > in > > > > > >>> > >> > > > > > > MetadataResponse.java > > > > > >>> > >> > > > > > > > > - not sure whether we can change Cluster > > this > > > > way > > > > > so > > > > > >>> > it's > > > > > >>> > >> > > > backward > > > > > >>> > >> > > > > > > > > compatible, > > > > > >>> > >> > > > > > > > > I suppose - no. > > > > > >>> > >> > > > > > > > > Let me know if I'm missing something... > > > > > >>> > >> > > > > > > > > > > > > > >>> > >> > > > > > > > > 300. Hm, so you propose to give up > > > > > >>> ReassignPartition as > > > > > >>> > a > > > > > >>> > >> > > > separate > > > > > >>> > >> > > > > > > > command? > > > > > >>> > >> > > > > > > > > That's interesting, let's discuss it today > > in > > > > > >>> detail. > > > > > >>> > >> > > > > > > > > Two small points here: 1) afaik currently > > > > > >>> > >> replica-assignment > > > > > >>> > >> > > > > argument > > > > > >>> > >> > > > > > > in > > > > > >>> > >> > > > > > > > > alter-topic > > > > > >>> > >> > > > > > > > > (from TopicCommand) doesn't reassign > > > partitions, > > > > > it > > > > > >>> lets > > > > > >>> > >> > users > > > > > >>> > >> > > > > > specify > > > > > >>> > >> > > > > > > > > replica > > > > > >>> > >> > > > > > > > > assignment for newly added partition > > > > > >>> > >> (AddPartitionsListener) > > > > > >>> > >> > 2) > > > > > >>> > >> > > > > > > > > ReassignPartitions > > > > > >>> > >> > > > > > > > > command involves a little bit more than > just > > > > > >>> changing > > > > > >>> > >> replica > > > > > >>> > >> > > > > > > assignment > > > > > >>> > >> > > > > > > > > in zk. > > > > > >>> > >> > > > > > > > > People are struggling with partition > > > > reassignment > > > > > >>> so I > > > > > >>> > >> think > > > > > >>> > >> > > it's > > > > > >>> > >> > > > > > good > > > > > >>> > >> > > > > > > to > > > > > >>> > >> > > > > > > > > have explicit > > > > > >>> > >> > > > > > > > > request for it so we can handle it > > > > independently, > > > > > >>> also > > > > > >>> > as > > > > > >>> > >> > > > mentioned > > > > > >>> > >> > > > > > > > > earlier we'll > > > > > >>> > >> > > > > > > > > probably add in future some better status > > > check > > > > > >>> > procedure > > > > > >>> > >> for > > > > > >>> > >> > > > this > > > > > >>> > >> > > > > > > > > long-running > > > > > >>> > >> > > > > > > > > request. > > > > > >>> > >> > > > > > > > > > > > > > >>> > >> > > > > > > > > 301. Good point. We also agreed to use > > > clientId > > > > as > > > > > >>> an > > > > > >>> > >> > > identifier > > > > > >>> > >> > > > > for > > > > > >>> > >> > > > > > > the > > > > > >>> > >> > > > > > > > > requester - > > > > > >>> > >> > > > > > > > > whether it's a producer client or admin. I > > > think > > > > > we > > > > > >>> can > > > > > >>> > go > > > > > >>> > >> > with > > > > > >>> > >> > > > > -1/-1 > > > > > >>> > >> > > > > > > > > approach. > > > > > >>> > >> > > > > > > > > > > > > > >>> > >> > > > > > > > > 302. Again, as said above > replica-assignment > > > in > > > > > >>> > >> alter-topic > > > > > >>> > >> > > > doesn't > > > > > >>> > >> > > > > > > > change > > > > > >>> > >> > > > > > > > > replica assignment of the existing > > partitions. > > > > But > > > > > >>> we > > > > > >>> > can > > > > > >>> > >> > think > > > > > >>> > >> > > > > about > > > > > >>> > >> > > > > > > it > > > > > >>> > >> > > > > > > > > in general - > > > > > >>> > >> > > > > > > > > how can we change topic replication > factor? > > > The > > > > > >>> easy way > > > > > >>> > >> - we > > > > > >>> > >> > > > don't > > > > > >>> > >> > > > > > > need > > > > > >>> > >> > > > > > > > > it, > > > > > >>> > >> > > > > > > > > we can use reassign partitions. Not sure > > > whether > > > > > we > > > > > >>> want > > > > > >>> > >> to > > > > > >>> > >> > add > > > > > >>> > >> > > > > > special > > > > > >>> > >> > > > > > > > > logic > > > > > >>> > >> > > > > > > > > to treat this case... > > > > > >>> > >> > > > > > > > > > > > > > >>> > >> > > > > > > > > 303.1. Okay, sure, I'll generalize > > > > topicExists(). > > > > > >>> > >> > > > > > > > > 303.2. I think, yes, we need separate > verify > > > > > >>> methods as > > > > > >>> > a > > > > > >>> > >> > > status > > > > > >>> > >> > > > > > check > > > > > >>> > >> > > > > > > > > procedure, > > > > > >>> > >> > > > > > > > > because respective requests are long > > running, > > > > and > > > > > >>> CLI > > > > > >>> > user > > > > > >>> > >> > > > > > potentially > > > > > >>> > >> > > > > > > > > will > > > > > >>> > >> > > > > > > > > asynchronously call reassign-partitions, > do > > > > other > > > > > >>> stuff > > > > > >>> > >> (e.g. > > > > > >>> > >> > > > > create > > > > > >>> > >> > > > > > > > > topics) periodically > > > > > >>> > >> > > > > > > > > checking status of the partition > > reassignment. > > > > > >>> Anyway > > > > > >>> > >> we'll > > > > > >>> > >> > > have > > > > > >>> > >> > > > to > > > > > >>> > >> > > > > > > > > implement this logic > > > > > >>> > >> > > > > > > > > because it's a criterion of the completed > > > Future > > > > > of > > > > > >>> the > > > > > >>> > >> > > reassign > > > > > >>> > >> > > > > > > > > partitions async > > > > > >>> > >> > > > > > > > > call, we'll to have make those methods > just > > > > > public. > > > > > >>> > >> > > > > > > > > 303.3. If preferredReplica returns > > > > > >>> Future<Map<String, > > > > > >>> > >> > Errors>> > > > > > >>> > >> > > > than > > > > > >>> > >> > > > > > > what > > > > > >>> > >> > > > > > > > > is an error > > > > > >>> > >> > > > > > > > > in terms of preferred replica leader > > election? > > > > As > > > > > I > > > > > >>> > >> > understand > > > > > >>> > >> > > we > > > > > >>> > >> > > > > can > > > > > >>> > >> > > > > > > > only > > > > > >>> > >> > > > > > > > > check > > > > > >>> > >> > > > > > > > > whether it has succeeded (leader == > AR.head) > > > or > > > > > not > > > > > >>> > >> _yet_. > > > > > >>> > >> > > > > > > > > > > > > > >>> > >> > > > > > > > > 304.1. Sure, let's add timeout to > > > > > reassign/preferred > > > > > >>> > >> replica. > > > > > >>> > >> > > > > > > > > 304.2. This can be finalized after we > > discuss > > > > 300. > > > > > >>> > >> > > > > > > > > > > > > > >>> > >> > > > > > > > > 305. Misprints - thanks, fixed. > > > > > >>> > >> > > > > > > > > > > > > > >>> > >> > > > > > > > > Thanks, > > > > > >>> > >> > > > > > > > > Andrii Biletskyi > > > > > >>> > >> > > > > > > > > > > > > > >>> > >> > > > > > > > > > > > > >>> > >> > > > > > > > > > > > >>> > >> > > > > > > > > > > >>> > >> > > > > > > > > > >>> > >> > > > > > > > > >>> > >> > > > > > > > >>> > >> > > > > > > >>> > >> > > > > > >>> > > > > > > > >>> > > > > > > > >>> > > > > > > >>> > > > > > >> > > > > > >> > > > > > > > > > > > > > > > > > > > > >