Jun,

404. Great, thanks!

500. If I understand correctly KAFKA-1367 says ISR part of TMR may
be inconsistent. If so, then I believe all admin commands but describeTopic
are not affected. Let me emphasize that it's about AdminClient operations,
not about Wire Protocol requests. What I mean:
To verify AdminClient.createTopic we will need (consistent) 'topics' set
from TMR (we don't need isr)
To verify alterTopic - again, probably 'topics' and 'assigned replicas' +
configs
To verify deleteTopic - only 'topics'
To verify preferredReplica - 'leader', 'assigned replicas'
To verify reassignPartitions - 'assigned replicas' ? (I'm not sure about
this one)
If everything above is correct, then AdminClient.describeTopic is the only
command under risk. We can actually workaround it - find out the leader
broker
and ask TMR that leading broker to get up-to-date isr list.
Bottom line: looks like 1367 is a separate issue, and is not a blocker for
this
KIP. I'm a bit concerned about adding new requests as a must-have part
of this KIP when we don't know what we want to include to those requests.

Also, I'd like to write down the new AlterTopicRequest semantics (if we
decide
to include replicas there and merge it with ReassignPartitionsRequest)
300. AlterTopicRequest => [TopicName Partitions Replicas ReplicaAssignment
[AddedConfigEntry] [DeletedConfig]]
The fields are resolved in this sequence:
1. Either partition or replicas is defined:
---1.1. ReplicaAssignment is not defined - generate automatic replica
assignment
          for newly added partitions or for replicas parameter increased
---1.2. ReplicaAssignment is defined - increase topic partitions if
'partitions' defined,
          reassign partitions according to ReplicaAssignment
2. Neither partition nor replicas is defined:
---2.1. ReplicaAssignment is defined - it's a reassign replicas request
---2.2. ReplicaAssignment is not defined - just change topic configs
3. Config fields are handled always and independently from
    partitions+replicas/replicaAssingment
A bit sophisticated, but should cover all cases. Another option - we can
say you can define either partitions+replicas or replicaAssignment.

300.5. There is also a new question related to AlterTopicRequest - should we
allow users multiple alter-topic instructions for one topic in one batch?
I think if we go this way, user will expect we optimize and group requests
for one topic, but it will add a lot of burden, especially taken into
account
async semantics of the AlterTopicRequest. I'd rather return some error code,
or ignore all but first. Thoughts?

Thanks,
Andrii Biletskyi

On Mon, Apr 13, 2015 at 6:34 AM, Jun Rao <j...@confluent.io> wrote:

> Andrii,
>
> 404. Jay and I chatted a bit. We agreed to leave createTopicRequest as
> async for now.
>
> There is another thing.
>
> 500. Currently, we have this issue (KAFKA-1367) that the ISR in the
> metadata cache can be out of sync. The reason is that ISR is really
> maintained at the leader. We can potentially add a new BrokerMetaRequest,
> which will return useful stats specific to a broker. Such stats can include
> (1) for each partition whose leader is on this broker, the ISR and the lag
> (in messages) for each of the followers, (2) space used per partition, (3)
> remaining space per log dir (not sure how easy it is to get this info). If
> we have this new request, we can probably remove the ISR part from TMR v1.
> Currently, the producer/consumer client don't really care about ISR. The
> admin client will then issue BrokerMetaRequest to find out ISR and other
> stats.
>
> Thanks,
>
> Jun
>
>
> On Tue, Apr 7, 2015 at 12:10 PM, Andrii Biletskyi <
> andrii.bilets...@stealth.ly> wrote:
>
> > Hi all,
> >
> > A summary of our discussion:
> >
> > 201. Q: Cluster updates in backward compatible way.
> >         A: Add topicConfigs map property and change constructor, this
> > shouldn't break Consumer/Producer since TMR is used in NetworkClient,
> > not directly by Consumer/Producer.
> >
> > 300. Q: Can we merge AlterTopic and ReassignPartitions requests?
> >         A: It looks like in terms of Wire Protocol partition reassignment
> > can
> > be just an application of AlterTopicRequest. On the AdminClient side we
> can
> > split this into two separate methods, if needed.
> >
> > Some additional items that were added today:
> > 400. Q: Do we need ListTopicsRequest, we can use TMR for this purpose.
> >         A: The answer depends on whether we can leverage ListTopics in
> > consumer/producer, because the only benefit of the ListTopics is
> > performance
> > optimization, otherwise it doesn't worth it.
> >
> > 401. Q: AdminClient.topicExists - do we need it?
> >         A: AdminClient.listTopics should be sufficient.
> >
> > 402. Review AdminClient API and use separate objects instead of
> collections
> > for methods arguments / return results (e.g. preferredReplica accepts
> > Map<String, List<Int>>
> > might be better to add separate java object)
> >
> > 403. Error number in KIP-4 (100x). Currently there are no dedicated
> ranges
> > for errors, we will probably continue doing it this way.
> >
> > 404. There were some concerns again about the asynchronous semantics
> > of the admin requests. Jun and Jay to agree separately how we want
> > to handle it.
> >
> > Please add / correct me if I missed something.
> >
> > Thanks,
> > Andrii Biletskyi
> >
> > On Tue, Apr 7, 2015 at 4:11 PM, Andrii Biletskyi <
> > andrii.bilets...@stealth.ly> wrote:
> >
> > > Hi all,
> > >
> > > I wasn't able to send email to our thread (it says we exceeded message
> > > size limit :)).
> > > So I'm starting the new one.
> > >
> > >
> > > Jun,
> > >
> > > Thanks again for the review. Answering your comments:
> > >
> > > 201. I'm not sure I understand how can we evolve Cluster in backward
> > > compatible way. In my understanding topic configs are not returned
> > > currently -
> > > in TMR_V0. Thus we need to add new property in Cluster - smth like
> > > private final Map<String, List<ConfigEntry>> topicConfigs;
> > > Which affects Cluster constructor, which is used in
> MetadataResponse.java
> > > - not sure whether we can change Cluster this way so it's backward
> > > compatible,
> > > I suppose - no.
> > > Let me know if I'm missing something...
> > >
> > > 300. Hm, so you propose to give up ReassignPartition as a separate
> > command?
> > > That's interesting, let's discuss it today in detail.
> > > Two small points here: 1) afaik currently replica-assignment argument
> in
> > > alter-topic
> > > (from TopicCommand) doesn't reassign partitions, it lets users specify
> > > replica
> > > assignment for newly added partition (AddPartitionsListener) 2)
> > > ReassignPartitions
> > > command involves a little bit more than just changing replica
> assignment
> > > in zk.
> > > People are struggling with partition reassignment so I think it's good
> to
> > > have explicit
> > > request for it so we can handle it independently, also as mentioned
> > > earlier we'll
> > > probably add in future some better status check procedure for this
> > > long-running
> > > request.
> > >
> > > 301. Good point. We also agreed to use clientId as an identifier for
> the
> > > requester -
> > > whether it's a producer client or admin. I think we can go with -1/-1
> > > approach.
> > >
> > > 302. Again, as said above replica-assignment in alter-topic doesn't
> > change
> > > replica assignment of the existing partitions. But we can think about
> it
> > > in general -
> > > how can we change topic replication factor? The easy way - we don't
> need
> > > it,
> > > we can use reassign partitions. Not sure whether we want to add special
> > > logic
> > > to treat this case...
> > >
> > > 303.1. Okay, sure, I'll generalize topicExists().
> > > 303.2. I think, yes, we need separate verify methods as a status check
> > > procedure,
> > > because respective requests are long running, and CLI user potentially
> > > will
> > > asynchronously call reassign-partitions, do other stuff (e.g. create
> > > topics) periodically
> > > checking status of the partition reassignment. Anyway we'll have to
> > > implement this logic
> > > because it's a criterion of the completed Future of the reassign
> > > partitions async
> > > call, we'll to have make those methods just public.
> > > 303.3. If preferredReplica returns Future<Map<String, Errors>> than
> what
> > > is an error
> > > in terms of preferred replica leader election? As I understand we can
> > only
> > > check
> > > whether it has succeeded (leader == AR.head)  or not _yet_.
> > >
> > > 304.1. Sure, let's add timeout to reassign/preferred replica.
> > > 304.2. This can be finalized after we discuss 300.
> > >
> > > 305. Misprints - thanks, fixed.
> > >
> > > Thanks,
> > > Andrii Biletskyi
> > >
> >
>

Reply via email to