Hey Jason, Yes, I also think the right solution is probably to include the expected_replication_factor in the per-topic znode and include this information in the UpdateMetadataRequest.
And I agree with Gwen and Ismael that it is reasonable to redefine URP as those partitions whose ISR set size < expected replication factor. Given that URP is being used for monitoring cluster availability and and reassignment progress, it seems that one of them will break depending on the URP definition. It seems better to keep the legitimate use-case, i.e. monitoring cluster availability. Users who want to monitor reassignment progress probably should use the "kafka-reassign-partitions.sh --verify". Thanks, Dong On Fri, Aug 3, 2018 at 2:57 PM, Jason Gustafson <ja...@confluent.io> wrote: > @Dong > > Ugh, you are right. This is indeed trickier than I imagined. You could > argue that the problem here is that there is no single source of truth for > the expected replication factor. While a reassignment is in progress, the > reassignment itself has the expected replica set. Otherwise, it is stored > partition metadata itself. This is why manually deleting the reassignment > is problematic. We lose the expected replica set and we depend on users to > reinstall it. I guess I'm starting to think that the way we track > reassignment state in the controller is problematic. In addition to the > problems caused by deletion, we cannot easily change an existing > reassignment. > > High level what I'm thinking is that we need to move the pending > reassignment state out of the single znode and into the individual metadata > of the reassigned partitions so that there is a single source of truth for > the expected replicas (and hence the replication factor). This would also > give us an easier mechanism to manage the batching of multiple > reassignments. Let me think on it a bit and see if I can come up with a > proposal. > > @Gwen, @Ismael > > That is fair. I also prefer to redefine URP if we think the compatibility > impact is a lower concern than continued misuse. > > -Jason > > > > On Fri, Aug 3, 2018 at 12:25 PM, Ismael Juma <ism...@juma.me.uk> wrote: > > > RIght, that was my thinking too. > > > > Ismael > > > > On Fri, Aug 3, 2018 at 12:04 PM Gwen Shapira <g...@confluent.io> wrote: > > > > > On Fri, Aug 3, 2018 at 11:23 AM, Jason Gustafson <ja...@confluent.io> > > > wrote: > > > > > > > Hey Ismael, > > > > > > > > Yeah, my initial inclination was to redefine URP as well. My only > doubt > > > was > > > > how it would affect existing tools which might depend on URPs to > track > > > the > > > > progress of a reassignment. I decided to be conservative in the end, > > but > > > > I'd reconsider if we think it is not a major concern. It is annoying > to > > > > need a new category. > > > > > > > > > > There are existing tools that use URP to track reassignment, but there > > are > > > many more tools that use URP for monitoring and alerting. If I > understand > > > Ismael's suggestion correctly, a re-definition will improve the > > reliability > > > of the monitoring tools (since there won't be false alerts in case of > > > re-assignment) without having to switch to a new metric. > > > > > > I think we should choose the proposal that improves the more common > usage > > > of the metric, in this case, failure monitoring rather than > reassignment. > > > > > > > > > > > > > > About your question about storage in ZK, I can't think of anything > > > > additional that we need. Probably the main difficulty is getting > access > > > to > > > > the replication factor in the topic utility. My basic thought was > just > > to > > > > collect the URPs (as we know them today) and use the config API to > > > > partition them based on the replication factor. Do you see any > problems > > > > with this? > > > > > > > > -Jason > > > > > > > > > > > > On Thu, Aug 2, 2018 at 12:14 PM, Ismael Juma <ism...@juma.me.uk> > > wrote: > > > > > > > > > Thanks Jason. This is definitely a pain point. I actually prefer > the > > > > option > > > > > to redefine what under-replicated means (currently under rejected > > > > > alternatives). Also, do we need to make changes to what we store in > > ZK? > > > > If > > > > > so, that should be in the KIP too. > > > > > > > > > > Ismael > > > > > > > > > > On Thu, Aug 2, 2018 at 11:45 AM Jason Gustafson < > ja...@confluent.io> > > > > > wrote: > > > > > > > > > > > Hey All, > > > > > > > > > > > > Another day, another KIP. This one is hopefully straightforward: > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-352% > > > > > 3A+Distinguish+URPs+caused+by+reassignment > > > > > > . > > > > > > Have a look and let me know what you think! > > > > > > > > > > > > Thanks, > > > > > > Jason > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > *Gwen Shapira* > > > Product Manager | Confluent > > > 650.450.2760 | @gwenshap > > > Follow us: Twitter <https://twitter.com/ConfluentInc> | blog > > > <http://www.confluent.io/blog> > > > > > >