Kowshik, thanks for the review! 7001. An enum sounds like a good idea here. Especially since setting Upgrade=false and Force=true doesn't really make sense. An enum would avoid confusing/invalid combinations of flags
7002. How about adding --force-downgrade as an alternative to the --downgrade argument? So, it would take the same arguments (list of feature:version), but use the DOWNGRADE_UNSAFE option in the RPC. 7003. Yes we will need the advanced CLI since we will need to only modify the "metadata.version" FF. I was kind of wondering if we might want separate sub-commands for the different operations instead of all under "update". E.g., "kafka-features.sh upgrade|downgrade|force-downgrade|delete|describe". 7004/7005. The intention in this KIP is that we bump the "metadata.version" liberally and that most upgrades are backwards compatible. We're relying on this for feature gating as well as indicating compatibility. The enum is indeed an implementation detail that is enforced by the controller when handling UpdateFeaturesRequest. As for lossy downgrades, this really only applies to the metadata log as we will lose some fields and records when downgrading to an older version. This is useful as an escape hatch for cases when a software upgrade has occurred, the feature flag was increased, and then bugs are discovered. Without the lossy downgrade scenario, we have no path back to a previous software version. As for the min/max finalized version, I'm not totally clear on cases where these would differ. I think for "metadata.version" we just want a single finalized version for the whole cluster (like we do for IBP). -David On Thu, Oct 14, 2021 at 1:59 PM José Armando García Sancio <jsan...@confluent.io.invalid> wrote: > On Tue, Oct 12, 2021 at 10:57 AM Colin McCabe <cmcc...@apache.org> wrote: > > > 11. For downgrades, it would be useful to describe how to determine the > > > downgrade process (generating new snapshot, propagating the snapshot, > etc) > > > has completed. We could block the UpdateFeature request until the > process > > > is completed. However, since the process could take time, the request > could > > > time out. Another way is through DescribeFeature and the server only > > > reports downgraded versions after the process is completed. > > > > Hmm.. I think we need to avoid blocking, since we don't know how long it > will take for all nodes to act on the downgrade request. After all, some > nodes may be down. > > > > But I agree we should have some way of knowing when the upgrade is done. > DescribeClusterResponse seems like the natural place to put information > about each node's feature level. While we're at it, we should also add a > boolean to indicate whether the given node is fenced. (This will always be > false for ZK mode, of course...) > > > > I agree. I think from the user's point of view, they would like to > know if it is safe to downgrade the software version of a specific > broker or controller. I think that it is safe to downgrade a broker or > controller when the metadata.version has been downgraded and the node > has generated a snapshot with the downgraded metadata.version. > > -- > -Jose > -- David Arthur