Hi Colin,

Thanks for the suggestions. It is a good idea to refer to the '__data_version__' as just 'epoch', to avoid any confusion. However, note that this is not the same as the broker epoch. The main distinction is that this epoch is bumped by the controller whenever a modification to the finalized feature versions is persisted into ZK.
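To make the intended epoch semantics concrete, here is a minimal sketch of how a reader of the finalized features metadata might use the epoch to keep only the most recent copy. This is an illustration, not code from the KIP: the class names, the `maybe_update` method, and the shape of the `features` dictionary are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class FinalizedFeatures:
    """Hypothetical in-memory view of the finalized feature metadata."""
    # Assumed to be bumped by the controller each time a change is persisted to ZK.
    epoch: int
    # Illustrative shape only: feature name -> finalized version.
    features: Dict[str, int] = field(default_factory=dict)


class FeatureCache:
    """Hypothetical reader-side cache that keeps the copy with the largest epoch."""

    def __init__(self) -> None:
        self.current: Optional[FinalizedFeatures] = None

    def maybe_update(self, incoming: FinalizedFeatures) -> bool:
        """Accept `incoming` only if it is strictly newer than the cached copy."""
        if self.current is None or incoming.epoch > self.current.epoch:
            self.current = incoming
            return True
        # Stale or duplicate metadata (same or smaller epoch) is ignored.
        return False
```

With this sketch, a reader that sees eventually consistent copies out of order still converges on the copy with the largest epoch.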

I have updated the KIP to use the new schema for the '/features' ZK node:

- We use 2 separate fields, 'epoch' and 'version'. The latter describes changes to the overall schema of the data that is written to ZooKeeper in the '/features' node.
- We don't have separate header and data sections; I have combined these so that we have just 1 dictionary containing both.

Here is a link to the updated section:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-584%3A+Versioning+scheme+for+features#KIP-584:Versioningschemeforfeatures-Persistenceoffinalizedfeatureversions

Please feel free to let me know if you have any questions or concerns.

Cheers,
Kowshik

On Mon, Mar 30, 2020 at 4:53 PM Colin McCabe <cmcc...@apache.org> wrote:

> On Thu, Mar 26, 2020, at 19:24, Kowshik Prakasam wrote:
> > Hi Colin,
> >
> > Thanks for the feedback! I've changed the KIP to address your suggestions.
> > Please find below my explanation. Here is a link to KIP 584:
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-584%3A+Versioning+scheme+for+features
> > .
> >
> > 1. '__data_version__' is the version of the finalized feature metadata
> > (i.e. actual ZK node contents), while the '__schema_version__' is the
> > version of the schema of the data persisted in ZK. These serve different
> > purposes. '__data_version__' is useful mainly to clients during reads,
> > to differentiate between the 2 versions of eventually consistent 'finalized
> > features' metadata (i.e. larger metadata version is more recent).
> > '__schema_version__' provides an additional degree of flexibility, where if
> > we decide to change the schema for '/features' node in ZK (in the future),
> > then we can manage broker rollouts suitably (i.e.
> > serialization/deserialization of the ZK data can be handled safely).
>
> Hi Kowshik,
>
> If you're talking about a number that lets you know if data is more or
> less recent, we would typically call that an epoch, and not a version. For
> the ZK data structures, the word "version" is typically reserved for
> describing changes to the overall schema of the data that is written to
> ZooKeeper. We don't even really change the "version" of those schemas that
> much, since most changes are backwards-compatible. But we do include that
> version field just in case.
>
> I don't think we really need an epoch here, though, since we can just look
> at the broker epoch. Whenever the broker registers, its epoch will be
> greater than the previous broker epoch. And the newly registered data will
> take priority. This will be a lot simpler than adding a separate epoch
> system, I think.
>
> >
> > 2. Regarding admin client needing min and max information - you are right!
> > I've changed the KIP such that the Admin API also allows the user to read
> > 'supported features' from a specific broker. Please look at the section
> > "Admin API changes".
>
> Thanks.
>
> >
> > 3. Regarding the use of `long` vs `Long` - it was not deliberate. I've
> > improved the KIP to just use `long` at all places.
>
> Sounds good.
>
> >
> > 4. Regarding kafka.admin.FeatureCommand tool - you are right! I've updated
> > the KIP sketching the functionality provided by this tool, with some
> > examples. Please look at the section "Tooling support examples".
> >
> > Thank you!
>
>
> Thanks, Kowshik.
>
> cheers,
> Colin
>
>
> >
> >
> > Cheers,
> > Kowshik
> >
> > On Wed, Mar 25, 2020 at 11:31 PM Colin McCabe <cmcc...@apache.org> wrote:
> >
> > > Thanks, Kowshik, this looks good.
> > >
> > > In the "Schema" section, do we really need both __schema_version__ and
> > > __data_version__? Can we just have a single version field here?
> > >
> > > Shouldn't the Admin(Client) function have some way to get the min and max
> > > information that we're exposing as well?
> > > I guess we could have min, max,
> > > and current. Unrelated: is the use of Long rather than long deliberate
> > > here?
> > >
> > > It would be good to describe how the command line tool
> > > kafka.admin.FeatureCommand will work. For example the flags that it will
> > > take and the output that it will generate to STDOUT.
> > >
> > > cheers,
> > > Colin
> > >
> > >
> > > On Tue, Mar 24, 2020, at 17:08, Kowshik Prakasam wrote:
> > > > Hi all,
> > > >
> > > > I've opened KIP-584 which
> > > > is intended to provide a versioning scheme for features. I'd like to use
> > > > this thread to discuss the same. I'd appreciate any feedback on this. Here
> > > > is a link to KIP-584:
> > > >
> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-584%3A+Versioning+scheme+for+features
> > > > .
> > > >
> > > > Thank you!
> > > >
> > > >
> > > > Cheers,
> > > > Kowshik