Hi Igor,

Thanks for your replies on points 21-25. Those all LGTM. See below for a
further thought on the meta.properties upgrade.

Kind regards,

Tom


On Fri, 13 Jan 2023 at 18:09, Igor Soarez <soa...@apple.com.invalid> wrote:

> Hi Tom,
>
> Thank you for having another look.
>
> 20. Upon a downgrade to a Kafka version that runs the current
> "version == 1" assertion, then yes — a downgrade would not be possible
> without first updating (manually) the meta.properties files back
> to the previous version.
>
> We could prevent this issue if we consider the new fields optional and
> not bump the version, but this takes away from the value of having
> a version in the first place. I'm not sure this would be a good trade-off.
>
> IIUC in KIP-866 this issue was addressed by delaying the writing of
> meta.properties under the new version until after the full transition to
> KRaft mode is finished and relaxing the expectations around versioning.
> We could take a similar approach by relaxing the current
> expectation that the version must strictly be 1.
>
> Currently, MetaProperties#parse() checks that the version is 1 but only
> requires that the cluster.id and node.id properties are present in the
> file.
> We can relax the requirement, ensuring only that the version is at least 1,
> and land that change before this KIP is implemented – at least one release
> beforehand. Then there will be some previous version for which a downgrade
> is possible, even after the version has been bumped to 2 and the two new
> properties are introduced. WDYT?
>

Doing that would delay JBOD support and mean existing users of JBOD must
upgrade via that specific version.

I think I favour a different way of relaxing the expectation around
versioning:
1. In the new broker, update the meta.properties with the new keys but
keeping version=1 (AFAICS this won't cause problems if we roll back to old
broker binaries at any point prior to the completion of KRaft upgrade,
because the code ignores unknown keys).
2. When the KRaft upgrade is complete (i.e. once a broker has written the
final ZkMigrationRecord), update meta.properties to version=2, where the
new keys are required.

That would mean that during the upgrade the additional validation of
meta.properties is missing. I think that's acceptable because the extra
protection provided isn't something supported by ZK-based JBOD today, so
there is no loss of functionality.

Wdty?


> 21. I think in this case we can reuse LOG_DIR_NOT_FOUND (code 57) and
> perhaps
> reword its description slightly to be a bit more generic.
> I've updated the KIP to mention this.
>
> 22. This check should be enforced by the broker, as the controller cannot
> know whether the current replica to logdir assignment is correct.
> Currently, the controller does not unfence a broker that does not "want"
> to be unfenced. The broker should not want to be unfenced while there
> are still mismatching (wrong or missing) replica to logdir assignments.
> I updated the KIP to clarify this.
>
> So a reason wouldn't be included in the heartbeat response but your point
> about diagnosing is interesting. So I've updated the KIP to propose a
> new broker metric reporting the number of mismatching replica assignments
> - i.e. replica assignments that are either missing or incorrect in the
> cluster metadata, as observed by the broker — until this metric is zero,
> the broker would not want to be unfenced. Let me know what you think.
>
> 23. I don't see why not. We should be able to extend the system tests
> proposed in KIP-866 to cover this.
>
> 24. Correct. Updated to remove the typo.
>
> 25. Good point. Updated to align the names.
>
>
> Thank you,
>
> --
> Igor
>
>
>

Reply via email to