I had an offline discussion with Ethan regarding a couple of potential issues:
1. *New Client with ozone.fs.hsync.enabled*: If a new client with ozone.fs.hsync.enabled sends an hsync request to an old OM, the request is sent as a KeyCommit request with additional fields for hsync. The old OM, not recognizing these fields, would treat the request as a normal KeyCommit (i.e., closing the key). The outcome of this is uncertain. 2. *Downgrade Scenario*: If a cluster is upgraded from Ozone 1.4 to 1.5 and the feature flag is enabled, Ozone would allow hsync operations before the upgrade is finalized. This is problematic because version 1.4 does not fully support this feature. Both issues can be addressed with new layout versions. I will open JIRA tickets to include these fixes in the next release, version 1.5.0. I hope this explanation makes sense. On Tue, Jul 30, 2024 at 10:20 AM Ethan Rose <er...@apache.org> wrote: > Hi Wei-Chiu, I'm still unclear on the compatibility requirements of this > feature. > > Feature flags are not a substitute for layout versions. Layout versions are > monotonic: they make sure once something is enabled it cannot be disabled > via a downgrade (the downgraded cluster will fail to start). Feature flags > are not monotonic. If an old version defaults to false and a new version > defaults to true, the flag will get reverted on downgrade while data > written while it was true will still be present. Whether or not this causes > problems is dependent on the implementation of the feature. > > Can you clarify what disk and protocol changes are part of this feature on > OM and Datanodes? Then we can work out what is actually required for > compatibility. > > On Tue, Jul 30, 2024 at 11:43 AM Wei-Chiu Chuang <weic...@apache.org> > wrote: > > > hi Ethan, > > > > The OM layout version change was for rejecting hsync requests to OM until > > the upgrade completes. > > The "HSYC" layout version was shipped in Ozone 1.4.0. > > IIRC I did touch upon this in the release vote thread, that because > there's > > a feature flag ozone.fs.hsync.enabled which disables the feature > entirely, > > the HSYNC layout version is essentially a no-op. We enabled the feature > > flag in the feature branch. If this is a concern we can make it disabled > by > > default again. > > > > On Mon, Jul 29, 2024 at 1:05 AM Ashish kumar <ashis.kr.2...@gmail.com> > > wrote: > > > > > Thanks Ethan for looking into this. > > > > > > >>It looks like this line in the merge checklist was not updated. > > > Updated the checklist. > > > > > > >> Can you go > > > into more details about the OM compatibility for lease recovery or > other > > > operations? > > > Lease recovery is completely redesigned and so both client and server > > needs > > > to be upgraded to make it work correctly. > > > In the old design, lease recovery was dependent only on the client and > > OM, > > > but now it involves datanode as well. > > > Apart from this compatibility is related to "Incremental chunk list" > > which > > > is already taken care of. > > > > > > >> DataNode layout version "HBASE_SUPPORT" > > > Yes this is only related to incremental chunk list. We will update > with a > > > more meaningful name as HBASE_INCREMENTAL_CHUNK_SUPPORT. > > > Also will update the merge checklist in more detail about this. > > > > > > Thanks, > > > Ashish > > > > > > On Fri, Jul 26, 2024 at 12:50 AM Ethan Rose <er...@apache.org> wrote: > > > > > > > Thanks for all the work on this. Looks good overall, just a few > > > > questions on compatibility: > > > > > > > > It looks like this line in the merge checklist was not updated. Can > you > > > go > > > > into more details about the OM compatibility for lease recovery or > > other > > > > operations? > > > > > > > > > A new OM version number was introduced to prevent new client > sending > > > > > atomic key overwrite request to old OM which does not support this > > > > feature. > > > > > > > > > > > > Additionally, new DataNode layout version "HBASE_SUPPORT" was added. > > > > > > > > Can you add some details about what changes on the disk layout after > > this > > > > feature is finalized? Is this related to the incremental chunk list > or > > > > something more? Perhaps a more descriptive layout feature name would > > help > > > > here as well. > > > > > > > > Ethan > > > > > > > > On Wed, Jul 24, 2024 at 11:43 PM Wei-Chiu Chuang <weic...@apache.org > > > > > > wrote: > > > > > > > > > I am +1 (binding) > > > > > On Tue, Jul 23, 2024 at 4:05 AM Ashish kumar < > > ashis.kr.2...@gmail.com> > > > > > wrote: > > > > > > > > > > > Hi Ozone developers, > > > > > > > > > > > > I would like to propose merging HDDS-7593 (HSync and lease > > recovery) > > > > > > feature branch into master. > > > > > > > > > > > > This feature is to support HSync and lease recovery, > > > > > > which enables HBase to run on Ozone. > > > > > > More details about the feature are present in design documents > > > attached > > > > > > in the below mentioned Ozone confluence page link. > > > > > > > > > > > > > > > > > > Checklist for feature branch merge: > > > > > > > > > > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/OZONE/Supporting+HSync+and+lease+recovery+-+HDDS-7593 > > > > > > > > > > > > Feature Jira Link: > > > > > > https://issues.apache.org/jira/browse/HDDS-7593 > > > > > > > > > > > > This vote will be open for at least a week. > > > > > > > > > > > > Thanks, > > > > > > Ashish Kumar > > > > > > > > > > > > > > > > > > > > >