Hi Prashant,

Thank you for the invitation. I'll be at the community meeting, the Friday sync. Is [1] the right venue and information for joining? Also, I'm already in the #ozone channel of the Apache Slack, so feel free to ask me any questions there.

[1] https://cwiki.apache.org/confluence/display/OZONE/Ozone+Community+Calls

On Tue, Feb 1, 2022 at 1:44 AM Prashant Pogde <ppo...@cloudera.com.invalid> wrote:
>
> Hi Kota,
>
> I went through your proposal and it looks good.
> Let us discuss this in our next ozone community meeting as well. Let us
> connect on apache slack.
>
> Regards,
> Prashant
>
> > On Jan 27, 2022, at 11:50 PM, Kota Uenishi <k...@preferred.jp> wrote:
> >
> > Hi Ozone dev,
> >
> > I once proposed a fix for HDDS-5905, but it's been a while. Now that our
> > cluster has become stable after some work, I have time to resume my
> > work on HDDS-5905. Having learned more about Ozone internals, I again
> > face a design decision on key formatting.
> >
> > Bharat once advised me [1] to use object IDs instead of the
> > transaction index (and instead of timestamps), to handle restarts and
> > cluster upgrades to Ratis. But that has a drawback on object overwrite,
> > so I came up with another design choice. The options are:
> >
> > 1. Use object IDs as the key in the delete table
> >    Pros: object IDs are used consistently in OM and are easy to pick up
> >    in a RocksDB batch.
> >    Cons:
> >    - When an object is overwritten, the object ID of the key is not
> >      updated, while the previous blocks of the overwritten key become
> >      eligible for deletion (see HDDS-5461 and HDDS-5656).
> >      Under this condition, there is a race in which blocks get lost and
> >      will never be collected. An example scenario:
> >
> >        key open oid=1
> >        key commit
> >        key open (overwrite) oid=1’ #<= oid must be updated on overwrite, or use update id
> >        key delete oid=1
> >        key commit
> >        key delete oid=1’ (<= overwritten and previous block gets leaked)
> >        deletion service deletes 1’
> >
> >      This behavior should be changed so that a new oid=2 is assigned on
> >      overwrite.
> >    - In addition to needing this fix, blocks are deleted in the order of
> >      key open, not in the order of key deletion. It's better than
> >      alphabetical order, but not perfect.
> >
> > 2. Use update IDs as the key in the delete table
> >    Pros: The design is cleaner and the order of block deletion will be
> >    correct.
> >    Cons:
> >    - Currently, the assignment of update IDs is somewhat fuzzy. In most
> >      places the raw transaction index is used, while in some places the
> >      object ID is used as-is, e.g. directory creation (see
> >      OMDirectoryCreateRequest.java).
> >    - A fix for the update ID assignment would be to prefix them with an
> >      epoch number as well as object ID, but most of the code that sets
> >      update IDs would have to be changed.
> >
> > I feel option 1 is easier but not quite correct, while option 2 is more
> > correct but requires wide-ranging changes. I updated my proposal
> > accordingly [2], so please let me know your thoughts on which to choose.
> > Also, my messy working branch can be found here [3].
> >
> > P.S. My fix for HDDS-5905 conflicts with and depends on HDDS-5656, because
> > it's also about key deletion and overwrite. I want to get it reviewed
> > and merged beforehand. It's a leftover task from HDDS-5461 and
> > should be merged for 1.3.
> >
> > [1] https://lists.apache.org/thread/79qgx598rv3qcojmzoxhc9ypkh1jj64y
> > [2] https://docs.google.com/document/d/1KeyhiE1i5SqRSgLy-pIOGW9X6mUYb8iYEkEoDAEQD9Q/edit#heading=h.nqxuhw78zsv7
> > [3] https://github.com/kuenishi/ozone/pull/1
> >
> > --
> > --
> > Kota UENISHI, Engineer
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscr...@ozone.apache.org
> > For additional commands, e-mail: dev-h...@ozone.apache.org

--
--
Kota UENISHI, Engineer
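
For anyone following the thread, the overwrite race in option 1 and the ordering argument for option 2 can be modeled with a small Python toy. This is only a sketch under assumptions: the delete table is a plain dict, the names `replay` and `collectable` are hypothetical, and the update ID values 10 and 12 are made up. None of it is Ozone's actual code.

```python
# Toy model of the delete table; not Ozone code. All names are hypothetical.

def replay(events, key_fn):
    """Insert each deleted key version into a dict-based delete table."""
    delete_table = {}
    for object_id, update_id, blocks in events:
        # A second insert under the same key silently replaces the first.
        delete_table[key_fn(object_id, update_id)] = blocks
    return delete_table


def collectable(delete_table):
    """Blocks the deletion service would see, in ascending key order."""
    return [b for key in sorted(delete_table) for b in delete_table[key]]


# Two versions of one key reach the delete table. The overwrite kept
# object ID 1, but each version carries its own update ID (values made up).
events = [
    (1, 10, ["block-A"]),  # blocks of the overwritten version
    (1, 12, ["block-B"]),  # blocks of the overwriting version
]

by_object_id = replay(events, lambda oid, uid: oid)
by_update_id = replay(events, lambda oid, uid: uid)

print(collectable(by_object_id))  # block-A is absent: it has leaked
print(collectable(by_update_id))  # both blocks, in deletion order
```

In this model, keying by object ID is only safe if an overwrite is given a fresh object ID, which is the fix proposed for option 1. Keying by update ID keeps both entries, provided update IDs are assigned consistently.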