Re: [DISCUSS] 1.2.0 release

2021-08-26 Thread Ethan Rose
+1

On Thu, Aug 26, 2021 at 11:36 AM Vivek Ratnavel 
wrote:

> +1, thanks for volunteering Ethan!
>
>
> On Thu, Aug 26, 2021 at 11:35 AM Prashant Pogde
> 
> wrote:
>
> > +1, thanks for volunteering!
> >
> > > On Aug 25, 2021, at 11:28 AM, Aravindan Vijayan
> >  wrote:
> > >
> > > Hello all,
> > >
> > > > It has been a few months since the Ozone 1.1.0 release, and we have had a
> > > > number of features (SCM HA, FSO, Non rolling upgrades) that have been
> > > > merged to master since then. I think this would be a good time to work on
> > > > the Ozone 1.2.0 release. Also, Ethan has volunteered to take up the role of
> > > > the Release Manager for this release.
> > >
> > > Please let me know what you think.
> > > --
> > > Thanks & Regards,
> > > Aravindan
> >
> >
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@ozone.apache.org
> > For additional commands, e-mail: dev-h...@ozone.apache.org
> >
> >
>


[VOTE] Apache Ozone 1.2.0 RC0

2021-11-01 Thread Ethan Rose
Hello all,

As discussed earlier, I am calling for a vote on Apache Ozone 1.2.0 RC0.

468 Jiras were resolved as part of this release:
https://issues.apache.org/jira/browse/HDDS-5893?jql=project%20%3D%20HDDS%20AND%20status%20in%20(Resolved%2C%20Closed)%20AND%20fixVersion%20%3D%201.2.0

The RC0 tag can be found on Github at:
https://github.com/apache/ozone/releases/tag/ozone-1.2.0-RC0

The source and binary tarballs can be found at:
https://dist.apache.org/repos/dist/dev/ozone/1.2.0-rc0/

Maven artifacts are staged at:
https://repository.apache.org/content/repositories/orgapacheozone-1001/

The public key used to sign the artifacts can be found at:
https://dist.apache.org/repos/dist/dev/ozone/KEYS

The fingerprint of the key used to sign the artifacts is:
00D357B731EBDD66D7CA91B39146CE06B541B7E7

Thanks,
Ethan Rose
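
For anyone checking the candidate, the fingerprint comparison can be sketched in Python. This is a minimal sketch and not part of the release tooling: the helper names `normalize_fingerprint` and `fingerprints_match` are made up for illustration, and it assumes the fingerprint shown by your GPG client has been copied in by hand (it is commonly displayed in space-separated groups).

```python
import string

# Fingerprint published in the vote email above.
PUBLISHED_FINGERPRINT = "00D357B731EBDD66D7CA91B39146CE06B541B7E7"


def normalize_fingerprint(raw: str) -> str:
    """Drop whitespace and colons, upper-case, and validate 40 hex digits."""
    cleaned = "".join(ch for ch in raw if ch not in " \t\n:").upper()
    if len(cleaned) != 40 or any(ch not in string.hexdigits for ch in cleaned):
        raise ValueError(f"not a 40-digit hex fingerprint: {raw!r}")
    return cleaned


def fingerprints_match(reported: str, published: str = PUBLISHED_FINGERPRINT) -> bool:
    """Compare a hand-copied fingerprint with the published one."""
    return normalize_fingerprint(reported) == normalize_fingerprint(published)


# Hypothetical hand-copied value, in grouped display form.
reported = "00D3 57B7 31EB DD66 D7CA  91B3 9146 CE06 B541 B7E7"
print(fingerprints_match(reported))
```

A string comparison like this only guards against copy errors; the signature itself still has to be verified with GPG against the KEYS file.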


Re: [VOTE] Apache Ozone 1.2.0 RC0

2021-11-01 Thread Ethan Rose
Thanks for checking out the release candidate, Attila. I see the docs are
missing as well; I will make a new RC with the docs included.

On Mon, Nov 1, 2021 at 1:11 PM Attila Doroszlai 
wrote:

> Hi Ethan,
>
> Thanks for creating RC0.
>
> Signatures are OK for both src and bin tarball.
>
> Docs are missing from the binary archive.  Hugo is required for
> building the docs.
>
> -Attila
>
> On Mon, Nov 1, 2021 at 6:44 PM Ethan Rose 
> wrote:
> >
> > Hello all,
> >
> > As discussed earlier, I am calling for a vote on Apache Ozone 1.2.0 RC0.
> >
> > 468 Jiras were resolved as part of this release:
> >
> https://issues.apache.org/jira/browse/HDDS-5893?jql=project%20%3D%20HDDS%20AND%20status%20in%20(Resolved%2C%20Closed)%20AND%20fixVersion%20%3D%201.2.0
> >
> > The RC0 tag can be found on Github at:
> > https://github.com/apache/ozone/releases/tag/ozone-1.2.0-RC0
> >
> > The source and binary tarballs can be found at:
> > https://dist.apache.org/repos/dist/dev/ozone/1.2.0-rc0/
> >
> > Maven artifacts are staged at:
> > https://repository.apache.org/content/repositories/orgapacheozone-1001/
> >
> > The public key used to sign the artifacts can be found at:
> > https://dist.apache.org/repos/dist/dev/ozone/KEYS
> >
> > The fingerprint of the key used to sign the artifacts is:
> > 00D357B731EBDD66D7CA91B39146CE06B541B7E7
> >
> > Thanks,
> > Ethan Rose
>


[VOTE] Apache Ozone 1.2.0 RC1

2021-11-04 Thread Ethan Rose
Hello all,

Fixes have been applied on top of RC0 and I would now like to call a vote
for RC1. RC1 fixes the following issues with RC0:
- Docs are now included in the release tarball.
- HDDS-5908 has been included to fix a potential read failure for multipart
upload keys.
- HDDS-5933 has been included to fix a dependency issue that was causing an
unnecessary size increase in the release tarball.

472 Jiras were resolved as part of this release. RC0 fixed 468 Jiras. Each
of the two above-mentioned Jiras had an original fix Jira and a cherry-pick
Jira for the release branch, resulting in an increase of 4 fixes over RC0.
https://issues.apache.org/jira/browse/HDDS-5893?jql=project%20%3D%20HDDS%20AND%20status%20in%20(Resolved%2C%20Closed)%20AND%20fixVersion%20%3D%201.2.0

The RC1 tag can be found on Github at:
https://github.com/apache/ozone/releases/tag/ozone-1.2.0-RC1

The source and binary tarballs can be found at:
https://dist.apache.org/repos/dist/dev/ozone/1.2.0-rc1/

Maven artifacts are staged at:
https://repository.apache.org/content/repositories/orgapacheozone-1002/

The public key used to sign the artifacts can be found at:
https://dist.apache.org/repos/dist/dev/ozone/KEYS

The fingerprint of the key used to sign the artifacts is:
00D357B731EBDD66D7CA91B39146CE06B541B7E7

Thanks,
Ethan Rose
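
The Jira count in this email works out as follows. This is only a small arithmetic sketch; the variable names are illustrative.

```python
# Counts taken from the RC1 vote email.
rc0_resolved = 468
backported_fixes = ["HDDS-5908", "HDDS-5933"]

# Each backported fix has an original fix Jira plus a cherry-pick Jira.
jiras_per_fix = 2

rc1_resolved = rc0_resolved + len(backported_fixes) * jiras_per_fix
print(rc1_resolved)  # 472
```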


Re: [VOTE] Apache Ozone 1.2.0 RC1

2021-11-10 Thread Ethan Rose
Thanks for checking the RC, Attila. I have fixed the issue and will send out
a new release candidate soon.

On Mon, Nov 8, 2021 at 5:21 AM Attila Doroszlai 
wrote:

> Thanks Ethan for creating a new RC.  Looks mostly good to me:
>
> * Verified signatures, checksums
> * Verified source matches ozone-1.2.0-RC1 tag
> * Verified docs are present in binary tarball
> * Verified `ozone version` (version number, git revision)
> * Built from source without local Maven cache
> * Executed upgrade acceptance test with both source and binary
>
> However, I noticed one minor issue, in the output of `ozone version`:
>
> Source code repository https://github.com/errose28/hadoop-ozone.git -r
> 9ee6f1872dca8469057d3c7bf880931c0e7b7f3e
> Compiled by ethanrose on 2021-11-04T16:49Z
>
> It should point to the Apache repo.  Here's the output from the
> previous release for comparison:
>
> Source code repository https://github.com/apache/ozone.git -r
> f2406c4b4eb2b1295d06ec379cfe684ed9a16637
> Compiled by ppogde on 2021-04-09T04:10Z
>
> -Attila
>
> On Thu, Nov 4, 2021 at 8:26 PM Ethan Rose 
> wrote:
> >
> > Hello all,
> >
> > Fixes have been applied on top of RC0 and I would now like to call a vote
> > for RC1. RC1 fixes the following issues with RC0:
> > - Docs are now included in the release tarball.
> > - HDDS-5908 has been included to fix a potential read failure for multipart
> > upload keys.
> > - HDDS-5933 has been included to fix a dependency issue that was causing an
> > unnecessary size increase in the release tarball.
> >
> > 472 Jiras were resolved as part of this release. RC0 fixed 468 Jiras. Each
> > of the two above-mentioned Jiras had an original fix Jira and a cherry-pick
> > Jira for the release branch, resulting in an increase of 4 fixes over RC0.
> >
> https://issues.apache.org/jira/browse/HDDS-5893?jql=project%20%3D%20HDDS%20AND%20status%20in%20(Resolved%2C%20Closed)%20AND%20fixVersion%20%3D%201.2.0
> >
> > The RC1 tag can be found on Github at:
> > https://github.com/apache/ozone/releases/tag/ozone-1.2.0-RC1
> >
> > The source and binary tarballs can be found at:
> > https://dist.apache.org/repos/dist/dev/ozone/1.2.0-rc1/
> >
> > Maven artifacts are staged at:
> > https://repository.apache.org/content/repositories/orgapacheozone-1002/
> >
> > The public key used to sign the artifacts can be found at:
> > https://dist.apache.org/repos/dist/dev/ozone/KEYS
> >
> > The fingerprint of the key used to sign the artifacts is:
> > 00D357B731EBDD66D7CA91B39146CE06B541B7E7
> >
> > Thanks,
> > Ethan Rose
>


[VOTE] Apache Ozone 1.2.0 RC2

2021-11-10 Thread Ethan Rose
Hello all,

The issue with `ozone version` output has been fixed and I would like to
call a vote for RC2.

473 Jiras were resolved as part of this release. This is one more than RC1
because the Jira for HDDS-5062 had an incorrect fix version which has been
corrected.
https://issues.apache.org/jira/browse/HDDS-5893?jql=project%20%3D%20HDDS%20AND%20status%20in%20(Resolved%2C%20Closed)%20AND%20fixVersion%20%3D%201.2.0

The RC2 tag can be found on Github at:
https://github.com/apache/ozone/releases/tag/ozone-1.2.0-RC2

The source and binary tarballs can be found at:
https://dist.apache.org/repos/dist/dev/ozone/1.2.0-rc2/

Maven artifacts are staged at:
https://repository.apache.org/content/repositories/orgapacheozone-1003/

The public key used to sign the artifacts can be found at:
https://dist.apache.org/repos/dist/dev/ozone/KEYS

The fingerprint of the key used to sign the artifacts is:
00D357B731EBDD66D7CA91B39146CE06B541B7E7

Thanks,
Ethan Rose
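
The check that caught the RC1 problem can be sketched as a small parser over the `ozone version` output. This is an illustrative sketch only: the function names are hypothetical, and the sample strings are the excerpts quoted in the RC1 thread, not the full output of the command.

```python
import re

EXPECTED_REPO = "https://github.com/apache/ozone.git"

# Matches "Source code repository <url> -r <40-hex-digit revision>".
_REPO_LINE = re.compile(r"Source code repository\s+(\S+)\s+-r\s+([0-9a-f]{40})")


def source_repository(version_output: str):
    """Return (repo_url, revision) parsed from the output, or None."""
    match = _REPO_LINE.search(version_output)
    return (match.group(1), match.group(2)) if match else None


def built_from_apache_repo(version_output: str) -> bool:
    """True when the build metadata points at the Apache repository."""
    parsed = source_repository(version_output)
    return parsed is not None and parsed[0] == EXPECTED_REPO


rc1_output = (
    "Source code repository https://github.com/errose28/hadoop-ozone.git -r "
    "9ee6f1872dca8469057d3c7bf880931c0e7b7f3e\n"
    "Compiled by ethanrose on 2021-11-04T16:49Z\n"
)
previous_release_output = (
    "Source code repository https://github.com/apache/ozone.git -r "
    "f2406c4b4eb2b1295d06ec379cfe684ed9a16637\n"
    "Compiled by ppogde on 2021-04-09T04:10Z\n"
)

print(built_from_apache_repo(rc1_output))
print(built_from_apache_repo(previous_release_output))
```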


Re: [VOTE] Apache Ozone 1.2.0 RC2

2021-11-17 Thread Ethan Rose
Hi all,

The voting has ended with:

5 binding +1 (including me)
5 non-binding +1
0 -1
0 0

With the above tally, the vote on Ozone 1.2.0 RC2 has passed.

Thank you all for your efforts in verifying and voting.

I will proceed with releasing the artifacts and announce the release after
that.
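
The tally can be checked against the usual Apache release-vote rule, which requires at least three binding +1 votes and more binding +1 than binding -1 votes. The sketch below is illustrative: the `Tally` class and its field names are hypothetical, and it treats every -1 as binding because the email does not break them down.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tally:
    binding_plus_one: int
    non_binding_plus_one: int
    minus_one: int  # assumed binding; the email gives no breakdown
    zero: int

    def passes(self) -> bool:
        """At least three binding +1 votes, and more +1 than -1."""
        return self.binding_plus_one >= 3 and self.binding_plus_one > self.minus_one


rc2 = Tally(binding_plus_one=5, non_binding_plus_one=5, minus_one=0, zero=0)
print(rc2.passes())  # True
```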

On Tue, Nov 16, 2021 at 4:00 PM Neil Joshi  wrote:

> +1
>
> Thanks Ethan.
>
> Regards,
> Neil
>
> On Wed, Nov 10, 2021 at 11:38 AM Ethan Rose 
> wrote:
>
> > Hello all,
> >
> > The issue with `ozone version` output has been fixed and I would like to
> > call a vote for RC2.
> >
> > 473 Jiras were resolved as part of this release. This is one more than RC1
> > because the Jira for HDDS-5062 had an incorrect fix version which has been
> > corrected.
> >
> >
> https://issues.apache.org/jira/browse/HDDS-5893?jql=project%20%3D%20HDDS%20AND%20status%20in%20(Resolved%2C%20Closed)%20AND%20fixVersion%20%3D%201.2.0
> >
> > The RC2 tag can be found on Github at:
> > https://github.com/apache/ozone/releases/tag/ozone-1.2.0-RC2
> >
> > The source and binary tarballs can be found at:
> > https://dist.apache.org/repos/dist/dev/ozone/1.2.0-rc2/
> >
> > Maven artifacts are staged at:
> > https://repository.apache.org/content/repositories/orgapacheozone-1003/
> >
> > The public key used to sign the artifacts can be found at:
> > https://dist.apache.org/repos/dist/dev/ozone/KEYS
> >
> > The fingerprint of the key used to sign the artifacts is:
> > 00D357B731EBDD66D7CA91B39146CE06B541B7E7
> >
> > Thanks,
> > Ethan Rose
> >
>
>
> --
> NJ
>


[Announce] Apache Ozone 1.2.0 Release

2021-11-18 Thread Ethan Rose
Hello all,

I am happy to announce that Apache Ozone 1.2.0 has been released.

Release details and links to downloads are on the 1.2.0 announcement page:
https://ozone.apache.org/release/1.2.0/

Download links are also available from the downloads page:
https://ozone.apache.org/downloads/

Versioned documentation is available on the documentation page:
https://ozone.apache.org/docs/

Thank you all for your contributions!

Ethan


[Announce] Apache Ozone 1.2.1 Release

2021-12-22 Thread Ethan Rose
Hello all,

I am happy to announce the release of Apache Ozone 1.2.1. This release
updates Ozone's log4j version to 2.16. We are still working to update the
website.

The source and binary tarballs can be downloaded from:
https://dlcdn.apache.org/ozone/1.2.1/
or
https://downloads.apache.org/ozone/1.2.1/

Checksum and signature files can also be downloaded from:
https://downloads.apache.org/ozone/1.2.1/

Apologies for the delay in updating the website; we hope to have the links
above available there soon as well.

Ethan


Re: [DISCUSS] 1.3.0 release

2022-04-22 Thread Ethan Rose
Hi MingChao, thanks for driving this effort.

We have a few release blockers on the upgrade side that need to be resolved
before we can release:
- SCM HA finalization (I have begun working on this): HDDS-5141
- Onboarding FSO into the upgrade framework (this can start now that EC has
been onboarded): HDDS-6040
- Client cross compatibility for FSO (I'm not sure if we have a separate
Jira for this or are planning on using HDDS-6040 for this as well)

We should also clarify the state of the container balancer before we
release 1.3.0. Work in that area has been taking place on the master
branch, so the release will contain it. We need to document whether this is
stable enough to use or should be considered beta quality.

On Fri, Apr 22, 2022 at 5:35 AM guimark  wrote:

> I think we could mark EC as a [tech preview] feature clearly in this 1.3.0
> release.
>
> And we could release EC as a completed feature in the next release if
> possible.
>
> At 2022-04-22 20:26:07, "Kota Uenishi"  wrote:
> >I would welcome a 1.3 release even without EC available. This is because
> >it has a lot of other features and fixes I need in our system. For example,
> >HDDS-5881, HDDS-5461, HDDS-5656, HDDS-5975, HDDS-6321 and such.
> >Delivering them would be very valuable.
> >RocksDB crashes are also happening intermittently in our cluster, so I
> >am also betting some of my pennies on updating RocksDB to 7.0.4 with some
> >hope.
> >
> >EC is not in the roadmap planned after the 1.2 release, either [1]. But I
> >agree that it can be de-emphasized, and EC readiness can be announced
> >in some later version like 1.4.0.
> >
> >[1] https://cwiki.apache.org/confluence/display/OZONE/Ozone+Roadmap
> >
> >On Fri, Apr 22, 2022 at 8:17 PM Kaijie Chen 
> wrote:
> >>
> >> I think 1.2.2 sounds like a bug fix version. If we are going to release
> >> a new feature version, 1.3.0 would be the proper name.
> >>
> >> On Friday, 22 April 2022 19:13:18 +0800 captain...@apache.org wrote:
> >>
> >> Thanks @Stephen for your feedback.
> >>
> >> Maybe we can de-emphasize EC in this version. Completing EC recovery
> >> will take until the second half of the year.
> >> It's been a while since the last release. Since then, FSO, S3 Gateway,
> >> OM, and Container Balancer have received some optimizations and bug
> >> fixes. We could even release a small version this time, like 1.2.2, and
> >> release 1.3 next time.
> >>
> >> On Fri, Apr 22, 2022 at 5:43 PM Stephen O'Donnell
> >>  wrote:
> >>
> >> > My feeling is that it may be worth waiting until the recovery side of EC is
> >> > working before releasing. As it stands, EC is not in a usable form - the
> >> > feature is half done. If we release 1.3.0 now, we cannot state EC is
> >> > available in it.
> >> >
> >> > On Fri, Apr 22, 2022 at 9:43 AM mingchao zhao 
> >> > wrote:
> >> >
> >> > > Dear all,
> >> > >
> >> > > It has been a few months since the Ozone 1.2.0 & 1.2.1 releases, and we
> >> > > have had a number of new features (EC), some optimizations (S3 Gateway),
> >> > > and some bug fixes for 1.2 merged to master since then. I think this
> >> > > would be a good time to work on the Ozone 1.3.0 release. Also, I
> >> > > volunteer to take up the role of the Release Manager for this release.
> >> > >
> >> > > Please let me know what you think.
> >> > >
> >> > >
> >> > > --
> >> > > Thanks & Regards,
> >> > > MingChao
> >> > >
> >> >
> >
> >
> >
> >--
> >--
> >Kota UENISHI, Engineer
> >
>


Re: [DISCUSS] 1.3.0 release

2022-04-22 Thread Ethan Rose
Also a reminder that there is still a merge vote for HDDS-4440 ongoing as
well. It may be best to resolve that before discussing a release.

On Fri, Apr 22, 2022 at 7:32 AM Ethan Rose  wrote:

> Hi MingChao, thanks for driving this effort.
>
> We have a few release blockers on the upgrade side that need to be
> resolved before we can release:
> - SCM HA finalization (I have begun working on this): HDDS-5141
> - Onboarding FSO into the upgrade framework (this can start now that EC
> has been onboarded): HDDS-6040
> - Client cross compatibility for FSO (I'm not sure if we have a separate
> Jira for this or are planning on using HDDS-6040 for this as well)
>
> We should also clarify the state of the container balancer before we
> release 1.3.0. Work in that area has been taking place on the master
> branch, so the release will contain it. We need to document whether this is
> stable enough to use or should be considered beta quality.
>
> On Fri, Apr 22, 2022 at 5:35 AM guimark  wrote:
>
>> I think we could mark EC as a [tech preview] feature clearly in this
>> 1.3.0 release.
>>
>> And we could release EC as a completed feature in the next release if
>> possible.
>>
>> At 2022-04-22 20:26:07, "Kota Uenishi"  wrote:
>> >I would welcome a 1.3 release even without EC available. This is because
>> >it has a lot of other features and fixes I need in our system. For example,
>> >HDDS-5881, HDDS-5461, HDDS-5656, HDDS-5975, HDDS-6321 and such.
>> >Delivering them would be very valuable.
>> >RocksDB crashes are also happening intermittently in our cluster, so I
>> >am also betting some of my pennies on updating RocksDB to 7.0.4 with some
>> >hope.
>> >
>> >EC is not in the roadmap planned after the 1.2 release, either [1]. But I
>> >agree that it can be de-emphasized, and EC readiness can be announced
>> >in some later version like 1.4.0.
>> >
>> >[1] https://cwiki.apache.org/confluence/display/OZONE/Ozone+Roadmap
>> >
>> >On Fri, Apr 22, 2022 at 8:17 PM Kaijie Chen 
>> wrote:
>> >>
>> >> I think 1.2.2 sounds like a bug fix version. If we are going to release
>> >> a new feature version, 1.3.0 would be the proper name.
>> >>
>> >> On Friday, 22 April 2022 19:13:18 +0800 captain...@apache.org wrote:
>> >>
>> >> Thanks @Stephen for your feedback.
>> >>
>> >> Maybe we can de-emphasize EC in this version. Completing EC recovery
>> >> will take until the second half of the year.
>> >> It's been a while since the last release. Since then, FSO, S3 Gateway,
>> >> OM, and Container Balancer have received some optimizations and bug
>> >> fixes. We could even release a small version this time, like 1.2.2, and
>> >> release 1.3 next time.
>> >>
>> >> On Fri, Apr 22, 2022 at 5:43 PM Stephen O'Donnell
>> >>  wrote:
>> >>
>> >> > My feeling is that it may be worth waiting until the recovery side of EC is
>> >> > working before releasing. As it stands, EC is not in a usable form - the
>> >> > feature is half done. If we release 1.3.0 now, we cannot state EC is
>> >> > available in it.
>> >> >
>> >> > On Fri, Apr 22, 2022 at 9:43 AM mingchao zhao
>> >> > wrote:
>> >> >
>> >> > > Dear all,
>> >> > >
>> >> > > It has been a few months since the Ozone 1.2.0 & 1.2.1 releases, and we
>> >> > > have had a number of new features (EC), some optimizations (S3 Gateway),
>> >> > > and some bug fixes for 1.2 merged to master since then. I think this
>> >> > > would be a good time to work on the Ozone 1.3.0 release. Also, I
>> >> > > volunteer to take up the role of the Release Manager for this release.
>> >> > >
>> >> > > Please let me know what you think.
>> >> > >
>> >> > >
>> >> > > --
>> >> > > Thanks & Regards,
>> >> > > MingChao
>> >> > >
>> >> >
>> >
>> >
>> >
>> >--
>> >--
>> >Kota UENISHI, Engineer
>> >
>>
>


Re: [VOTE] Merge Ozone S3 Multi-Tenancy feature branch (HDDS-4944) into master

2022-05-26 Thread Ethan Rose
+1

On Thu, May 26, 2022 at 2:06 PM Prashant Pogde 
wrote:

> +1
>
> > On May 23, 2022, at 8:40 PM, Siyao Meng  wrote:
> >
> > Dear Ozone devs,
> >
> >I am starting this vote thread for merging the Ozone S3 Multi-Tenancy
> > feature branch (HDDS-4944) to the master branch.
> >
> >  S3 multi-tenancy allows multiple S3-accessible volumes to be created.
> > Each volume can be managed separately by its own tenant admins via CLI
> > for tenant creation and user operations. Before S3 Multi-Tenancy, all S3
> > access to Ozone (via S3 Gateway) is confined to a single designated S3
> > volume (s3v volume by default).
> >
> >  The feature has been in development for about 14 months now. Currently,
> > functions like tenant creation/deletion (along with volume and bucket
> > Ranger policy creation), user assign/revoke, tenant admin assign/revoke,
> > the Ranger policies and roles synchronization background thread, and the
> > global config key to enable the S3 Multi-Tenancy feature (disabled by
> > default) are implemented and tested. Documentation has been added as well.
> >
> >  The S3 multi-tenancy feature umbrella JIRA is HDDS-4944.
> >
> >  We are very close to finishing the final patch (HDDS-6701, in a week) that
> > we deem necessary before merging this feature to the master branch.
> >
> >  For more information (feature overview, Docker dev and production setup
> > guide, CLI guide and access control guide), please check out the S3
> > multi-tenancy feature wiki page here:
> >
> https://cwiki.apache.org/confluence/display/OZONE/S3+Multi-Tenancy+%28HDDS-4944%29+Merge+Checklist
> >
> >
> > Thanks!
> > Siyao
>
>


Re: [Internet][DISCUSS] 1.3.0 release

2022-07-27 Thread Ethan Rose
+1. I think we should fix the RocksDB memory leaks so we can increase to
version 7 in this release, but those can be cherry-picked onto the branch.

On Tue, Jul 26, 2022 at 7:48 PM baoloongmao(毛宝龙) 
wrote:

> +1, thank you.
>
> -- Original message --
>
> mingchao zhao wrote at 10:43 AM:
>
>
> Dear all,
>
> It's been a long time since we talked about the release on email and Slack. So
> far I see that the blocking tasks listed before are almost complete.
>   Here are the blocking tasks we discussed before:
>   https://the-asf.slack.com/archives/C5RK7PWA1/p1655845645903989?thread_ts=1655801662.600279&cid=C5RK7PWA1
>   Here is the 1.3.0 Release wiki page.
> I think this would be a good time to start work on the Ozone 1.3.0 release.
> If you agree, I will cut a branch and start the process. There may still be
> small tasks, which we can cherry-pick to the release branch when they are
> finished.
>
> Please let me know what you think.
>
>
> --
> Thanks & Regards,
> MingChao


Re: [RFC] Proposal: Reserve Space for Allocated Blocks

2022-09-08 Thread Ethan Rose
I believe the flow is:
1. Datanode notices the container is near full.
2. Datanode sends close container action to SCM on its next heartbeat.
3. SCM closes the container and sends a close container command on the
heartbeat response.
4. Datanodes get the response and close the container. If it is a Ratis
container, the leader will send the close via Ratis.

There is a "grace period" of sorts between steps 1 and 2, but this does not
help the situation because SCM does not stop issuing blocks to this
container until after step 3. Perhaps some amount of pause between steps 3
and 4 would help, either on the SCM or datanode side. This would provide a
"grace period" between when SCM stops allocating blocks for the container
and when the container is actually closed. I'm not sure exactly how this
would be implemented in the code given the current setup, but it seems like
a simple option we should try before other more complicated solutions.

Ethan
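
The effect of a pause between steps 3 and 4 can be illustrated with a toy model. This is a sketch under stated assumptions, not Ozone code: the `Container` class, its `CLOSING` state, and the abstract "tick" (one heartbeat interval) are all hypothetical, and the model only captures that SCM stops allocating blocks at step 3 while the datanode keeps accepting writes until step 4.

```python
from enum import Enum


class State(Enum):
    OPEN = "OPEN"
    CLOSING = "CLOSING"  # SCM no longer allocates blocks (after step 3)
    CLOSED = "CLOSED"    # datanode rejects writes (after step 4)


class Container:
    def __init__(self, grace_ticks: int):
        self.state = State.OPEN
        self.grace_ticks = grace_ticks
        self._ticks_left = 0

    def allocate_block(self) -> bool:
        """SCM side: blocks are only issued for open containers."""
        return self.state is State.OPEN

    def scm_close(self) -> None:
        """Step 3: SCM stops allocating and starts the grace period."""
        if self.state is State.OPEN:
            self.state = State.CLOSING
            self._ticks_left = self.grace_ticks

    def tick(self) -> None:
        """One interval passes; step 4 happens once the grace period is over."""
        if self.state is State.CLOSING:
            if self._ticks_left == 0:
                self.state = State.CLOSED
            else:
                self._ticks_left -= 1

    def write(self) -> bool:
        """Datanode side: writes succeed until the container is closed."""
        return self.state is not State.CLOSED


def late_write_succeeds(grace_ticks: int, delay_ticks: int) -> bool:
    """A block is issued just before step 3; the write arrives delay_ticks later."""
    container = Container(grace_ticks)
    assert container.allocate_block()
    container.scm_close()
    for _ in range(delay_ticks):
        container.tick()
    return container.write()


print(late_write_succeeds(grace_ticks=0, delay_ticks=1))  # False
print(late_write_succeeds(grace_ticks=2, delay_ticks=1))  # True
```

In this model a write that arrives within the grace period still lands, while a write delayed beyond it fails as before, so the pause narrows the window rather than closing it.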

On Thu, Sep 8, 2022 at 4:04 AM Kaijie Chen  wrote:

>  > Are you seeing this for Ratis writes or only EC? Have you changed the EC
>  > pipeline limit to a higher value than 5? I wonder if a lesser number of
>  > open write pipelines could contribute to this problem too.
>
> This exception is reproducible in both RATIS and EC.
>
> https://paste.ubuntu.com/p/NjpQ64PYfR/plain/
>
> EC pipeline limit was set to 30 in the previous email.
> Increasing the pipeline limit will help, but it doesn't solve the root of
> the problem.
>


Re: [VOTE] Apache Ozone 1.3.0 RC0

2022-11-15 Thread Ethan Rose
We recently had a PR <https://github.com/apache/ozone/pull/3961> merged
that fixes SCM HA finalization using the incorrect RocksDB table to write
information. Since this is a new feature in 1.3.0, it seems like we should
cherry-pick this to the release branch and make an RC1. MingChao and
others, what do you think?

Ethan

On Mon, Nov 14, 2022 at 5:55 AM mingchao zhao  wrote:

> Thanks Kaijie for your confirmation.
> Previously, we had upgraded Ratis mainly to solve the gRPC memory leak.
> Ratis 2.4.1 does have some improvements,
> but it doesn't seem to have any bug fixes that are necessary for Ozone.
> (Please let me know if there is any patch we need.)
>
> So we can just upgrade it in the master branch. This way we will not block
> the release of Ozone 1.3.0 on Ratis.
>
> On Mon, Nov 14, 2022 at 8:45 PM Kaijie Chen  wrote:
>
> > +1, Thanks mingchao for the work.
> >
> > Verified:
> > 1. Checksum and signature match.
> > 2. Successfully built from source.
> > 3. Commit hash of the binary matches.
> > 4. Checked LICENSE and NOTICE.
> > 5. Successfully deployed on a cluster.
> > 6. Ran basic tests with freon.
> >
> > PS: Should we wait for Ratis 2.4.1 release?
> >
> > Kaijie
> >
> >   On Mon, 14 Nov 2022 19:54:43 +0800  mingchao zhao  wrote ---
> >  > Hello all,
> >  >
> >  > As discussed earlier, I am calling for a vote on Apache Ozone 1.3.0
> RC0.
> >  >
> >  > 735 Jiras were resolved as part of this release:
> >  >
> >
> https://issues.apache.org/jira/issues/?jql=project%20%3D%20HDDS%20AND%20status%20in%20(Resolved%2C%20Closed)%20AND%20fixVersion%20%3D%201.3.0
> >  >
> >  > The RC0 tag can be found on Github at:
> >  > https://github.com/apache/ozone/releases/tag/ozone-1.3.0-RC0
> >  >
> >  > The source and binary tarballs can be found at:
> >  > https://dist.apache.org/repos/dist/dev/ozone/1.3.0-rc0/
> >  >
> >  > Maven artifacts are staged at:
> >  >
> https://repository.apache.org/content/repositories/orgapacheozone-1010/
> >  >
> >  > The public key used to sign the artifacts can be found at:
> >  > https://dist.apache.org/repos/dist/dev/ozone/KEYS
> >  >
> >  > The fingerprint of the key used to sign the artifacts is:
> >  > 579BA5230BBD258030A909601E3600CAA819185F
> >  >
> >  > The vote will run for 7 days, ending on Nov 21st 2022 at 19:52 UTC+8.
> >  >
> >  > --
> >  > Thanks
> >  > MingChao
> >  >
> >
>
> --
> Thanks
> MingChao
>


Re: [VOTE] Apache Ozone 1.3.0 RC0

2022-11-16 Thread Ethan Rose
Hi all, sorry to crash this thread again, but I found another issue we may
want to look into for 1.3.0.  It turns out the default bucket layout type
is set to LEGACY, which means that most new ozone clusters will continue to
create legacy buckets if they are not aware bucket layouts have been added.
Additionally, if mkdir via ofs is used on a pre-finalized cluster and it
requires creating a bucket, this will fail because the default value
of ozone.client.fs.default.bucket.layout is set to FSO, but the
pre-finalized ozone cluster will not allow FSO buckets. The PR
<https://github.com/apache/ozone/pull/3966> to fix both of these things is
currently underway, but since it is basically changing the default bucket
type for all the tests, there are numerous failures I am still working
through.

Please let me know your thoughts on the severity of this issue and whether
it should be a blocker or not.

Ethan
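
The mismatch between the two defaults can be made concrete with a toy model. This is a hedged sketch, not the Ozone implementation: `create_bucket`, `ofs_mkdir_creating_bucket`, and `BucketLayoutNotAllowed` are hypothetical names, and the only behavior modeled is what this email describes (a LEGACY default for bucket creation, an FSO default for `ozone.client.fs.default.bucket.layout`, and a pre-finalized cluster rejecting FSO buckets).

```python
from typing import Optional

# Defaults as described in the email above.
DEFAULT_BUCKET_LAYOUT = "LEGACY"
CLIENT_FS_DEFAULT_LAYOUT = "FSO"  # ozone.client.fs.default.bucket.layout


class BucketLayoutNotAllowed(Exception):
    """Raised when a pre-finalized cluster is asked for an FSO bucket."""


def create_bucket(requested_layout: Optional[str], cluster_finalized: bool) -> str:
    """Return the layout the new bucket would get, or raise if not allowed."""
    layout = requested_layout if requested_layout is not None else DEFAULT_BUCKET_LAYOUT
    if layout == "FSO" and not cluster_finalized:
        raise BucketLayoutNotAllowed("FSO buckets require a finalized cluster")
    return layout


def ofs_mkdir_creating_bucket(cluster_finalized: bool) -> str:
    """mkdir via ofs that has to create the bucket uses the client-side default."""
    return create_bucket(CLIENT_FS_DEFAULT_LAYOUT, cluster_finalized)


# A user unaware of bucket layouts keeps getting legacy buckets.
print(create_bucket(None, cluster_finalized=True))  # LEGACY

# mkdir via ofs on a pre-finalized cluster fails.
try:
    ofs_mkdir_creating_bucket(cluster_finalized=False)
except BucketLayoutNotAllowed as err:
    print(f"mkdir failed: {err}")
```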

On Wed, Nov 16, 2022 at 5:04 AM mingchao zhao  wrote:

> Thanks Ethan and Sammi for the tip.
> I will cherry pick this patch to ozone-1.3 and start a new rc.
>
> On Wed, Nov 16, 2022 at 10:29 AM Sammi Chen  wrote:
>
> > Agree.
> >
> > https://github.com/apache/ozone/pull/3961 fixed a severe problem.
> >
> > We'd better have it in 1.3.0 release.
> >
> > Bests,
> > Sammi
> >
> > On Wed, 16 Nov 2022 at 02:42, Ethan Rose 
> > wrote:
> >
> > > We recently had a PR <https://github.com/apache/ozone/pull/3961>
> merged
> > > that fixes SCM HA finalization using the incorrect RocksDB table to
> write
> > > information. Since this is a new feature in 1.3.0 it seems like we
> should
> > > cherry pick this to the release branch and make an RC1. MingChao and
> > > others, what do you think?
> > >
> > > Ethan
> > >
> > > On Mon, Nov 14, 2022 at 5:55 AM mingchao zhao 
> > > wrote:
> > >
> > > > > Thanks Kaijie for your confirmation.
> > > > > Previously, we had upgraded Ratis mainly to solve the gRPC memory
> > > > > leak. Ratis 2.4.1 does have some improvements,
> > > > > but it doesn't seem to have any bug fixes that are necessary for
> > > > > Ozone. (Please let me know if there is any patch we need.)
> > > > >
> > > > > So we can just upgrade it in the master branch. This way we will not
> > > > > block the release of Ozone 1.3.0 on Ratis.
> > > >
> > > > On Mon, Nov 14, 2022 at 8:45 PM Kaijie Chen  wrote:
> > > >
> > > > > +1, Thanks mingchao for the work.
> > > > >
> > > > > Verified:
> > > > > 1. Checksum and signature matches.
> > > > > 2. Successfully built from source.
> > > > > 3. Commit hash of the binary matches.
> > > > > 4. Checked LICENSE and NOTICE.
> > > > > 5. Successfully deployed on a cluster.
> > > > > 6. Run basic tests with freon.
> > > > >
> > > > > PS: Should we wait for Ratis 2.4.1 release?
> > > > >
> > > > > Kaijie
> > > > >
> > > > >   On Mon, 14 Nov 2022 19:54:43 +0800  mingchao zhao  wrote ---
> > > > >  > Hello all,
> > > > >  >
> > > > >  > As discussed earlier, I am calling for a vote on Apache Ozone
> > 1.3.0
> > > > RC0.
> > > > >  >
> > > > >  > 735 Jiras were resolved as part of this release:
> > > > >  >
> > > > >
> > > >
> > >
> >
> https://issues.apache.org/jira/issues/?jql=project%20%3D%20HDDS%20AND%20status%20in%20(Resolved%2C%20Closed)%20AND%20fixVersion%20%3D%201.3.0
> > > > >  >
> > > > >  > The RC0 tag can be found on Github at:
> > > > >  > https://github.com/apache/ozone/releases/tag/ozone-1.3.0-RC0
> > > > >  >
> > > > >  > The source and binary tarballs can be found at:
> > > > >  > https://dist.apache.org/repos/dist/dev/ozone/1.3.0-rc0/
> > > > >  >
> > > > >  > Maven artifacts are staged at:
> > > > >  >
> > > >
> > https://repository.apache.org/content/repositories/orgapacheozone-1010/
> > > > >  >
> > > > >  > The public key used to sign the artifacts can be found at:
> > > > >  > https://dist.apache.org/repos/dist/dev/ozone/KEYS
> > > > >  >
> > > > >  > The fingerprint of the key used to sign the artifacts is:
> > > > >  > 579BA5230BBD258030A909601E3600CAA819185F
> > > > >  >
> > > > > >  > The vote will run for 7 days, ending on Nov 21st 2022 at 19:52
> > > > > >  > UTC+8.
> > > > >  >
> > > > >  > --
> > > > >  > Thanks
> > > > >  > MingChao
> > > > >  >
> > > > >
> > > > >
> -
> > > > > To unsubscribe, e-mail: dev-unsubscr...@ozone.apache.org
> > > > > For additional commands, e-mail: dev-h...@ozone.apache.org
> > > > >
> > > > >
> > > >
> > > > --
> > > > Thanks
> > > > MingChao
> > > >
> > >
> >
>
>
> --
> Thanks
> MingChao
>


Re: Past Ozone Events

2022-12-16 Thread Ethan Rose
Thanks for adding this, Wei-Chiu. This is a good thing to have as we start
giving more Ozone talks.

On Fri, Dec 16, 2022 at 3:58 PM Wei-Chiu Chuang  wrote:

> Hi community,
>
> I opened a PR  (https://github.com/apache/ozone-site/pull/29) to add a web
> page to link past Ozone events and conference talks.
>
> Let me know if you have given a talk in the past. I can help add to the
> page. Best if you have links to recordings or slide deck.
>
> This is the rendered output so far:
>
> Ozone Events
>
> Past Meetups
>
> The first ever Apache Ozone User Group Summit was held at Cloudera’s
> headquarters on Nov 10, 2022. This event was live streamed on LinkedIn and
> YouTube.
> Meetup link <https://www.meetup.com/futureofdata-siliconvalley/events/289201001/>
> YouTube recording <https://www.youtube.com/watch?v=3aEpeSXMMzw>
>
> The agenda was as follows:
>
> Bucket types and FSO improvements
>
> Speaker: Ethan Rose
>
> Recent release of the Cloudera Data Platform 7.1.8 shipped two key feature
> improvements for Apache Ozone - ability to create specialized buckets:
> object store (OBS) and file system optimized (FSO). The special FSO-enabled
> bucket now supports atomic renames on file system objects; required for
> data warehousing workloads like Apache Hive queries to perform efficiently
> and correctly on underlying storage. Come join us to learn more about how
> Ozone can now support diverse workloads in cloud-native environment: single
> storage system with S3 as well as hierarchical file-system capabilities.
>
> Apache Ozone snapshots - new design
>
> Speakers: Prashant Pogde, Siyao Meng
>
> Apache Ozone snapshots is a critical innovation currently driven by
> Cloudera with the active Apache Ozone community. This talk covers the early
> design goals, architecture of snapshots and results from an early POC.
> Ozone aims to uniquely provide an object storage solution that enables a
> consistent point in time view of the namespace with instantaneous snapshot
> capability and very efficient linear time snapshot diff feature to find out
> what has changed in your system in between snapshots. Come join us to learn
> more about how we propose to achieve this.
>
> Apache Ozone Performance
>
> Speaker: Ritesh Shukla
>
> Apache Ozone is a modern object storage that uniquely supports a native S3
> interface as well as a Hadoop compatible file system interface. Ozone’s
> architecture is designed to meet the high performance requirements of
> diverse workloads while being able to scale to billions of objects and 100s
> of petabytes of dense distributed storage nodes. The Apache Ozone community
> invested a significant amount of time to improve performance, both
> throughput and latency for metadata and data. Moreover, Apache Ozone is
> built to take advantage of modern storage innovations like NVMe. This talk
> will provide insights into such improvements while sharing test results for
> well known benchmarks like TPC-DS.
>
> Past Conference Talks
>
> 2022
>
>- ApacheCon North America 2022: Reduce Your Storage Footprint with
>Apache Ozone Erasure Coding. Uma Maheswara Rao Gangumalla.
>- ApacheCon North America 2022: Inside an Apache Ozone Upgrade. Ethan
>Rose.
>- ApacheCon North America 2022: Apache Ozone - State of the Union. Siyao
>Meng, Ethan Rose.
>- ApacheCon North America 2022: Performance of Apache Ozone on NVMe.
>Wei-Chiu Chuang, Ritesh Shukla.
>- China Apache Hadoop Meetup 2022: What’s new in Apache Ozone 1.3. Sammi
>Chen.
>- ApacheCon Asia 2022: Sharing Of Recent Progress And Practices In
>Apache Ozone <https://www.youtube.com/watch?v=SB4lgATn-s8>. Yan Liu,
>Sammi Chen.
>- ApacheCon Asia 2022: Disaster Recovery In Apache Ozone
><https://www.youtube.com/watch?v=E97fYFZJ2LQ>. Sadanand Shenoy, Rakesh
>Radhakrishnan.
>- ApacheCon Asia 2022: Apache Ozone Behind Simulation And Ai Industries
><https://www.youtube.com/watch?v=EmpHluBOesg>. Kota Uenishi.
>- ApacheCon Asia 2022: Apache Ozone: Multi-Protocol Aware System Handles
>Both Files And Objects Efficiently
><https://www.youtube.com/watch?v=HN7PWX9TdAE>. Radhakrishnan Rakesh,
>Singh Mukul Kumar.
>- SDC India 2022: Apache Ozone: Multi-protocol aware system handles both
>files and objects efficiently
><https://www.youtube.com/watch?v=lzPrL2I_2VU>. Radhakrishnan Rakesh,
>Singh Mukul Kumar.
>
> 2021
>
>- ApacheCon@Home 2021: Secure Apache Ozone with High Availability
><https://www.youtube.com/watch?v=tGEjS4lSbbY>. Bharat Viswanadham,
>Xiaoyu Yao.
>- ApacheCon@Home 2021: Apache Ozone - State of the Union
><https://www.youtub

Re: [DISCUSS] Proposal to merge HDDS-6517 Ozone Snapshots

2023-01-11 Thread Ethan Rose
Hey Prashant, thanks for all the work on this feature. I think the
confluence page needs a few updates for completeness.
>
> *2. documentation*
>
> Documentation can be found under the Attachments section in the umbrella
> jira: HDDS-6517  - Snapshot
> support for Ozone OPEN
>

This bullet refers to user facing documentation to be added to
https://ozone.apache.org/docs/current/. It is not the same as "3. design,
attached the docs". If the docs are not done yet, I think a link to an open
jira would be sufficient.

> *9. possible incompatible changes/used feature flag: *
>
> There should not be any incompatible changes introduced with this feature.
>
> An enable/disable switch is not required. Snapshot feature has its
> independent set of CLIs that need not be exercised.
>

I'm not sure what difference the merge template intends between items 9
and 4. Anyway, it looks like snapshots will need a layout feature to
make sure it is not used before Ozone has been finalized. Again, I think
linking to an open jira would be sufficient here. If it can be done without
a layout feature somehow, an explanation in this part of the page would be
good.

With these minor updates I think the merge would be good to go.

Ethan


On Tue, Jan 10, 2023 at 2:56 PM Prashant Pogde 
wrote:

> Dear Ozone Devs,
>
> We would like to start this discussion thread for the proposal to merge
> https://issues.apache.org/jira/browse/HDDS-6517 Ozone Snapshots.
>
> Ozone Snapshot feature allows us to take a point in time frozen image of
> the Ozone dataset. Initially this feature would allow snapshots to be taken
> at bucket granularity. Subsequently the feature can support Snapshots
> at Volume level as well as the whole Ozone Namespace level.
>
> Note that HDDS-6517 is a new feature such that it is mainly adding new
> code.  Therefore, the risk is very low to the existing functionalities and
> features in Ozone.
>
> When the Snapshot feature is not used, it should cause no impact to the
> master.
> For more information, please check out Ozone Snapshot feature Jira
> (HDDS-6517)  and feature wiki page here:
>
> https://cwiki.apache.org/confluence/display/OZONE/Ozone+Snapshots
>
> If there are no objections for the merge, we could start the official vote.
>
> Thanks,
> Ozone Snapshot Team


Re: [DISCUSS] Proposal to merge HDDS-6517 Ozone Snapshots

2023-01-11 Thread Ethan Rose
Thanks for the updates Prashant. The wiki LGTM so I am +1 to proceed with a
vote.

On Wed, Jan 11, 2023 at 12:49 PM Prashant Pogde 
wrote:

> Thanks for the feedback Ethan, Uma and Sid. I have updated the template on
> the wiki with Jira tickets to address these concerns.
>
> > On Jan 11, 2023, at 11:23 AM, Uma Maheswara Rao Gangumalla <
> umaganguma...@gmail.com> wrote:
> >
> > https://issues.apache.org/jira/browse/HDDS-6851 <
> https://issues.apache.org/jira/browse/HDDS-6851>
>


Re: [DISCUSS] Proposal to merge HttpFS support in Ozone feature branch (HDDS-5447-httpfs) into master

2023-02-06 Thread Ethan Rose
+1 for merge, thanks Zita and Pifta!

- Ethan

On Mon, Feb 6, 2023 at 2:58 PM Zita Dombi  wrote:

> Hi Ozone Devs!
>
> I am starting this discussion thread for proposing to merge the HttpFS
> support in Ozone feature branch (HDDS-5447
> ) to the master branch.
>
> Ozone HttpFS is a WebHDFS compatible interface implementation, as a
> separate role it provides an easy integration with Ozone via REST API. It
> started with forking the existing HDFS HttpFS endpoint implementation,
> after that we solved some dependency issues and got rid of some of them. We
> removed the client side code of it, added handling for the (currently)
> unsupported operations. Integration tests were added for the ozone(secure)
> and ozone(secure)-ha environments.
>
> HDDS-5447 is a new feature, which added mainly new code to the project,
> there were no changes in the current behaviour. This implies that the risk
> is low that it affects or breaks something from the existing
> functionalities and features.
>
> There is one ongoing task, which is updating the module documentation and
> adding a proper page for the Ozone HttpFS, which is tracked under HDDS-5966
> .
>
> More information can be found on the wiki page:
>
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=240883091
>
> If there are no objections for the merge, we could start the official vote.
>
> Thanks,
>
> Zita and Pifta
>


Re: Cutting Ozone 1.3.1

2023-03-14 Thread Ethan Rose
+1 for a patch release with all the fixes previously mentioned. In
addition, these bugs related to FSO deletion were fixed since 1.3.0 and
should be added:
HDDS-7541
HDDS-7592

On Tue, Mar 14, 2023 at 7:53 AM Siddharth Wagle  wrote:

> +1 for 1.3.1 with critical fixes.
>
> Best,
> Sid
>
> On Tue, Mar 14, 2023 at 12:53 AM Kota Uenishi  wrote:
>
> > Hi,
> >
> > +1 (non-binding) from me.
> >
> > I'd also like to have HDDS-7755 included in the release. It's a tiny
> > fix, but it fixed a severe hang in our cluster, which was the original
> > source of HDDS-7701.
> > We already deployed it to our cluster with our internal distribution
> > by cherry-pick [1], but we'd appreciate it a lot if it's in the official
> > release.
> >
> > [1]
> >
> https://github.com/pfnet/ozone/commit/100edac98a17523cee3451a7086250e08e72ce41
> > --
> > Kota UENISHI, Engineer
> > (+81)80-9299-2656 Preferred Networks, Inc.
> >
> > On Tue, Mar 14, 2023 at 4:03 PM Kaijie Chen  wrote:
> > >
> > > +1, sounds good
> > >
> > > I would like to propose two more fixes to the list.
> > > A FS contract bug caused by HDDS-7253 is fixed by them.
> > >
> > > https://issues.apache.org/jira/browse/HDDS-7871
> > > https://issues.apache.org/jira/browse/HDDS-7991
> > >
> > > Best,
> > > Kaijie
> > >
> > >
> > >   On Tue, 14 Mar 2023 14:52:38 +0800  Ritesh Shukla  wrote ---
> > >  > Folks,
> > >  > As part of the debug for HDDS-7701, some bugs were fixed, and
> > defensive
> > >  > code changes were put in.
> > >  > There were multiple bug fixes around block deletion execution.
> > >  > In addition to that, multiple client leak fixes have been checked
> > into the
> > >  > master branch.
> > >  > I recommend cherry-picking these critical fixes to the 1.3 branches
> > and
> > >  > cutting a new release.
> > >  > Regards,
> > >  > Ritesh
> > >  >
> > >  > List:
> > >  > https://issues.apache.org/jira/browse/HDDS-8142
> > >  > https://issues.apache.org/jira/browse/HDDS-8118
> > >  > https://issues.apache.org/jira/browse/HDDS-8129
> > >  > https://issues.apache.org/jira/browse/HDDS-8108
> > >  > https://issues.apache.org/jira/browse/HDDS-8019
> > >  > https://issues.apache.org/jira/browse/HDDS-7156
> > >  > https://issues.apache.org/jira/browse/HDDS-8044
> > >  > https://issues.apache.org/jira/browse/HDDS-7091
> > >  > https://issues.apache.org/jira/browse/HDDS-7931
> > >  > https://issues.apache.org/jira/browse/HDDS-8020
> > >  > https://issues.apache.org/jira/browse/HDDS-7306
> > >  > https://issues.apache.org/jira/browse/HDDS-7126
> > >  > https://issues.apache.org/jira/browse/HDDS-7091
> > >  >
> > >
> > > -
> > > To unsubscribe, e-mail: dev-unsubscr...@ozone.apache.org
> > > For additional commands, e-mail: dev-h...@ozone.apache.org
> > >
> >
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@ozone.apache.org
> > For additional commands, e-mail: dev-h...@ozone.apache.org
> >
> >
>


Re: Proposal to enable GitHub discussions

2023-04-24 Thread Ethan Rose
+1. Ritesh, I've seen you messing around with this on your Ozone fork. If
you have it in a presentable state, could you share the link here so others
can reference an example of how we could set up the sections to organize
questions, meeting minutes, etc?

Ethan

On Sun, Apr 23, 2023 at 11:12 PM Ritesh Shukla 
wrote:

> Hello folks,
>
> Sending this across for a couple of reasons
> 1. Slack and Jira are invitation-only for now
> 2. Slack messages are not google friendly.
> GitHub allows discussions Ref:
> https://github.com/kerneltime/ozone/discussions
> I propose enabling GitHub discussions to ease the outreach to curious
> community folks who want a low-overhead means to reach out.
>
> I will discuss this in the community sync tomorrow. We can also post the
> meeting minutes and recording details in the discussions section.
>
> Regards,
> Ritesh
>


Re: [VOTE] Proposal to merge Ozone feature branch HDDS-7733-Symmetric-Tokens to master

2023-06-02 Thread Ethan Rose
+1 for the merge

Ethan

On Wed, May 31, 2023 at 7:31 PM Sumit Agrawal
 wrote:

> Hi Duong,
>
> Thanks for working on this.
>
> Binding +1
>
> Regards
> Sumit
>
> On Thu, Jun 1, 2023 at 6:10 AM Duong Nguyen  wrote:
>
> > Hi Ozone Devs,
> >
> > Following the discussion thread regarding the same topic, I'm starting
> this
> > vote thread for merging the feature branch HDDS-7733-Symmetric-Tokens to
> > master.
> > The discussion thread
> >  can
> be
> > found here.
> >
> > This feature branch contains the implementation to replace the costly
> token
> > signature generation using asymmetric (RSA) keys with symmetric key
> > algorithms, like HMAC with SHA256. Symmetric key algorithms bring a
> > much better performance and are the natural fit for Ozone token use case.
> > Yet, they require building a mechanism to generate, store, distribute,
> and
> > renew symmetric secret keys. That requirement is not trivial and has to
> be
> > split into smaller tasks that cannot be shipped individually. That is
> > the reason why the implementation of HDDS-7733
> >  happens in a separate
> > feature branch.
> >
> > More information can be found on the wiki page:
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=255070328
> >
> > Thanks,
> > Duong
> >
>
>
> --
> *Sumit Agrawal* | Senior Staff Engineer
> cloudera.com 
> --
>
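
For readers unfamiliar with the approach in the quoted proposal, here is a
minimal sketch of signing and verifying a token with HMAC-SHA256 using a
shared secret. It is not the code from the feature branch; the function names
and the shape of the token identifier are assumptions made for illustration,
and key generation, distribution, and rotation are left out entirely.

```python
import hashlib
import hmac


def sign_token(secret_key: bytes, token_id: bytes) -> bytes:
    """Return an HMAC-SHA256 signature over the (hypothetical) token identifier."""
    return hmac.new(secret_key, token_id, hashlib.sha256).digest()


def verify_token(secret_key: bytes, token_id: bytes, signature: bytes) -> bool:
    """Recompute the signature and compare it in constant time."""
    return hmac.compare_digest(sign_token(secret_key, token_id), signature)
```

Both the signer and the verifier need the same secret key, which is why the
proposal calls out the key management mechanism as the non-trivial part.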


[Proposal] Datanode container log for tracking container replica state changes

2023-06-05 Thread Ethan Rose
Hi Ozone Devs,

I am currently working on HDDS-7364 to get Ozone's container
scanner to a point where it can be enabled by default. The container
scanner will check container block data and metadata in the background to
identify corruption, mark containers unhealthy, and notify SCM so a healthy
replica can be copied.

One of the subtasks is HDDS-8062, which is to provide a
way to track why containers were marked unhealthy, and persist that
information so it can be referenced a while later. Datanode application
logs can roll too frequently for this purpose, so I propose adding a new
log to the datanode to track container replica state transitions. This log
would provide useful debugging insight not just for the scanner, but for
any other replica related issues that may originate on the datanodes. The
design doc is attached to HDDS-8062 and here is a link as well.

This will add a new log and new debugging capabilities to Ozone. Please
review and provide any feedback on this thread or the jira.

Thanks.
Ethan


Re: [DISCUSS] Next Ozone major release 1.4.0

2023-06-07 Thread Ethan Rose
+1 for a 1.4.0 release. We had hoped for a 1.3.1 release but it looks like
that never gained the momentum it needed. It would be good to hear from
devs working on larger features whether or not they would like to target
this release. Things like snapshots, cert rotation, hsync, recon
improvements etc. From my end I'm planning to have v1 of the container
scanner done by the end of this month. It would be nice to have in the next
release but not essential.

Also worth noting that doing a major release will lock in protos and disk
structures as they are on master for compatibility reasons. We have tried
to keep the master branch always releasable, but just an FYI to devs
working on larger tasks on master right now.

Ethan

On Wed, Jun 7, 2023 at 1:44 AM Sammi Chen  wrote:

> Dear Ozone Devs,
>
> It's been almost 5 months since the last major 1.3.0 release in Dec 2022.
>
> In the past 5 months, there have been a lot of issues fixed and improvements
> made, together with new features, like httpFs, scm decommission, Recon new
> functions, snapshot, EC balancer support, the upcoming symmetric block
> token and certificate rotation, etc.
>
> So far, there are already 733 JIRAs resolved on 1.4.0.
>
> https://issues.apache.org/jira/issues/?jql=project+%3D+HDDS+and+fixVersion+%3D+1.4.0
>
> Usually it will take months to do a major release. So I propose to start
> the discussion of 1.4.0 now, things like which features should be involved
> in this release.
>
> And this release doesn't have a Release Manager yet. Anyone (Committers
> or PMC members, since it requires some privileges on Apache facilities to
> do the release) is welcome to volunteer to be the RM of 1.4.0.
>
>
> Regards,
> Sammi
>


Detecting disk/volume failures in Ozone

2023-06-15 Thread Ethan Rose
Hi Ozone devs,

I am currently working on HDDS-8782 to improve the checks
that are run by Ozone's volume scanner. This is a thread that periodically
runs in the background of datanodes to check the health of volumes/disks
configured with hdds.datanode.dir and determine whether or not they have
failed. The existing checks need some improvement so I am looking for input
on a better implementation. The following aspects should be considered:
1. A health check that is too strict may fail the volume unnecessarily due
to intermittent IO failures. This would trigger alerts and replication when
they are not required.
2. A health check that is too lenient may take a long time to detect a disk
failure or miss it entirely. This leaves the data vulnerable as it will not
be replicated when it should.
3. The strictness of the check should be set with sensible defaults, but
allow configuration if required.
4. The reason for volume failure should be clearly returned for logging.

Ozone's master branch is currently using `DiskChecker#checkDir` from Hadoop
to assess disk health. This call only checks directory existence and
permissions, which can be cached, and is not a good indication of hardware
failure. There is also `DiskChecker#checkDirWithDiskIO`, which writes a
file and syncs it as part of the check. Even using this check has issues:
- In some cases booleans are used instead of exceptions, which masks the
cause of the error. Violates 4.
- Aspects like the size of the file to write back to the disk, the number
of files written, and the number of failures tolerated are not
configurable. Violates 3 and possibly 1 or 2 if the default values are not
good.
- The check does not read back the data written to check that the contents
match. Violates 2.

The code to implement such checks is simple, so I propose implementing our
own set of checks in Ozone for fine grained control. In general those
checks should probably contain at least these three aspects:
1. A check that the volume's directory exists.
2. A check that the datanode has rwx permission on the directory.
3. An IO operation consisting of the following steps:
  1. Write x bytes to a file.
  2. Sync the file to the disk.
  3. Read x bytes back.
  4. Make sure the read bytes match what was written.
  5. Delete the file.
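
The three checks listed above can be sketched as follows. This is an
illustration in Python rather than the datanode's Java, and the function name,
the temp file prefix, and the 4096 byte default are all assumptions, not
existing Ozone code or configuration.

```python
import os
import tempfile


def check_volume(volume_dir: str, num_bytes: int = 4096) -> None:
    """Raise OSError describing the failure if the volume looks unhealthy."""
    # 1. The volume's directory must exist.
    if not os.path.isdir(volume_dir):
        raise OSError(f"{volume_dir} does not exist or is not a directory")
    # 2. The process needs rwx permission on the directory.
    if not os.access(volume_dir, os.R_OK | os.W_OK | os.X_OK):
        raise OSError(f"missing rwx permission on {volume_dir}")
    # 3. Write x bytes, sync, read them back, compare, then delete the file.
    expected = os.urandom(num_bytes)
    fd, path = tempfile.mkstemp(prefix="disk-check-", dir=volume_dir)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(expected)
            f.flush()
            os.fsync(f.fileno())
        with open(path, "rb") as f:
            actual = f.read()
        if actual != expected:
            raise OSError(f"read-back mismatch on {path}")
    finally:
        os.remove(path)
```

Raising an exception with a message keeps the failure reason available for
logging. One caveat of this sketch: a read issued right after the write may be
served from the OS page cache instead of the disk.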

If either of the first two checks fails, the volume should be failed
immediately. More graceful handling of these errors is proposed in HDDS-8785,
but it is out of scope
of this current change. The third check is a bit more ambiguous. We have
the following options to adjust:
- The size of the file written.
- How many files are written as part of each volume scan.
  - One scan could read/write only one file, or it could do a few to
increase the odds of catching a problem.
- How frequently back-to-back scans of the same volume are allowed.
  - Since there is a background and on-demand volume scanner, there is a
"cool down" period between scans to prevent a volume from being repeatedly
scanned in a short period of time.
  - This is a good throttling mechanism, but if it is too high, it can slow
down failure detection if multiple scans are required to determine failure.
See the next point.
- How many failures must be encountered before failing the volume. These
failures could span multiple volume scans, or be contained in one scan
using repeated IO operations. Some options for this are:
  - Require x consecutive failures before the volume is failed. If there is
a success in between, the failure count is cleared.
  - Require a certain percent of the last x checks to fail before the
volume is unhealthy. For example, of the last 10 checks, at least 8 must
pass or the volume will be declared unhealthy.
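
The two counting options above can be sketched like this. The class names and
default thresholds are assumptions for illustration only; they do not refer to
anything that exists in Ozone.

```python
from collections import deque


class ConsecutiveFailurePolicy:
    """Fail the volume after `max_consecutive` failed checks in a row."""

    def __init__(self, max_consecutive: int = 3):
        self.max_consecutive = max_consecutive
        self.streak = 0

    def record(self, passed: bool) -> bool:
        """Record one check result; return True if the volume should be failed."""
        # Any success clears the failure count.
        self.streak = 0 if passed else self.streak + 1
        return self.streak >= self.max_consecutive


class SlidingWindowPolicy:
    """Fail the volume if fewer than `min_passes` of the last `window` checks passed."""

    def __init__(self, window: int = 10, min_passes: int = 8):
        self.window = window
        self.min_passes = min_passes
        self.results = deque(maxlen=window)  # oldest results fall off automatically

    def record(self, passed: bool) -> bool:
        """Record one check result; return True if the volume should be failed."""
        self.results.append(passed)
        # Checks that have not run yet count as passes, so a fresh volume
        # is not failed before the window fills up.
        failures = self.results.count(False)
        return failures > self.window - self.min_passes
```

With a repeating fail, fail, pass pattern, the consecutive policy with a
threshold of 3 never fails the volume, while the window policy with the
defaults above fails it on the fourth check.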

FWIW I have a draft PR out for
this that has some failure checks added, but I am not happy with them. They
currently require 3 consecutive scans to fail and leave the default volume
check gap at 15 minutes. This means you could have 66% IO failure rate and
still have a "healthy" disk. It could also take 45 minutes to determine if
there is a failure.

Disk failures are one of the primary motivations for using Ozone, so I
appreciate your insights in this key area.

Ethan


Re: Detecting disk/volume failures in Ozone

2023-06-20 Thread Ethan Rose
Hi Uma,

The datanode side checksums on write are still turned off. IO Exceptions on
the read/write path will trigger on demand container and volume/disk scans.
We could add containers to the on demand scanning queue after they are
closed for an initial scan, but this may place unnecessary burden on that
thread. Even this is still not a replacement for chunk level checksum
checks on write, since if all 3 replicas are corrupted during the write
process we cannot recover because the data is already committed. The
scanner would only identify the problem too late. For these reasons we
should work to a point where we can turn write checksums on by default.

To determine scanning priority, each container file has a timestamp stating
the last time it was scanned, which may have been never. The background
container data scanner is iterating over a list of containers sorted in
ascending order by last scanned timestamp, so those that were scanned
farthest in the past (or never scanned) will be scanned first. The iterator
is implemented as a ConcurrentSkipListMap which Java defines as "weakly
consistent", so newly added containers may not show up until the scanner
finishes its existing iteration and obtains a new iterator. This is
probably for the best since it prevents write workloads from starving bit
rot detection on older data and helps us define an upper bound on the time
to scan a volume without having to worry about disruption from ongoing
writes.
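
As a rough illustration of that ordering (in Python, not the datanode's Java
code), the sketch below sorts container IDs so that never-scanned containers
come first, followed by the ones scanned longest ago. The function name and
the use of None for "never scanned" are assumptions for the example.

```python
from typing import Dict, List, Optional


def scan_order(last_scanned: Dict[int, Optional[float]]) -> List[int]:
    """Return container IDs with never-scanned first, then oldest scan first."""

    def key(container_id: int):
        ts = last_scanned[container_id]
        # None means "never scanned" and sorts before any real timestamp.
        return (ts is not None, ts if ts is not None else 0.0, container_id)

    return sorted(last_scanned, key=key)
```

Here `sorted` takes a snapshot of the map, so containers added afterwards only
appear in the next call. That is a simpler model than a weakly consistent
iterator, but it shows the same effect of new writes not disturbing a pass
that is already in progress.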

Ethan

On Tue, Jun 20, 2023 at 11:32 AM Uma Maheswara Rao Gangumalla <
umaganguma...@gmail.com> wrote:

> Thank you Ethan for working on this important work.
> Looks like we are not enabled by default to validate the data checksums
> when writing. I am just thinking that we should validate data with
> priorities in background scanning?. Example: For files which did not get
> scanned before should be prioritized a bit aggressively compared to the
> data which was already scanned before. I know this might add some
> complexity, but let's think about this.
>
> For the other disk IO checking, if we can determine within 45 mins, that
> seems ok to me. The problem with more aggressive validations causes more IO
> and results in perf impact.
> Do we have a mechanism today to learn the disk issues from ongoing writes?
> Could IO exceptions be a trigger to validate the disk?
>
> Regards,
> Uma
>
> On Thu, Jun 15, 2023 at 3:46 PM Ethan Rose 
> wrote:
>
> > Hi Ozone devs,
> >
> > I am currently working on HDDS-8782
> > <https://issues.apache.org/jira/browse/HDDS-8782> to improve the checks
> > that are run by Ozone's volume scanner. This is a thread that
> periodically
> > runs in the background of datanodes to check the health of volumes/disks
> > configured with hdds.datanode.dir and determine whether or not they have
> > failed. The existing checks need some improvement so I am looking for
> input
> > on a better implementation. The following aspects should be considered:
> > 1. A health check that is too strict may fail the volume unnecessarily
> due
> > to intermittent IO failures. This would trigger alerts and replication
> when
> > they are not required.
> > 2. A health check that is too lenient may take a long time to detect a
> disk
> > failure or miss it entirely. This leaves the data vulnerable as it will
> not
> > be replicated when it should.
> > 3. The strictness of the check should be set with sensible defaults, but
> > allow configuration if required.
> > 4. The reason for volume failure should be clearly returned for logging.
> >
> > Ozone's master branch is currently using `DiskChecker#checkDir` from
> Hadoop
> > to assess disk health. This call only checks directory existence and
> > permissions, which can be cached, and is not a good indication of
> hardware
> > failure. There is also `DiskChecker#checkDirWithDiskIO`, which writes a
> > file and syncs it as part of the check. Even using this check has issues:
> > - In some cases booleans are used instead of exceptions, which masks the
> > cause of the error. Violates 4.
> > - Aspects like the size of the file to write back to the disk, the number
> > of files written, and the number of failures tolerated are not
> > configurable. Violates 3 and possibly 1 or 2 if the default values are
> not
> > good.
> > - The check does not read back the data written to check that the
> contents
> > match. Violates 2.
> >
> > The code to implement such checks is simple, so I propose implementing
> our
> > own set of checks in Ozone for fine grained control. In general those
> > checks should probably contain at least these three aspects:
> > 1. A check that the volume&#x

Re: Detecting disk/volume failures in Ozone

2023-06-21 Thread Ethan Rose
I totally agree. We need write checksums on by default, and I am not sure of
the historical reason they were turned off when they were added in HDDS-5623
<https://issues.apache.org/jira/browse/HDDS-5623>. We should at least test
to quantify the performance difference of on vs off before we flip the
switch though.

Write checksums are a good topic for another thread, but they are
tangential to background disk failure detection which can happen after data
has already been written.

On Wed, Jun 21, 2023 at 8:27 AM Stephen O'Donnell
 wrote:

> Why is the write checksums validation not turned on by default? I have seen
> cases on HDFS where the "verify checksums on write" feature caught data
> corruption problems caused by faulty hardware / network cables before it
> was able to propagate into the system.
>
> The only reason I can think of for not enabling them would be write
> performance, but a small speed up in write should not be preferred over
> data integrity.
>
> On Wed, Jun 21, 2023 at 12:45 AM Ethan Rose 
> wrote:
>
> > Hi Uma,
> >
> > The datanode side checksums on write are still turned off. IO Exceptions
> on
> > the read/write path will trigger on demand container and volume/disk
> scans.
> > We could add containers to the on demand scanning queue after they are
> > closed for an initial scan, but this may place unnecessary burden on that
> > thread. Even this is still not a replacement for chunk level checksum
> > checks on write, since if all 3 replicas are corrupted during the write
> > process we cannot recover because the data is already committed. The
> > scanner would only identify the problem too late. For these reasons we
> > should work to a point where we can turn write checksums on by default.
> >
> > To determine scanning priority, each container file has a timestamp
> stating
> > the last time it was scanned, which may have been never. The background
> > container data scanner is iterating over a list of containers sorted in
> > ascending order by last scanned timestamp, so those that were scanned
> > farthest in the past (or never scanned) will be scanned first. The
> iterator
> > is implemented as a ConcurrentSkipListMap which java defines as "weakly
> > consistent", so newly added containers may not show up until the scanner
> > finishes its existing iteration and obtains a new iterator. This is
> > probably for the best since it prevents write workloads from starving bit
> > rot detection on older data and helps us define an upper bound on the
> time
> > to scan a volume without having to worry about disruption from ongoing
> > writes.
> >
> > Ethan
> >
> > On Tue, Jun 20, 2023 at 11:32 AM Uma Maheswara Rao Gangumalla <
> > umaganguma...@gmail.com> wrote:
> >
> > > Thank you Ethan for working on this important work.
> > > Looks like we are not enabled by default to validate the data checksums
> > > when writing. I am just thinking that we should validate data with
> > > priorities in background scanning?. Example: For files which did not
> get
> > > scanned before should be prioritized a bit aggressively compared to the
> > > data which was already scanned before. I know this might add some
> > > complexity, but let's think about this.
> > >
> > > For the other disk IO checking, if we can determine within 45 mins,
> that
> > > seems ok to me. The problem with more aggressive validations causes
> more
> > IO
> > > and results in perf impact.
> > > Do we have a mechanism today to learn the disk issues from ongoing
> > writes?
> > > Could IO exceptions be a trigger to validate the disk?
> > >
> > > Regards,
> > > Uma
> > >
> > > On Thu, Jun 15, 2023 at 3:46 PM Ethan Rose  >
> > > wrote:
> > >
> > > > Hi Ozone devs,
> > > >
> > > > I am currently working on HDDS-8782
> > > > <https://issues.apache.org/jira/browse/HDDS-8782> to improve the
> > checks
> > > > that are run by Ozone's volume scanner. This is a thread that
> > > periodically
> > > > runs in the background of datanodes to check the health of
> > volumes/disks
> > > > configured with hdds.datanode.dir and determine whether or not they
> > have
> > > > failed. The existing checks need some improvement so I am looking for
> > > input
> > > > on a better implementation. The following aspects should be
> considered:
> > > > 1. A health check that is too strict may fail the volume
> unnecess

Re: New committer: George Bijan Jahad

2023-06-27 Thread Ethan Rose
Congratulations George! Thanks for all your work on Ozone so far.

Ethan

On Tue, Jun 27, 2023 at 4:46 PM Michel Sumbul 
wrote:

> Wouhouuu, congratulations George
>
>
> Le mer. 28 juin 2023, 00:40, Wei-Chiu Chuang  a écrit
> :
>
> > The Project Management Committee (PMC) for Apache Ozone has invited
> > George Bijan Jahad (GitHub: GeorgeJahad, JIRA: georgeJahad) to
> > become a committer and we are pleased to announce that they have
> accepted.
> >
> > George helps build the Ozone Snapshot feature. During the past year
> > George contributed and merged 23 PRs and helped review 50 PRs.
> > Other than this, he has been very vocal in various  Ozone community
> > engagements. The impact to the project and the community is
> > noticeably visible.
> >
> > Being a committer enables easier contribution to the
> > project since there is no need to go via the patch
> > submission process. This should enable better productivity.
> > A PMC member helps manage and guide the direction of the project.
> >
> > Regards,
> > Weichiu
> >
>


Re: [DISCUSS] Please limit PR size

2023-07-26 Thread Ethan Rose
I've definitely been on both sides: posting a large PR myself, and as a
reviewer asking for PRs to be separated into smaller pieces. If we want
something persisted beyond an email thread I think we could update
CONTRIBUTING.md with a new section about how to post larger changes. Tips
for breaking down PRs that we could document:
- Separating refactoring PRs from the new change that depends on the
refactoring
- Splitting one refactoring of a general area of code into individual
refactoring PRs to iterate from the current state to the desired state.
- Use parent jiras with subtasks as part of planning, before coding.
- Provide design docs or at least detailed PR descriptions to make complex
reviews more manageable.
- A large PR with a description covering multiple disjoint issues may be
better split to independent fixes.

I mention refactoring a lot because those sorts of PRs are frequent
culprits of these large diffs. One example I had recently was refactoring
in https://github.com/apache/ozone/pull/4838 followed by the change
dependent on the refactoring in https://github.com/apache/ozone/pull/4867.
The latter was still a large PR but splitting helped speed up review and
keep the diff under 1k lines. Made my dev work easier too.

I don't think a large PR should warrant a CI failure either. This seems
like the type of thing that would be up to reviewers to enforce since it
impacts them the most. If you're set to review a large change, ask if there
is any way it can be split into a dependency chain of two or more consecutive
PRs. Having something written in CONTRIBUTING.md will provide a source that
reviewers can point devs towards as a reference in these situations.
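
As a rough illustration of the kind of size check a reviewer could run by
hand, here is a minimal Python sketch that totals the changed lines reported
by `git diff --numstat`. The function names and the 1,000-line threshold are
assumptions made for this example (the threshold simply echoes the 1k figure
above); nothing here is an existing Ozone tool or policy.

```python
def changed_lines(numstat_output: str) -> int:
    """Sum added + deleted lines from `git diff --numstat` text.

    Each line of that output has the form "<added>\t<deleted>\t<path>".
    Binary files are reported as "-\t-\t<path>" and are skipped here.
    """
    total = 0
    for line in numstat_output.splitlines():
        parts = line.split("\t")
        if len(parts) < 3:
            continue  # not a numstat row (for example an empty line)
        added, deleted = parts[0], parts[1]
        if added == "-" or deleted == "-":
            continue  # binary file: no line counts available
        total += int(added) + int(deleted)
    return total


def needs_split(numstat_output: str, limit: int = 1000) -> bool:
    """Return True when the diff exceeds the (assumed) soft size limit."""
    return changed_lines(numstat_output) > limit
```

A reviewer could paste the output of `git diff --numstat master...HEAD` into
`needs_split` and use the result as a prompt to ask for a split, rather than
as a hard gate.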

Ethan

On Wed, Jul 26, 2023 at 3:02 PM Wei-Chiu Chuang  wrote:

> Hi Ozoners,
>
> In one of the coffee break chats with a colleague of mine, we realized many
> of the PRs in the Ozone project are quite lengthy.
>
> I'm guilty of this myself too. Keeping PRs short and sweet is good hygiene.
> It allows reviewers to spot potential problems in the code more easily, and
> your PR is more likely to be reviewed and iterated on quickly.
>
> How would you like to see the PR quality improved? I'd like to urge
> everyone to break down PRs but I don't necessarily want a GitHub Action
> that enforces length limit. :)
>
> Weichiu
>


Re: [DISCUSS] Please limit PR size

2023-07-26 Thread Ethan Rose
That sounds good to me. We could add a brief reminder there and more
details in the contributing guide if we want.

On Wed, Jul 26, 2023 at 4:31 PM Wei-Chiu Chuang  wrote:

> Actually I'd prefer adding a reminder in the PR template here
>
> https://github.com/apache/ozone/blob/master/.github/pull_request_template.md
> Seems like a good way to soft-enforce the rule.


Re: [DISCUSS] Next Ozone major release 1.4.0

2023-08-01 Thread Ethan Rose
> >>>> >
> >>>> > Warm Regards.
> >>>> >
> >>>> > --
> >>>> > Yiyang Zhou
> >>>> >
> >>>> > István Fajth wrote on Thu, Jun 8, 2023 at 16:08:
> >>>> >
> >>>> > > On the certificate rotation front, we are approaching a major
> >>>> > > milestone, with all services becoming capable of rotating their
> >>>> > > certificates without service disruption including the rootCA
> >>>> > > certificate as well.
> >>>> > >
> >>>> > > We also have resolved a few things around the need for the
> >>>> > > primordial node, we can not get rid of the need of a special node
> >>>> > > during the first bootstrap of the PKI system, but after that the
> >>>> > > special node is not needed anymore, and the leader SCM will be
> >>>> > > able to initiate the rotation of CA certificates we have in the
> >>>> > > system.
> >>>> > >
> >>>> > > For us the next big thing is handling the certificate revocation,
> >>>> > > also to do some further code cleanup and simplification, it would
> >>>> > > be nice to have it released soon after it is ready, but as we do
> >>>> > > not have it right now either we can live through another release
> >>>> > > without it. It is a bit unrealistic to have it included in a 1.5.0
> >>>> > > release if it comes out within the next 1-2 months.
> >>>> > >
> >>>> > > But the finished certificate rotation feature is somewhat
> >>>> > > mandatory, as there are changes in how we store certificates, and
> >>>> > > even though the startup needs to and will handle the old format,
> >>>> > > we would like to introduce a change and transform the metadata
> >>>> > > directory structure during an upgrade finalization.
> >>>> > >
> >>>> > > Pifta
> >>>> > >
> >>>> > > Ritesh Shukla wrote (on Wed, Jun 7, 2023, 18:53):
> >>>> > >
> >>>> > > > We can include the block token work that is in the process of
> >>>> > > > being merged.
> >>>> > > > That work considerably impacts performance, and delaying it to
> >>>> > > > 1.5.0 will

Re: [DISCUSS] Next Ozone major release 1.4.0

2023-08-08 Thread Ethan Rose
> >> > > > > > >>
> >> > > > > > >> --
> >> > > > > > >> Yiyang Zhou
> >> > > > > > >>
> >> > > > > > >>
> >> > > > > > >> Janus Chow wrote on Wed, Jun 28, 2023 at 10:58:
> >> > > > > > >>
> >> > > > > > >>> Thanks, Sammi, Siyao.
> >> > > > > > >>>
> >> > > > > > >>> I have created the ticket for the release:
> >> > > > > > >>> https://issues.apache.org/jira/browse/HDDS-8944
> >> > > > > > >>>
> >> > > > > > >>> Feel free to add comments about the features or patches
> >> > > > > > >>> that need to be added to Ozone 1.4.0. I will wait until
> >> > > > > > >>> all comments are resolved, and then create the release
> >> > > > > > >>> branch.
> >> > > > > > >>>
> >> > > > > > >>> Thank you all.
> >> > > > > > >>>
> >> > > > > > >>> Warm Regards.
> >> > > > > > >>>
> >> > > > > > >>> --
> >> > > > > > >>> Yiyang Zhou
> >> > > > > > >>>
> >> > > > > > >>>
> >> > > > > > >>> Siyao Meng wrote on Wed, Jun 28, 2023 at 05:18:
> >> > > > > > >>>
> >> > > > > > >>>> +1 for kicking off the 1.4.0 release process.
> >> > > > > > >>>>
> >> > > > > > >>>> Thanks Yiyang for volunteering to be the RM.
> >> > > > > > >>>>
> >> > > > > > >>>> On the snapshot feature (HDDS-6517
> >> > > > > > >>>> <https://issues.apache.org/jira/browse/HDDS-6517>) front,
> >> > > > > > >>>> public APIs and proto messages should now be stable. So it
> >> > > > > > >>>> wouldn't block a release with it atm.
> >> > > > > > >>>>
> >> > > > > > >>>>
> >> > > > > > >>>> -Siyao
> >> > > > > > >>>>
> >> > > > > > >>>> On Jun 26, 2023 at 9:59:15 PM, Sammi Chen
> >> > > > > > >>>> <sammic...@apache.org> wrote:
> >> > > > > > >>>>
> >> > > > > > >>>>> Thank you, Yiyang, for volunteering as the 1.4.0 RM!
> >> > > > > > >>>>>
> >> > > > > > >>>>> Bests,
> >> > > > > > >>>>> Sammi
> >> > > > > > >>>>>
> >> > > > > > >>>>> On Tue, 20 Jun 2023 at 14:50, Janus Chow
> >> > > > > > >>>>> <yiyang0...@gmail.com> wrote:
> >> > > > > > >>>>>
> >> > > > > > >>>>> Hello, Everyone.
> >> > > > > > >>>>>
> >> > > > > > >>>>> I'm quite interested in this release work, if the RM is
> >> > > > > > >>>>> still available, I would like to be the RM of Ozone 1.4.0.
> >> > > > > > >>>>>
> >> > > > > > >>>>> Warm Regards.
> >> > > > > > >>>>>
> >> > > > > > >>>>> --

Re: [DISCUSS] Next Ozone major release 1.4.0

2023-08-23 Thread Ethan Rose
Hi Yiyang,

I think you will need to log in to Jira to view the Dashboard. It works for
me only after logging in.

Changing the subject, do you think we would be able to have arm docker
images for 1.4.0? Apple Silicon is becoming more popular for developer
machines and the x86 images are too slow to use there. I build my own arm
images for now but this is not easy for new users who just want to quickly
try out Ozone. Siyao had filed
https://issues.apache.org/jira/browse/HDDS-6263 and right now all the
subtasks are resolved. Siyao, if you could update what gaps remain in this
area, maybe we could see if it is feasible to add arm images for 1.4.0?

Ethan


On Thu, Aug 10, 2023 at 6:36 PM Janus Chow  wrote:

> Hello Pifta,
>
> Seems I don't have access to the dashboard, could you help to grant the
> permissions?
>
> Warm Regards.
>
> --
> Yiyang Zhou
>
>
> István Fajth wrote on Thu, Aug 10, 2023 at 19:57:
>
> > Hi Yiyang,
> >
> > Thank you for putting together the query, I have modified it a bit, and I
> > have created a Dashboard with the modified and one more query.
> >
> > The dashboard should be available for anyone in the ozone group, and
> > editable for the ozone-pmc.
> > The dashboard is here:
> > https://issues.apache.org/jira/secure/Dashboard.jspa?selectPageId=12337118
> >
> > The first list on the Dashboard shows all open, in progress, patch
> > available and reopened critical and blocker issues targeted to 1.4.0 (and,
> > just to be sure we do not leave anything behind, 1.3.1 and 1.2.2), ordered
> > by the assignee within that priority.
> > The second list contains all issues that are targeted to 1.4.0 (and 1.3.1
> > and 1.2.2), that are open, in progress, patch available, and reopened
> > ordered by the assignee within that priority.
> >
> > This way, everyone can take a look at their own JIRAs, and based on the
> > queries we can do bulk updates as well if needed.
> >
> > Cheers,
> > Pifta
> >
> > Janus Chow wrote (on Thu, Aug 10, 2023, 3:31):
> >
> > > Hello,
> > >
> > > Thanks @Ethan, @Pifta for the update. I agree a dashboard is a better
> > > idea to manage the tickets.
> > >
> > > This dashboard URL filters the "BLOCKER" and "CRITICAL" tickets:
> > >
> > > https://issues.apache.org/jira/browse/HDDS-9046?jql=project%20%3D%20HDDS%20AND%20resolution%20%3D%20Unresolved%20and%20priority%20in%20(Blocker%2CCritical)%20ORDER%20BY%20%20priority%20DESC%2C%20updated%20DESC
> > >
> > > Since some critical tickets haven't been updated for a long time, maybe
> > > we can migrate the important "CRITICALS" to "BLOCKERS", then we can focus
> > > on the "BLOCKER"s.
> > >
> > > Warm Regards.
> > >
> > > --
> > > Yiyang Zhou
> > >
> > >
> > > István Fajth wrote on Wed, Aug 9, 2023 at 20:32:
> > >
> > > > Hi Janus,
> > > >
> > > > Thank you for summarizing and organizing these things to help us
> > > > prepare for the release!
> > > > HDDS-7335 and HDDS-7401 are umbrella JIRAs and they do not need to be
> > > > closed in 1.4.0. Their priority was changed for ordering reasons only.
> > > >
> > > > On the other hand, according to our current knowledge and testing, the
> > > > critical items regarding rotation of CA certificates are:
> > > > HDDS-9013 - Fetch root CA certificate list during SCM startup
> > > > HDDS-8958 - Handle trust chain changes in clients when rootCAs are
> > > > rotated.
> > > > HDDS-8960 - Hold the rootCA's private key only in memory for the time
> > > > of initialization/rotation, then forget it
> > > > HDDS-9138 - Use sequence ID for certificate serial ID
> > > > I can not update these in the sheet, but wanted to give you the inputs.
> > > >
> > > > I am joining Ethan in thinking that a JIRA dashboard is probably better
> > > > to have live data. With that we can slowly filter and move out JIRAs
> > > > that won't belong to 1.4.0. In that spirit, I moved these four JIRA
> > > > items to blocker and targeted them to the 1.4.0 release. I will do the
> > > > same with anything that comes up as critical to 1.4.0 in the future.
> > > > Cheers,
> > > > P

Improving the Apache Ozone Website and Documentation

2023-08-28 Thread Ethan Rose
Hello Ozone Community,

As many of you are aware, Ozone's website and documentation have remained
somewhat disorganized and lacking in content for a while. I would like to
kick start an initiative to change that. I have attached a document
outlining a plan to redo Ozone's website and documentation so that it will
not be a barrier for new adopters and can be easily updated to hold all
content we wish to publish. The document outlines a high level plan that
will be broken into multiple parent jiras with subtasks to follow while the
new website is under construction. I would encourage everyone to read it
and give feedback on this initiative. I also presented this content in this
week's U.S. time zone community sync.

As part of "Phase 1" in the document, I would like to make the following
proposals:

1. Migrate Ozone's static site generator from Hugo to Docusaurus.

Docusaurus <https://docusaurus.io/> is an MIT licensed static site
generator maintained by Meta/Facebook open source. While there are many
examples <https://docusaurus.io/showcase> of sites using Docusaurus,
including the Docusaurus site itself, YuniKorn
<https://yunikorn.apache.org/> is another Apache project that is
successfully using the framework for their website. I have discussed this
with two YuniKorn PMC members who work on the site and they said the
experience has been positive.

What does Docusaurus provide us that Hugo does not? Hugo's advantage is
that it is written in Go, so in theory the site can be built from just one
easy to install binary. Docusaurus is written in JavaScript, so it needs
to be installed from a package manager like npm. However, if we want to
make the site more robust with things like collapsible menus, minimap side
bar, header linking, search, breadcrumbs, and more, we would probably need
to depend on a Hugo theme rather than build all these things from scratch.
Hugo themes with these features, like Doks <https://getdoks.org>, are also
JavaScript packages that require Node to build, but have the following
disadvantages:
- Configuration becomes split between the theme and Hugo itself. It is not
intuitive for new developers whether a config belongs to Hugo or the theme,
or whose documentation to read to make changes.
- The site is now highly dependent on another library that may not get the
same amount of updates or support as the main Hugo project.

Since Docusaurus is built specifically for project documentation, it
provides all these features with its default theme, and configuration
happens in a centralized docusaurus.config.js file. I have tested it and
can confirm that it is just as easy to spin up the site locally and view
edits in real time as Hugo. It also has a built-in versioning
<https://docusaurus.io/docs/versioning> framework that we can use instead
of our manual process of copying over docs from the main Ozone repo, which
brings me to my next proposal.

2. Move our docs out of the ozone repo and into the ozone-site repo.

In theory, having docs tracked and versioned in the same repo as the code
sounds like a good idea, but in practice, we do not use any of the
potential benefits:
- We always use separate PRs to update docs and code, and I think this
actually makes reviews easier.
- The existence of the docs in the same repo as the code does not seem to
"remind" anyone to update the docs if something relevant changes. Zita
suggested at the community sync that adding a reminder to update docs to
our PR template would probably be more effective.
- Relations between code and doc changes can already be created with
existing Jira and Github linking features, even across repos.

This split model does cause us problems:
- It is challenging to automatically publish incremental updates to the
versioned portion of the docs. In order to update the versioned docs, we
need to build the docs within Ozone, and manually copy the built artifacts
into the asf-site branch of the ozone-site repo. This is a confusing
process, since most doc updates to ozone-site are supposed to happen on the
master branch which is automatically built into the asf-site branch.
Because of this, content updates to our docs are rarely published outside
of a release, even if they are committed earlier.
- The most up to date docs are built from the ozone master branch and must
live on a separate site from ozone.apache.org because there is no
integration between the main ozone repo and the automatic doc publishing in
the ozone-site repo.
- We must manually version our documentation instead of letting our site
generator do it for us. This can be error prone. For example, see the
comments on the 1.3.0 doc update PR.
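
To make the versioning point concrete: according to the Docusaurus
documentation, its `docs:version` command snapshots the current `docs/`
folder into `versioned_docs/version-<name>/` and records the version in
`versions.json`. The Python sketch below only imitates that idea on a local
directory so the resulting layout is easy to see. `snapshot_docs` is a
hypothetical helper written for this example, not part of Docusaurus or the
ozone-site repo, and it ignores details such as versioned sidebars.

```python
import json
import shutil
from pathlib import Path


def snapshot_docs(site_dir: Path, version: str) -> Path:
    """Copy <site_dir>/docs into versioned_docs/version-<version>.

    Simplified imitation of a docs versioning step; the directory and file
    names follow the layout described in the Docusaurus versioning guide.
    """
    source = site_dir / "docs"
    target = site_dir / "versioned_docs" / f"version-{version}"
    if target.exists():
        raise FileExistsError(f"version {version} was already snapshotted")

    # copytree creates any missing parent directories of the target.
    shutil.copytree(source, target)

    # Record the new version first in the list (newest-first ordering is an
    # assumption of this sketch).
    versions_file = site_dir / "versions.json"
    versions = []
    if versions_file.exists():
        versions = json.loads(versions_file.read_text())
    versions_file.write_text(json.dumps([version] + versions, indent=2) + "\n")
    return target
```

The point of the comparison is that the snapshot is a single repeatable step
inside the site repo, instead of building docs in one repo and hand-copying
the artifacts into a branch of another.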

For these reasons, I propose moving the docs out of the main Ozone repo and
into the ozone-site repo. Work on the improved website and docs can happen
on a

Re: Improving the Apache Ozone Website and Documentation

2023-08-29 Thread Ethan Rose
Thanks for the responses, it seems the PDF document did not attach
correctly. I have re-attached it to this email. If it is still not visible
I will post it on Jira and share the link.

Ethan

On Tue, Aug 29, 2023 at 1:45 AM Zita Dombi  wrote:

> +1, this is a really good idea, I think many people will benefit from it. I
> started working on Ozone freshly graduated, as my first job, so I felt the
> disadvantages of the current website and documentation while learning about
> Ozone. The document you shared on the community sync is very detailed, you
> also mentioned that you attached it in the email above, but I can't find it
> there, maybe I'm missing something :)
>
> thanks,
> Zita
>
> Dinesh Chitlangia wrote (on Tue, Aug 29, 2023, 4:44):
>
> > +1 , this is a much needed improvement.
> >
> >
> > Thanks,
> > Dinesh
> >
> > On Mon, Aug 28, 2023 at 8:25 PM Ethan Rose 
> > wrote:
> >
> > > Hello Ozone Community,
> > >
> > > As many of you are aware, Ozone's website and documentation has
> remained
> > > somewhat disorganized and lacking in content for a while. I would like
> to
> > > kick start an initiative to change that. I have attached a document
> > > outlining a plan to redo Ozone's website and documentation so that it
> > will
> > > not be a barrier for new adopters and can be easily updated to hold all
> > > content we wish to publish. The document outlines a high level plan
> that
> > > will be broken into multiple parent jiras with subtasks to follow while
> > the
> > > new website is under construction. I would encourage everyone to read
> it
> > > and give feedback on this initiative. I also presented this content in
> > this
> > > week's U.S time zone community sync.
> > >
> > > As part of "Phase 1" in the document, I would like to make the
> following
> > > proposals:
> > >
> > > 1. Migrate Ozone's static site generator from Hugo to Docusaurus.
> > >
> > > Docusaurus <https://docusaurus.io/> is an MIT licensed static site
> > > generator maintained by Meta/Facebook open source. While there are many
> > > examples <https://docusaurus.io/showcase> of sites using Docusaurus,
> > > including the Docusaurus site itself, YuniKorn
> > > <https://yunikorn.apache.org/> is another Apache project that is
> > > successfully using the framework for their website. I have discussed
> this
> > > with two Yunikorn PMC members who work on the site and they said the
> > > experience has been positive.
> > >
> > > What does Docusaurus provide us that Hugo does not? Hugo's advantage is
> > > that it is written in Go, so in theory the site can be built from just
> > one
> > > easy to install binary.  Docusaurus is written in Javascript, so it
> needs
> > > to be installed from a package manager like npm. However, if we want to
> > > make the site more robust with things like collapsable menus, minimap
> > side
> > > bar, header linking, search, breadcrumbs, and more, we would probably
> > need
> > > to depend on a Hugo theme rather than build all these things from
> > scratch.
> > > Hugo themes with these features, like Doks <https://getdoks.org>, are
> > > also javascript packages that require node to build, but have the
> > following
> > > disadvantages:
> > > - Configuration becomes split between the theme and Hugo itself. It is
> > not
> > > intuitive for new developers whether a config belongs to Hugo or the
> > theme,
> > > or whose documentation to read to make changes.
> > > - The site is now highly dependent on another library that may not get
> > the
> > > same amount of updates or support as the main Hugo project.
> > >
> > > Since Docusaurus is built specifically for project documentation, it
> > > provides all these features with its default theme, and configuration
> > > happens in a centralized docusaurus.config.js file. I have tested it
> and
> > > can confirm that it is just as easy to spin up the site locally and
> view
> > > edits in real time as Hugo. It also has a built in versioning
> > > <https://docusaurus.io/docs/versioning> framework that we can use
> > instead
> > > of our manual process of copying over docs from the main Ozone repo,
> > which
> > > brings me to my next proposal.
> > >
> > > 2. Move our docs out of the ozone repo and into the

Re: Improving the Apache Ozone Website and Documentation

2023-08-29 Thread Ethan Rose
I cannot seem to send the doc on this email thread, sorry for the
confusion. I have instead filed HDDS-9225
<https://issues.apache.org/jira/browse/HDDS-9225> as a parent Jira for this
task and attached the doc there. The link is here
<https://issues.apache.org/jira/secure/attachment/13062569/Improving%20the%20Apache%20Ozone%20Website.pdf>
as well. The proposal includes details for improving both the site layout
and its content.

Ethan


Re: Improving the Apache Ozone Website and Documentation

2023-08-30 Thread Ethan Rose
Yes, Algolia integration for search would be something we would want to add
as part of this effort. Yunikorn has it, and Spark
<https://spark.apache.org/docs/latest/index.html> has it as well, although
they are using Jekyll. Either way, it looks like Apache projects are able
to leverage Algolia and we should take advantage of this.

On Wed, Aug 30, 2023 at 7:44 AM Arpit Agarwal 
wrote:

> Thank you for kicking off this much needed initiative Ethan. I’m not up to
> date on all the site generation frameworks; however, Docusaurus looks pretty
> good to me. Also one major omission in our current docs website is search
> capability. I see the Yunikorn website you linked is using Algolia, it
> would be great to integrate that or something equivalent.

Re: Improving the Apache Ozone Website and Documentation

2023-11-02 Thread Ethan Rose
Hi Ozone devs,

It has been a while since I started this thread and got carried off into
other issues. However, I have recently been able to shift focus back to
this improvement effort. The overall website improvement is tracked under
HDDS-9225, which is split into multiple child tasks.  I have begun defining
work items under those tasks and started implementation, but I would
appreciate any help in this area! All PRs are going to the
HDDS-9225-website-v2
<https://github.com/apache/ozone-site/tree/HDDS-9225-website-v2> branch in
the ozone-site repo, and can be tagged with the website-v2
<https://github.com/apache/ozone-site/pulls?q=label%3Awebsite-v2+> Github
label.

Here are the items we are ready to begin work on:

- HDDS-9538 <https://issues.apache.org/jira/browse/HDDS-9538>. Subtasks of
this Jira are for setting up the Docusaurus framework.

I have two PRs out now to get an initial website template committed:
https://github.com/apache/ozone-site/pull/45
https://github.com/apache/ozone-site/pull/46

Once these are merged, the remaining subtasks of the framework parent Jira
can be completed in any order. Feel free to file Jiras under this task if
there are framework related tasks I have missed.

- HDDS-9601 <https://issues.apache.org/jira/browse/HDDS-9601>. Subtasks of
this Jira are for Github integration with Docusaurus. These will mostly be
small Github actions related patches if anyone who has worked on the main
Ozone CI (or who wants to learn GH actions) is interested in contributing
to the new website.

- HDDS-9539 <https://issues.apache.org/jira/browse/HDDS-9539>. Subtasks of
this Jira are for creating the website homepage.

Help finding/generating images would be appreciated. This may be easier
once we have the homepage content better defined.

- HDDS-9613 <https://issues.apache.org/jira/browse/HDDS-9613>
(documentation) and HDDS-9614
<https://issues.apache.org/jira/browse/HDDS-9614>(community)

We are not yet ready to start writing these pages, but I would appreciate
discussion on the Jiras about the layout and content of pages we should
include in these sections.

Thanks for your support of this effort, and please let me know if you have
any questions or concerns about the project.

Ethan

On Wed, Aug 30, 2023 at 9:59 AM Wei-Chiu Chuang  wrote:

> In addition to layout revamp, one thing I'd love to see is more user docs.
> Our doc is by and large written for core Ozone developers. We need to have
> more content around how to run workarounds, how to migrate existing
> applications from HDFS to Ozone, API guide ... etc.
>
> On Mon, Aug 28, 2023 at 5:25 PM Ethan Rose 
> wrote:
>
> > Hello Ozone Community,
> >
> > As many of you are aware, Ozone's website and documentation have remained
> > somewhat disorganized and lacking in content for a while. I would like to
> > kick start an initiative to change that. I have attached a document
> > outlining a plan to redo Ozone's website and documentation so that it
> will
> > not be a barrier for new adopters and can be easily updated to hold all
> > content we wish to publish. The document outlines a high level plan that
> > will be broken into multiple parent jiras with subtasks to follow while
> the
> > new website is under construction. I would encourage everyone to read it
> > and give feedback on this initiative. I also presented this content in
> this
> > week's U.S time zone community sync.
> >
> > As part of "Phase 1" in the document, I would like to make the following
> > proposals:
> >
> > 1. Migrate Ozone's static site generator from Hugo to Docusaurus.
> >
> > Docusaurus <https://docusaurus.io/> is an MIT licensed static site
> > generator maintained by Meta/Facebook open source. While there are many
> > examples <https://docusaurus.io/showcase> of sites using Docusaurus,
> > including the Docusaurus site itself, YuniKorn
> > <https://yunikorn.apache.org/> is another Apache project that is
> > successfully using the framework for their website. I have discussed this
> > with two Yunikorn PMC members who work on the site and they said the
> > experience has been positive.
> >
> > What does Docusaurus provide us that Hugo does not? Hugo's advantage is
> > that it is written in Go, so in theory the site can be built from just
> one
> > easy to install binary.  Docusaurus is written in Javascript, so it needs
> > to be installed from a package manager like npm. However, if we want to
> > make the site more robust with things like collapsable menus, minimap
> side
> > bar, header linking, search, breadcrumbs, and more, we would probably
> need
> > to depend on a Hugo theme rather than build all these thing

Re: [Announce] Apache Ozone 1.4.0 Release

2024-01-22 Thread Ethan Rose
Thanks Yiyang for managing the release and to all who contributed.

Ethan

On Mon, Jan 22, 2024 at 12:25 AM Attila Doroszlai 
wrote:

> Thanks Yiyang for driving the release to completion.
>
> -Attila


Re: Improving the Apache Ozone Website and Documentation

2024-01-24 Thread Ethan Rose
Hi all, I just wanted to provide some updates on the new website. I will
try to send out these update emails more regularly in this thread to
summarize progress.

- We have a staging website deployed at
https://ozone-site-v2.staged.apache.org. This is also linked from the main
website Jira HDDS-9225. This is updated with each commit to the feature
branch and is a good way to check the current progress on the site, as well
as test it on mobile and multiple browsers.
- We have a detailed contributing guide
<https://github.com/apache/ozone-site/blob/HDDS-9225-website-v2/CONTRIBUTING.md>
added to the new website feature branch. Most changes will just be
markdown, but this doc has some more docusaurus specific details for those
who are interested.
- We have a proposed outline of sections and pages for the docs. This can
be viewed on the staging website and is subject to change as we create
documentation. This initial template is to help organize contributions as
we begin writing.
  - There are currently 200 pages on the website that need to be written.
We will inevitably end up with more than this as existing pages in the
outline are split to subpages or missing pages are added. Right now each
page links to a parent Jira where a subtask can be created to fill it in.
We are investigating some automation to potentially pre-create Jiras for
these pages to make sure two people are not accidentally working on the
same page.
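
As a rough illustration of the Jira pre-creation idea above, here is a minimal sketch in Python. It assumes pages are tracked as markdown paths and that we already know which pages someone has claimed; the paths, the summary wording, and the function name are all hypothetical, and nothing here talks to Jira itself.

```python
from pathlib import PurePosixPath


def proposed_subtasks(pages, claimed):
    """Return one proposed Jira summary per page that nobody has claimed yet."""
    summaries = []
    for page in sorted(pages):
        if page in claimed:
            # Someone is already working on this page, so skip it.
            continue
        # Derive a readable title from the file name (hypothetical convention).
        title = PurePosixPath(page).stem.replace("-", " ")
        summaries.append(f"Write docs page: {title} ({page})")
    return summaries


# Hypothetical page paths, not taken from the real ozone-site outline.
pages = ["docs/quick-start.md", "docs/administrator-guide/upgrade.md"]
claimed = {"docs/quick-start.md"}

for summary in proposed_subtasks(pages, claimed):
    print(summary)
```

Having one pre-created subtask per page would give contributors a single place to assign themselves before they start writing.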

At this point the framework is complete enough that we can accept
contributions to pages in the new website, and we are looking for
volunteers who would like to help us with this project.

Thanks,

Ethan

On Thu, Nov 2, 2023 at 6:32 PM Ethan Rose  wrote:

> Hi Ozone devs,
>
> It has been a while since I started this thread and got carried off into
> other issues. However, I have recently been able to shift focus back to
> this improvement effort. The overall website improvement is tracked under
> HDDS-9225, which is split into multiple child tasks.  I have begun defining
> work items under those tasks and started implementation, but I would
> appreciate any help in this area! All PRs are going to the
> HDDS-9225-website-v2
> <https://github.com/apache/ozone-site/tree/HDDS-9225-website-v2> branch
> in the ozone-site repo, and can be tagged with the website-v2
> <https://github.com/apache/ozone-site/pulls?q=label%3Awebsite-v2+> Github
> label.
>
> Here are the items we are ready to begin work on:
>
> - HDDS-9538 <https://issues.apache.org/jira/browse/HDDS-9538>. Subtasks
> of this Jira are for setting up the Docusaurus framework.
>
> I have two PRs out now to get an initial website template committed:
> https://github.com/apache/ozone-site/pull/45
> https://github.com/apache/ozone-site/pull/46
>
> Once these are merged, the remaining subtasks of the framework parent Jira
> can be completed in any order. Feel free to file Jiras under this task if
> there are framework related tasks I have missed.
>
> - HDDS-9601 <https://issues.apache.org/jira/browse/HDDS-9601>. Subtasks
> of this Jira are for Github integration with Docusaurus. These will mostly
> be small Github actions related patches if anyone who has worked on the
> main Ozone CI (or who wants to learn GH actions) is interested in
> contributing to the new website.
>
> - HDDS-9539 <https://issues.apache.org/jira/browse/HDDS-9539>. Subtasks
> of this Jira are for creating the website homepage.
>
> Help finding/generating images would be appreciated. This may be easier
> once we have the homepage content better defined.
>
> - HDDS-9613 <https://issues.apache.org/jira/browse/HDDS-9613>
> (documentation) and HDDS-9614
> <https://issues.apache.org/jira/browse/HDDS-9614>(community)
>
> We are not yet ready to start writing these pages, but I would appreciate
> discussion on the Jiras about the layout and content of pages we should
> include in these sections.
>
> Thanks for your support of this effort, and please let me know if you have
> any questions or concerns about the project.
>
> Ethan
>
> On Wed, Aug 30, 2023 at 9:59 AM Wei-Chiu Chuang 
> wrote:
>
>> In addition to layout revamp, one thing I'd love to see is more user docs.
>> Our doc is by and large written for core Ozone developers. We need to have
>> more content around how to run workarounds, how to migrate existing
>> applications from HDFS to Ozone, API guide ... etc.
>>
>> On Mon, Aug 28, 2023 at 5:25 PM Ethan Rose 
>> wrote:
>>
>> > Hello Ozone Community,
>> >
>> > As many of you are aware, Ozone's website and documentation have remained
>> > somewhat disorganized and lacking in content for a while. I would like
>> to
>> > kick start an initiative to change that. I h

Ozone Storage Container Reconciliation

2024-01-30 Thread Ethan Rose
Hi Ozone Devs,

I wanted to let everyone know that we have a proposal for a new container
reconciliation feature to help us increase container durability and move
beyond the complexity of unhealthy and quasi-closed states that the system
currently cannot recover from. We discussed this at the US community sync
this week, and the design document is posted at
https://github.com/apache/ozone/pull/6121. Please add
comments/questions/feedback inline on the pull request. It will be easier
to track and resolve comments there than comments added to this mail thread.

Thanks,
Ethan


Re: RFC: Deprecate Slack and move to GitHub Discussions

2024-01-31 Thread Ethan Rose
Thanks for bringing up this proposal Ritesh. I am +1 for using GitHub
Discussions for user/operational questions and indicating that on our Slack
channel.

I am also +1 for design discussions happening as PRs to markdown documents
as we are doing for container reconciliation. Inline comments make it
easier to track and resolve threads than one global conversation chain for
the entire feature, which inevitably gets sidetracked and leaves people’s
points unaddressed. Jira discussions for design proposals also lack
threading and have this same problem. An email to dev@ pointing to the
design proposal PR would still be good since not everyone may be watching
every new PR that comes in.

I agree with Pifta’s comment that GitHub discussions are easier to find and
use than the users@ mailing list for most people. I think our users share
this opinion since both options are available but people only choose to
submit questions to GitHub discussions. Favoring the more popular option
seems better for community engagement.

For the dev@ list, I think high level updates like releases, new feature
proposals, feature branch merges, or major project milestones make sense
there, since it gives people insight as to which Jiras or PRs to follow.
Then people can subscribe to the threads they want on Jira or GitHub
without the entire dev base being blasted for niche feature specific issues
(of which we have many). I will admit I’ve been an exception to this by
sending dev@ periodic updates about the new website, but I think at
this stage the updates are still “Project Wide” and the new website is
something that should remain on everyone’s radar.

Speaking of the website, once we decide which communication channels are
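favored for what purpose we can fill in the communication channels
<https://ozone-site-v2.staged.apache.org/community/communication-channels>
page on the new site to indicate this. We can also revisit our preferred
method(s) for design proposals
<https://ozone.apache.org/docs/1.4.0/design/ozone-enhancement-proposals.html>
and update this documentation on the new site’s Developer Guide as well.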

On Wed, Jan 31, 2024 at 9:57 AM Ritesh Shukla 
wrote:

> Hi Pifta,
>
> Thank you for chiming in.
>
> I have actually not considered how to revive the mailing list for
> technical and design discussions.
> I am not sure why mailing lists are no longer as popular, in general I see
> a decline of
> using emails for discussions other than formal one way communication.
>
> I agree that the mailing list has more control over the data shared vs.
> hosting it on Github.
>
> Github Discussions do have lower friction than switching to email to
> discuss.
>
> For design my personal preference is markdown and PRs
> Example: https://github.com/apache/ozone/pull/6121
> This is something I insisted we do for the container reconciliation design.
>
> One other aspect I wanted to bring up, there are many new contributors from
> China and China has its own eco system of messaging and chat apps
> where the community interacts. I would like to hear from them as well
> for a suitable approach to keep discussions out in the open
> and not  behind closed walls of apps such as Slack or Weibo.
>
> In the note we leave in Slack and other documentation we can encourage
> both mailing lists
> as well as Github Discussions.
>
> Regards,
> Ritesh
>
>
> > On Jan 31, 2024, at 9:31 AM, István Fajth  wrote:
> >
> > Hi Ritesh,
> >
> > thank you for the proposal, and taking action during the last community
> > sync.
> > I support the idea of using GitHub Discussions, but I would like to bring
> > up one thing with GH Discussions. I am not a fan of Slack either and I
> > agree Slack is far from optimal and is hard to search.
> >
> > It did not happen with Ozone, but as I remember there were projects that
> > have been migrated to github and before that they were running in a
> > different system, this makes me skeptical, as it might happen in the
> future
> > also.
> > So I would like to add one more thing to the table, mailing lists will
> > remain with us (as they are within the control of the foundation), and
> are
> > also indexed by search engines.
> >
> > In my opinion to move the operational problems and questions to github
> > discussions is definitely ok, as I see the value of having a dedicated
> > place where user centric questions are collected and discussed.
> > In the meantime I think we should promote the dev@ list for technical
> > discussions that are about design, code, or for any other topic that
> > really is affecting mainly the developer community.
> >
> > Looking at our @dev mailing list, it is pretty silent recently even
> though
> > we are working on stabilization, performance, and features. (I myself am
> > also guilty and silent, I realized and would like to change that, instead
> > of just doing things within JIRA and PRs.)
> >
> > So I am absolutely +1 for using github discussions in favor of Slack for
> > user/operational probl

Re: RFC: Deprecate Slack and move to GitHub Discussions

2024-02-02 Thread Ethan Rose
is way, leave slack, move user/operational questions
> > to Github Discussions and design to PRs with a notification to the dev@
> > list, I am +1 for this approach, but I think we are still open to
> > further discussion and ideas before we close this thread and take action,
> > so I encourage others to chime in ;)
> >
> > Cheers,
> > Pifta
> >
> > Ethan Rose wrote (on Thu, Feb 1, 2024, 0:29):
> >
> > > Thanks for bringing up this proposal Ritesh. I am +1 for using GitHub
> > > Discussions for user/operational questions and indicating that on our
> > Slack
> > > channel.
> > >
> > > I am also +1 for design discussions happening as PRs to markdown
> > documents
> > > as we are doing for container reconciliation. Inline comments make it
> > > easier to track and resolve threads instead of one global conversation
> > > chain for the entire feature that inevitably gets sidetracked and
> > people’s
> > > points not addressed. Jira discussions for design proposals also lack
> > > threading and have this same problem. An email to dev@ pointing to the
> > > design proposal PR would still be good since not everyone may be
> watching
> > > every new PR that comes in.
> > >
> > > I agree with Pifta’s comment that GitHub discussions are easier to find
> > and
> > > use than the users@ mailing list for most people. I think our users
> > share
> > > this opinion since both options are available but people only choose to
> > > submit questions to GitHub discussions. Favoring the more popular
> option
> > > seems better for community engagement.
> > >
> > > For the dev@ list, I think high level updates like releases, new
> feature
> > > proposals, feature branch merges, or major project milestones make
> sense
> > > there, since it gives people insight as to which Jiras or PRs to
> follow.
> > > Then people can subscribe to the threads they want on Jira or GitHub
> > > without the entire dev base being blasted for niche feature specific
> > issues
> > > (of which we have many). I will admit I’ve been an exception to this by
> > > sending dev@ with periodic updates about the new website, but I think
> at
> > > this stage the updates are still “Project Wide” and the new website is
> > > something that should remain on everyone’s radar.
> > >
> > > Speaking of the website, once we decide which communication channels
> are
> > > favored for what purpose we can fill in the communication channels
> > > <
> > https://ozone-site-v2.staged.apache.org/community/communication-channels
> >
> > > page on the new site to indicate this. We can also revisit our
> preferred
> > > method(s) for design proposals
> > > <
> > >
> >
> https://ozone.apache.org/docs/1.4.0/design/ozone-enhancement-proposals.html
> > > >
> > > and update this documentation on the new site’s Developer Guide as
> well.
> > >
> > > On Wed, Jan 31, 2024 at 9:57 AM Ritesh Shukla
> >  > > >
> > > wrote:
> > >
> > > > Hi Pifta,
> > > >
> > > > Thank you for chiming in.
> > > >
> > > > I have actually not considered how to revive the mailing list for
> > > > technical and design discussions.
> > > > I am not sure why mailing lists are no longer as popular, in general
> I
> > > see
> > > > a decline of
> > > > using emails for discussions other than formal one way communication.
> > > >
> > > > I agree that the mailing list has more control over the data shared
> vs.
> > > > hosting it on Github.
> > > >
> > > > Github Discussions do have lower friction than switching to email to
> > > > discuss.
> > > >
> > > > For design my personal preference is markdown and PRs
> > > > Example: https://github.com/apache/ozone/pull/6121
> > > > This is something I insisted we do for the container reconciliation
> > > design.
> > > >
> > > > One other aspect I wanted to bring up, there are many new
> contributors
> > > from
> > > > China and China has its own eco system of messaging and chat apps
> > > > where the community interacts. I would like to hear from them as well
> > > > for a suitable approach to keep discussions out in the open
> > > > and not  behind closed walls of apps suc

Re: Ozone Community Sync Notes - 29th Jan, 9AM Monday (PST)

2024-02-02 Thread Ethan Rose
Here is a Github discussion thread for standardizing the structure of our
community syncs, as discussed at our last community sync. Please chime in!

https://github.com/apache/ozone/discussions/6157

Ethan

On Tue, Jan 30, 2024 at 2:31 PM Uma Maheswara Rao Gangumalla <
umamah...@apache.org> wrote:

> Dear Ozone Devs,
>
> Please find the US time zone community sync notes below.
>
> Agenda Topics Discussed:
>
> 1. *Release planning: 1.4.1 release? I think we have some snapshot
>    issues (any other critical?) that did not land in 1.4.0.*
>    *Discussion Notes:*
>    1. Snapshot issues
>    2. Zero Copy changes
>    3. Porting RM manager guide from Wiki to website and will be up for
>       review.
>    4. We may need Ratis' release.
>    5. Is it appropriate to include Zero Copy changes in . release.
>    6. Please add your target version if you would like to include changes
>       in the release.
>
> 2. *Any thoughts on how we can get the users question and answers from
>    slack channels to mailing lists for better indexing (for search).*
>    *Discussion Notes:*
>    1. Ritesh will be sending an email with the proposal to start using
>       github discussions / mailing lists. Slack can be used for any
>       immediate help and pinging some to help in review etc.
>
> 3. *Community meeting rotation*
>    *Discussion Notes:*
>    1. A topic to discuss how we rotate
>       1. Get signups who would be interested to run meeting, then just
>          follow based on list order in wiki
>       2. Identify next week’s community sync champion
>
> 4. *New contributors PR queue.*
>    *Discussion Notes:*
>    We quickly looked at the PR queue. And discussed about dependabot pr.
>    Attila pinging specific folks for review help.
>
> 5. *Any new design proposals ?*
>    *Discussion Notes:*
>    Ethan presented a Container reconciliation proposal.
>    There was some good Q&A.
>    We would encourage everyone to participate and make it more interactive
>    than slide presentations.
>    Thank you Ethan Rose and Ritesh Shukla
>
> 6. *Docs update?*
>    *Discussion Notes:*
>    Did not get time to discuss.
>
>
> Another thought shared by Ritesh regarding meeting summaries:
> One quick thought was to look at APAC sync summaries and bring up if there
> are any interesting notes there.
>
> *Plan of Actions from the meeting:*
> 1. Ritesh to start a discussion thread about #2 for moving technical
>    discussions from slack to github discussions.
> 2. Duog to start a discussion thread about 1.4.1 release (#1)
> 3. Ethan to start a github discussion for next community sync champion (#3)
> 4. Ethan to share PR with Container reconciliation design.
>
>
> Regards,
> Uma
>


[VOTE] Change the default branch for ozone-site from asf-site to master

2024-02-08 Thread Ethan Rose
Hi Ozone devs,

I’d like to start a vote thread to change the default branch in the
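apache/ozone-site <https://github.com/apache/ozone-site> repo from asf-site
to master. Changing the default branch requires an Infra ticket and mailing
thread according to the asfyaml README
<https://github.com/apache/infrastructure-asfyaml/blob/main/README.md#default-branch>.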
I’ll start with some questions you may have when deciding to vote:

*Does this have anything to do with the new website development that is
happening on the feature branch HDDS-9225-website-v2
<https://github.com/apache/ozone-site/tree/HDDS-9225-website-v2>?*

No, this has nothing to do with the new website. The change would be
effective for the existing website only since it concerns the asf-site and
master branches, neither of which the new website uses right now.

*What is the difference between asf-site and master?*

The master branch contains the code that we modify and commit to change the
website. The asf-site branch contains the already built website. The
contents of asf-site are automatically generated from master and committed
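by a GitHub Action
<https://github.com/apache/ozone-site/blob/master/.github/workflows/regenerate.yml>.
From there, existing ASF services read the .asf.yaml
<https://github.com/apache/ozone-site/blob/asf-site/.asf.yaml> file in the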
asf-site branch and copy the built contents from that branch to wherever
the ASF is hosting the static sites for projects.

*Why should we change the default branch from asf-site to master?*

   1. (My primary motivation) Pull request templates only work if they are
   committed to the default branch. Committing the PR template from
   HDDS-10267 to the asf-site branch would be clunky and difficult to modify.
   It is better to leave that branch for auto generated content only. That PR
   template currently does not work since it is not on the asf-site (current
   default) branch.
   2. It’s confusing for users who go to the site on GitHub or clone the
   repo and expect to see the code they should modify to change the site.
   Instead they have to find the branch that actually has the code that the
   asf-site build content came from.
   3. (Minor) PRs default to using the default branch. When filing a PR for
   the website, GitHub suggests using asf-site first, which gives a message
   stating that the changes cannot be merged since there is no common history.

*Why is our current default asf-site?*

I’m not sure, maybe someone in the community has historical context on
this. It could be because this is the branch that pre-built docs are
committed to when we copy them from the main Ozone repo (a practice we are
looking to get rid of in the new website). It also seems there were some
changes to branch publishing made around May 2021, so perhaps it was
required to be this way for publishing before those updates.

*Is there any standard among other ASF projects for which branch should be
the default?*

I’ve looked at a bunch of other projects’ websites and have yet to find one
that’s using asf-site as the default. They are all using the development
branch (equivalent to our master branch) as the default branch. See

   - https://github.com/apache/yunikorn-site
   - https://github.com/apache/streampipes-website
   - https://github.com/apache/kvrocks-website
   - https://github.com/apache/pulsar-site
   - https://github.com/apache/doris-website
   - https://github.com/apache/rocketmq-site

*Will this affect the existing website?*

This should not affect the existing website. The branch to use for
deployment is hardcoded in .asf.yaml
<https://github.com/apache/ozone-site/blob/asf-site/.asf.yaml> and not
implied from the repository’s default branch setting. Deployment should
work as usual. I will double check with infra on the ticket to make sure no
changes are required when making this change.
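
To make that point concrete, here is a minimal Python sketch, assuming the publish branch is declared in a `publish`/`whoami` entry. The excerpt below is illustrative and not copied from the repo, and the tiny parser is a hypothetical helper for this example only, not how ASF infra actually reads the file.

```python
# Illustrative excerpt only; the real .asf.yaml in ozone-site may differ.
ASF_YAML = """\
publish:
  whoami: asf-site
"""


def publish_branch(asf_yaml_text):
    """Return the branch named under publish.whoami, or None if absent."""
    in_publish = False
    for line in asf_yaml_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        indented = line[0] in " \t"
        if not indented:
            # A new top-level key starts; track whether it is "publish:".
            in_publish = stripped == "publish:"
        elif in_publish and stripped.startswith("whoami:"):
            return stripped.split(":", 1)[1].strip()
    return None


# The GitHub default branch never enters the lookup, so switching it
# from asf-site to master leaves the publish branch unchanged.
for default_branch in ("asf-site", "master"):
    print(default_branch, "->", publish_branch(ASF_YAML))
```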

Overall a long-winded email for a pretty simple change. I’ll start with my
+1 with the hope of incrementally improving the development experience of
the current site, and in the future, the new website as well.

Ethan


Re: Supported releases

2024-02-13 Thread Ethan Rose
While I would also be surprised if people are successfully running an
unpatched version of 1.3.0, it seems like a circular dependency to decide
which versions to support based on what people are using. People will
generally prefer to use supported versions so our messaging here informs
their decision.

I think the question to ask (which Pifta implied with the options
presented) is how do we define support? Is it just security fixes, or also
critical bug fixes (data loss or service outage) or even other more minor
bug fixes? Given the speed at which Ozone development and fixes are
happening, a comprehensive 1.3.1 bug fix release proved too large, which is
why the effort was abandoned. While a critical bug fix release might be more
manageable, I haven’t seen an effort to initiate this in practice. Based on
this experience, these seem like the two best options to me:

   1. Define support as “security fixes only” and include support for 1.3.0
   2. Define support as “security and critical bug fixes only” and drop
   support for 1.3.0

I am in favor of option 1 because I think the optics of completely dropping
support for the immediately previous version are not good. Also, given the
velocity at which bug fixes are coming in, I don’t see us cherry-picking bug
fixes to do a 1.4.1 release, but instead doing 1.5.0 from master. By
defining support as “security only” we are being up-front with our
intentions and upgrade recommendations, since, as Stephen highlighted,
remaining on 1.3.0 is not recommended for stability.

With either option, more frequent releases will help since we could support
the last X versions, but those versions could still be less than a year
old. Higher release frequency is a topic for another thread though.
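
As a small sketch of what a “last X versions” rule could look like, here is some Python that keeps the newest patch release of each of the last X minor lines. The rule itself, the choice of X = 2, and the function names are assumptions for illustration, not a decided policy.

```python
def parse_version(version):
    """Turn "1.2.1" into (1, 2, 1) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def supported_versions(releases, last_n):
    """Newest patch release of each of the last_n minor lines, newest first."""
    latest_per_minor = {}
    for version in releases:
        minor_line = parse_version(version)[:2]
        current = latest_per_minor.get(minor_line)
        if current is None or parse_version(version) > parse_version(current):
            latest_per_minor[minor_line] = version
    newest_lines = sorted(latest_per_minor, reverse=True)[:last_n]
    return [latest_per_minor[line] for line in newest_lines]


# Release numbers mentioned on this list; X = 2 is an assumed value.
releases = ["1.1.0", "1.2.0", "1.2.1", "1.3.0", "1.4.0"]
print(supported_versions(releases, 2))
```

With X = 2 this selects 1.4.0 and 1.3.0, which matches the two versions proposed to stay supported in SECURITY.md.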

Ethan

On Mon, Feb 12, 2024 at 3:00 AM Stephen O'Donnell
 wrote:

> The first question I have is:
>
> Is anyone running 1.3.0 in production today?
>
> That release is over a year old and there are many known problems that have
> been fixed on 1.4.0. I would be quite surprised if anyone is running 1.3.0
> successfully without some additional custom patches on top of it. If they
> have a patched 1.3.0, then a new release won't help, as it would be easier
> to just cherry pick the CVE fix on top of their custom build.
>
>
> On Mon, Feb 12, 2024 at 10:39 AM István Fajth  wrote:
>
> > Hi developers,
> >
> > Me and Attila had a discussion about a PR
> >  that is posted by
> @ivandika3.
> >
> > In our SECURITY.md file we have a table about our supported versions.
> > In light of the recent CVE, the idea is to remove the supported flag from
> > any release prior to 1.3.0, and leave 1.3.0 and 1.4.0 supported.
> >
> > I tend to believe this is something that is reasonable considering that
> > 1.2.1 is over 2 years old.
> >
> > However, leaving 1.3.0 supported means that we should release 1.3.1 in
> > order to have the recently released CVE-2023-39196 fixed for the 1.3.x
> > line.
> > There was an effort earlier to release 1.3.1, but we have abandoned that
> > effort.
> >
> > *What do you think, which way should we manage this?*
> > 1. Leave 1.3.1 branch as it is cherry-pick the CVE fix, and release it?
> > 2. Leave 1.3.1 alone, and release the fix for the CVE from a new branch
> as
> > (a) 1.3.0.1, or (b)1.3.2?
> > 3. Prepare a proper 1.3.1 release with all the critical fixes that we
> have,
> > including the fix for the CVE?
> > 4. Mark 1.3.0 unsupported which is also something that I can imagine,
> > however that release is just over 1 year old.
> >
> > A minimalistic approach - As we dropped the idea of 1.3.1 already - to
> just
> > go with 2a.
> > The simplest is to choose to drop support for 1.3.0.
> > If we support 1.3.0 today, and we want to do it right, then releasing
> 1.3.1
> > in the next 1-2 weeks seems to be the proper approach.
> >
> > I am +1 for all these 3 approaches and I am not against the approach
> listed
> > in point 1., however I do not support it either.
> > I can not - at the moment - take on a full 1.3.1 release, so if the
> > community decides to go that route, we will need a release manager also
> for
> > that.
> >
> > What I can volunteer for, is to update the SECURITY.md based on the
> > decision, and in case a minimalistic release approach is taken (2a or
> 2b),
> > I can take on releasing that.
> >
> > --
> > Pifta
> >
>


Re: [VOTE] Change the default branch for ozone-site from asf-site to master

2024-02-14 Thread Ethan Rose
> Does changing default to master still need some generation and commit to
> master?

Hi Sumit. The process to generate the website from master and commit it to
asf-site will not be affected by this change. The Github workflow copies
the build from master
<https://github.com/apache/ozone-site/blob/2a519d63500e52b8ebeb20ebe4fb88afaea8c96b/.github/workflows/regenerate.yml#L19>
to asf-site
<https://github.com/apache/ozone-site/blob/2a519d63500e52b8ebeb20ebe4fb88afaea8c96b/.github/workflows/regenerate.yml#L31>,
and the .asf.yaml file in the asf-site branch indicates that the asf-site
<https://github.com/apache/ozone-site/blob/2a519d63500e52b8ebeb20ebe4fb88afaea8c96b/.github/workflows/regenerate.yml#L31>
branch is the one to publish. Both files have the branch to work with
hardcoded in them, I’ve linked directly to those lines here. They do not
read GitHub’s default branch, so the publishing process should work without
changes if the default branch is updated in GitHub.

The source code used to build the website will continue to be committed to
master.

Ethan

On Tue, Feb 13, 2024 at 9:00 PM Ayush Saxena  wrote:

> +1
>
> -Ayush
>
> > On 14-Feb-2024, at 10:12 AM, Sumit Agrawal 
> > 
> wrote:
> >
> > Hi,
> >
> > The contents of asf-site are automatically *generated from master and
> >> committed*
> >> by a GitHub Action
> >> <
> >>
> https://github.com/apache/ozone-site/blob/master/.github/workflows/regenerate.yml
> >>> .
> >> From there, existing ASF services read the .asf.yml
> >> <https://github.com/apache/ozone-site/blob/asf-site/.asf.yaml> file in
> the
> >> asf-site branch and copy the built contents from that branch to wherever
> >> the ASF is hosting the static sites for projects.
> >
> >
> > Does changing default to master still need some generation and commit to
> > master?
> >
> > If the above has no impact, I'm +1 for this change.
> >
> >
> >> On Mon, Feb 12, 2024 at 9:47 PM Zita Dombi 
> wrote:
> >>
> >> Hi,
> >>
> >> Thanks Ethan for bringing this up, I'm +1 for this change.
> >>
> >> Zita
> >>
> >> Abhishek Pal  wrote (on Sat, Feb 10, 2024, 22:43):
> >>
> >>> Hi Ethan,
> >>> Thanks for taking up this initiative.
> >>> While this is not a problem for existing committers, I do believe
> people
> >>> who are new to the repo might have some confusion with the current
> >>> branching and how GitHub actions builds the site.
> >>> I give a +1 vote for this change.
> >>> Though we are eventually shifting to a new website, that might take
> time,
> >>> and in the meantime this change will help reduce confusion for any new
> >>> contributors as well as address the templating issues.
> >>>
> >>>> On Fri, 9 Feb 2024 at 05:44, Ethan Rose  wrote:
> >>>
> >>>> Hi Ozone devs,
> >>>>
> >>>> I’d like to start a vote thread to change the default branch in the
> >>>> apache/ozone-site <https://github.com/apache/ozone-site> repo from
> >>>> asf-site
> >>>> to master. Changing the default branch requires an Infra ticket and
> >>> mailing
> >>>> thread according to the asfyaml README
> >>>> <
> >>>>
> >>>
> >>
> https://github.com/apache/infrastructure-asfyaml/blob/main/README.md#default-branch
> >>>>> .
> >>>> I’ll start with some questions you may have when deciding to vote:
> >>>>
> >>>> *Does this have anything to do with the new website development that
> is
> >>>> happening on the feature branch HDDS-9225-website-v2
> >>>> <https://github.com/apache/ozone-site/tree/HDDS-9225-website-v2>?*
> >>>>
> >>>> No, this has nothing to do with the new website. The change would be
> >>>> effective for the existing website only since it concerns the asf-site
> >>> and
> >>>> master branches, neither of which the new website uses right now.
> >>>>
> >>>> *What is the difference between asf-site and master?*
> >>>>
> >>>> The master branch contains the code that we modify and commit to
> change
> >>> the
> >>>> website. The asf-site branch contains the already built website. The
> >>>> contents of asf-site are automatically generated from master and
> >>> committed

Re: [VOTE] Change the default branch for ozone-site from asf-site to master

2024-02-22 Thread Ethan Rose
Thanks everyone for voting. After running for 2 weeks the vote has passed
with:
13 +1s (including 7 binding PMC +1s)
No -1s
No 0s

I will create an infra ticket to change the branch and provide updates on
this thread.

Ethan

On Thu, Feb 22, 2024 at 9:37 AM Sadanand Shenoy  wrote:

> +1
>
> Thanks,
> Sadanand
>
> On Thu, Feb 22, 2024 at 10:34 PM swaminathan balachandran <
> swamirishi...@gmail.com> wrote:
>
> > +1
> > Thanks for explaining the problem.
> >
> > On Thu, Feb 22, 2024 at 1:17 AM Nandakumar Vadivelu
> >  wrote:
> >
> > > +1
> > > Thanks for the detailed description Ethan.
> > >
> > > > On 21-Feb-2024, at 10:13 PM, Arpit Agarwal
> > 
> > > wrote:
> > > >
> > > > +1
> > > >
> > > > Thanks for the well-written description Ethan. I missed this thread
> > > earlier.
> > > >
> > > > On Feb 20, 2024 at 10:21:38 PM, Dinesh Chitlangia <
> dine...@apache.org>
> > > > wrote:
> > > >
> > > >> +1
> > > >>
> > > >> Thanks,
> > > >> Dinesh
> > > >>
> > > >> On Thu, Feb 8, 2024 at 7:14 PM Ethan Rose  wrote:
> > > >>
> > > >> Hi Ozone devs,
> > > >>
> > > >>
> > > >> I’d like to start a vote thread to change the default branch in the
> > > >>
> > > >> apache/ozone-site <https://github.com/apache/ozone-site> repo from
> > > >>
> > > >> asf-site
> > > >>
> > > >> to master. Changing the default branch requires an Infra ticket and
> > > mailing
> > > >>
> > > >> thread according to the asfyaml README
> > > >>
> > > >> <
> > > >>
> > > >>
> > > >>
> > >
> >
> https://github.com/apache/infrastructure-asfyaml/blob/main/README.md#default-branch
> > > >>
> > > >>> .
> > > >>
> > > >> I’ll start with some questions you may have when deciding to vote:
> > > >>
> > > >>
> > > >> *Does this have anything to do with the new website development that
> > is
> > > >>
> > > >> happening on the feature branch HDDS-9225-website-v2
> > > >>
> > > >> <https://github.com/apache/ozone-site/tree/HDDS-9225-website-v2>?*
> > > >>
> > > >>
> > > >> No, this has nothing to do with the new website. The change would be
> > > >>
> > > >> effective for the existing website only since it concerns the
> asf-site
> > > and
> > > >>
> > > >> master branches, neither of which the new website uses right now.
> > > >>
> > > >>
> > > >> *What is the difference between asf-site and master?*
> > > >>
> > > >>
> > > >> The master branch contains the code that we modify and commit to
> > change
> > > the
> > > >>
> > > >> website. The asf-site branch contains the already built website. The
> > > >>
> > > >> contents of asf-site are automatically generated from master and
> > > committed
> > > >>
> > > >> by a GitHub Action
> > > >>
> > > >> <
> > > >>
> > > >>
> > > >>
> > >
> >
> https://github.com/apache/ozone-site/blob/master/.github/workflows/regenerate.yml
> > > >>
> > > >>> .
> > > >>
> > > >> From there, existing ASF services read the .asf.yaml
> > > >>
> > > >> <https://github.com/apache/ozone-site/blob/asf-site/.asf.yaml> file
> > in
> > > the
> > > >>
> > > >> asf-site branch and copy the built contents from that branch to
> > wherever
> > > >>
> > > >> the ASF is hosting the static sites for projects.
> > > >>
> > > >>
> > > >> *Why should we change the default branch from asf-site to master?*
> > > >>
> > > >>
> > > >>   1. (My primary motivation) Pull request templates only work if
> they
> > > >>
> > > >> are committed
> > > >>
> > > >>   to the default branch
> > > >>

Re: [VOTE] Change the default branch for ozone-site from asf-site to master

2024-02-22 Thread Ethan Rose
Follow https://issues.apache.org/jira/browse/INFRA-25530 if you would like
more updates on this change.

On Thu, Feb 22, 2024 at 1:21 PM Ethan Rose  wrote:

> Thanks everyone for voting. After running for 2 weeks the vote has passed
> with:
> 13 +1s (including 7 binding PMC +1s)
> No -1s
> No 0s
>
> I will create an infra ticket to change the branch and provide updates on
> this thread.
>
> Ethan
>
> On Thu, Feb 22, 2024 at 9:37 AM Sadanand Shenoy 
> wrote:
>
>> +1
>>
>> Thanks,
>> Sadanand
>>
>> On Thu, Feb 22, 2024 at 10:34 PM swaminathan balachandran <
>> swamirishi...@gmail.com> wrote:
>>
>> > +1
>> > Thanks for explaining the problem.
>> >
>> > On Thu, Feb 22, 2024 at 1:17 AM Nandakumar Vadivelu
>> >  wrote:
>> >
>> > > +1
>> > > Thanks for the detailed description Ethan.
>> > >
>> > > > On 21-Feb-2024, at 10:13 PM, Arpit Agarwal
>> > 
>> > > wrote:
>> > > >
>> > > > +1
>> > > >
>> > > > Thanks for the well-written description Ethan. I missed this thread
>> > > earlier.
>> > > >
>> > > > On Feb 20, 2024 at 10:21:38 PM, Dinesh Chitlangia <
>> dine...@apache.org>
>> > > > wrote:
>> > > >
>> > > >> +1
>> > > >>
>> > > >> Thanks,
>> > > >> Dinesh
>> > > >>
>> > > >> On Thu, Feb 8, 2024 at 7:14 PM Ethan Rose 
>> wrote:
>> > > >>
>> > > >> Hi Ozone devs,
>> > > >>
>> > > >>
>> > > >> I’d like to start a vote thread to change the default branch in the
>> > > >>
>> > > >> apache/ozone-site <https://github.com/apache/ozone-site> repo from
>> > > >>
>> > > >> asf-site
>> > > >>
>> > > >> to master. Changing the default branch requires an Infra ticket and
>> > > mailing
>> > > >>
>> > > >> thread according to the asfyaml README
>> > > >>
>> > > >> <
>> > > >>
>> > > >>
>> > > >>
>> > >
>> >
>> https://github.com/apache/infrastructure-asfyaml/blob/main/README.md#default-branch
>> > > >>
>> > > >>> .
>> > > >>
>> > > >> I’ll start with some questions you may have when deciding to vote:
>> > > >>
>> > > >>
>> > > >> *Does this have anything to do with the new website development
>> that
>> > is
>> > > >>
>> > > >> happening on the feature branch HDDS-9225-website-v2
>> > > >>
>> > > >> <https://github.com/apache/ozone-site/tree/HDDS-9225-website-v2>?*
>> > > >>
>> > > >>
>> > > >> No, this has nothing to do with the new website. The change would
>> be
>> > > >>
>> > > >> effective for the existing website only since it concerns the
>> asf-site
>> > > and
>> > > >>
>> > > >> master branches, neither of which the new website uses right now.
>> > > >>
>> > > >>
>> > > >> *What is the difference between asf-site and master?*
>> > > >>
>> > > >>
>> > > >> The master branch contains the code that we modify and commit to
>> > change
>> > > the
>> > > >>
>> > > >> website. The asf-site branch contains the already built website.
>> The
>> > > >>
>> > > >> contents of asf-site are automatically generated from master and
>> > > committed
>> > > >>
>> > > >> by a GitHub Action
>> > > >>
>> > > >> <
>> > > >>
>> > > >>
>> > > >>
>> > >
>> >
>> https://github.com/apache/ozone-site/blob/master/.github/workflows/regenerate.yml
>> > > >>
>> > > >>> .
>> > > >>
>> > > >> From there, existing ASF services read the .asf.yaml
>> > > >>
>> > > >> <https://github.com/apache/ozone-site/blob/asf-site/.asf.yaml>
>> file
>> > in

Re: Legal compliance for used cryptography - exploration/design phase

2024-02-26 Thread Ethan Rose
I saw this message in my inbox. The issue may be that it did not come from
an @apache email address, while I have a contact with a similar name and an
@apache address in my gmail contacts. In this case gmail may flag it as
phishing. I also have multiple sender addresses configured and sometimes
forget to switch to my @apache email before sending things out, which may
result in my messages being flagged as spam for others.

On Thu, Feb 22, 2024 at 8:47 PM Wei-Chiu Chuang  wrote:

> Hi Pifta, I don't know what's going on, but most of your email threads went
> into my spam folder. I wonder if it's happening to other folks.
>
> On Thu, Feb 1, 2024 at 2:28 AM István Fajth  wrote:
>
> > Hi developers!
> >
> > I have filed https://issues.apache.org/jira/browse/HDDS-10234 to track
> > the efforts required to make Ozone compliant with certain
> > cryptography-related legislation that different governments dictate as a
> > minimum requirement for using Ozone within an environment where certain
> > security requirements are enforced by these laws.
> >
> > I am aware of 3 jurisdictions that have, or are forming, such legislation:
> > the US and Canada have the Federal Information Processing Standard and the
> > Federal Information Security Management Act; there is China's Cryptography
> > Law; and the European Union is also preparing legislation on
> > cryptography-related rules.
> > Besides all of this legislation, there is also an international standard
> > related to the application of cryptography, defined under ISO/IEC 19970;
> > unfortunately I do not have access to this standard as it is behind a
> > paywall.
> >
> > I am happy to have any insight and would like to open a discussion soon
> by
> > posting a design doc on suggested changes to make it easy to have Ozone
> > running in an environment where FIPS/FISMA compliance is enforced by
> law. I
> > would especially be glad to have input on those parts of the design that
> > are relevant and should expect some specifics when it comes to compliance
> > with other jurisdictions, but of course any other feedback I accept
> gladly.
> >
> > I will send a notification in this thread once the design doc is up.
> > Until then, some preliminary details and background are available in the
> > JIRA linked at the beginning of this e-mail and in its related JIRAs.
> >
> > Thank you!
> > Pifta
> >
>


Re: [VOTE] Change the default branch for ozone-site from asf-site to master

2024-02-29 Thread Ethan Rose
INFRA-25530 is resolved and the default branch for apache/ozone-site is now
master. The PR template is merged there as well and is now effective for
pull requests across the repo.

Thanks all for your participation,

Ethan

On Thu, Feb 22, 2024 at 1:49 PM Ethan Rose  wrote:

> Follow https://issues.apache.org/jira/browse/INFRA-25530 if you would
> like more updates on this change.
>
> On Thu, Feb 22, 2024 at 1:21 PM Ethan Rose  wrote:
>
>> Thanks everyone for voting. After running for 2 weeks the vote has passed
>> with:
>> 13 +1s (including 7 binding PMC +1s)
>> No -1s
>> No 0s
>>
>> I will create an infra ticket to change the branch and provide updates on
>> this thread.
>>
>> Ethan
>>
>> On Thu, Feb 22, 2024 at 9:37 AM Sadanand Shenoy 
>> wrote:
>>
>>> +1
>>>
>>> Thanks,
>>> Sadanand
>>>
>>> On Thu, Feb 22, 2024 at 10:34 PM swaminathan balachandran <
>>> swamirishi...@gmail.com> wrote:
>>>
>>> > +1
>>> > Thanks for explaining the problem.
>>> >
>>> > On Thu, Feb 22, 2024 at 1:17 AM Nandakumar Vadivelu
>>> >  wrote:
>>> >
>>> > > +1
>>> > > Thanks for the detailed description Ethan.
>>> > >
>>> > > > On 21-Feb-2024, at 10:13 PM, Arpit Agarwal
>>> > 
>>> > > wrote:
>>> > > >
>>> > > > +1
>>> > > >
>>> > > > Thanks for the well-written description Ethan. I missed this thread
>>> > > earlier.
>>> > > >
>>> > > > On Feb 20, 2024 at 10:21:38 PM, Dinesh Chitlangia <
>>> dine...@apache.org>
>>> > > > wrote:
>>> > > >
>>> > > >> +1
>>> > > >>
>>> > > >> Thanks,
>>> > > >> Dinesh
>>> > > >>
>>> > > >> On Thu, Feb 8, 2024 at 7:14 PM Ethan Rose 
>>> wrote:
>>> > > >>
>>> > > >> Hi Ozone devs,
>>> > > >>
>>> > > >>
>>> > > >> I’d like to start a vote thread to change the default branch in
>>> the
>>> > > >>
>>> > > >> apache/ozone-site <https://github.com/apache/ozone-site> repo
>>> from
>>> > > >>
>>> > > >> asf-site
>>> > > >>
>>> > > >> to master. Changing the default branch requires an Infra ticket
>>> and
>>> > > mailing
>>> > > >>
>>> > > >> thread according to the asfyaml README
>>> > > >>
>>> > > >> <
>>> > > >>
>>> > > >>
>>> > > >>
>>> > >
>>> >
>>> https://github.com/apache/infrastructure-asfyaml/blob/main/README.md#default-branch
>>> > > >>
>>> > > >>> .
>>> > > >>
>>> > > >> I’ll start with some questions you may have when deciding to vote:
>>> > > >>
>>> > > >>
>>> > > >> *Does this have anything to do with the new website development
>>> that
>>> > is
>>> > > >>
>>> > > >> happening on the feature branch HDDS-9225-website-v2
>>> > > >>
>>> > > >> <https://github.com/apache/ozone-site/tree/HDDS-9225-website-v2
>>> >?*
>>> > > >>
>>> > > >>
>>> > > >> No, this has nothing to do with the new website. The change would
>>> be
>>> > > >>
>>> > > >> effective for the existing website only since it concerns the
>>> asf-site
>>> > > and
>>> > > >>
>>> > > >> master branches, neither of which the new website uses right now.
>>> > > >>
>>> > > >>
>>> > > >> *What is the difference between asf-site and master?*
>>> > > >>
>>> > > >>
>>> > > >> The master branch contains the code that we modify and commit to
>>> > change
>>> > > the
>>> > > >>
>>> > > >> website. The asf-site branch contains the already built website.
>>> The
>>> > > >>
>>> > > 

Re: Improving the Apache Ozone Website and Documentation

2024-02-29 Thread Ethan Rose
It’s been about a month since my last website status update. In the time
I’ve found to work on this, most of my focus has been on providing a smooth
developer experience to better facilitate contributions, with some minor
housekeeping items on the side. Here’s what has happened since then:

   -

   *We have the first docs page on the new website:*
   
https://ozone-site-v2.staged.apache.org/docs/developer-guide/project/release-guide
   Thanks Attila for helping to port this over from Confluence. Check this
   out to get an idea of what pages on the new website will look like when
   rendered and some of the features Docusaurus provides to make pages easier
   to read.
   -

   *A CI pipeline for the website is in progress.*
   I have a POC on my fork that is being ported over to the main site in
   smaller PRs. This includes spell checking, markdown format checks, license
   header checks, building and running the site, and more. See HDDS-9601
   <https://issues.apache.org/jira/browse/HDDS-9601> for an update on the
   progress. Once HDDS-10254
   <https://issues.apache.org/jira/browse/HDDS-10254>, HDDS-9866
   <https://issues.apache.org/jira/browse/HDDS-9866>, HDDS-9567
   <https://issues.apache.org/jira/browse/HDDS-9567>, and HDDS-10351
   <https://issues.apache.org/jira/browse/HDDS-10351> are completed, we
   should have solid checks to maintain website format across the anticipated
   hundreds of pages. Thanks Attila for reviewing these changes!
   *HDDS-9569 <https://issues.apache.org/jira/browse/HDDS-9569>, HDDS-9570
   <https://issues.apache.org/jira/browse/HDDS-9570>, and HDDS-10354
   <https://issues.apache.org/jira/browse/HDDS-10354> are nice to have, but
   not required. I am not planning on taking them up right now, but if there
   are volunteers interested, the currently committed CI pipeline is complete
   enough to plug them in pretty easily.*
   -

   *Social media preview is fixed*
   I fixed some bugs related to the social media preview on the site and
   our new social card, tagline, and page titles should render when a link to
   the staging site is shared on social media or messaging apps. Try sending
   https://ozone-site-v2.staged.apache.org/ to yourself on slack to see how
   it looks.
   -

   *Repo-wide default branch change*
   There was a recent thread on changing the default branch of the website
   repo from asf-site to master. That work has completed, and has enabled a PR
   template to be provided for all new PRs across the repo, including those to
   the website feature branch. No other impact is expected to the new or
   existing website.

With the finish line in sight for the website framework (HDDS-9538
<https://issues.apache.org/jira/browse/HDDS-9538>) and GitHub integration (
HDDS-9601 <https://issues.apache.org/jira/browse/HDDS-9601>), I would like
to call out some risk areas I’m seeing moving into the next phase of this
project:

   -

   Up to date document translations
   Maintaining accurate translations is a challenge for any website.
   Docusaurus has support for i18n, but that only gives us the ability to add
   translations. The burden of ensuring they remain current and accurate is
   still on us. Crowdin <https://crowdin.com/> may be an option to help us
   here, but it seems Pulsar already tried that with poor results
   <https://github.com/apache/pulsar/discussions/17810>. If anyone has
   ideas for tracking translation efforts please submit them.
   -

   Professional looking homepage
   This is the only visual part of the website where Docusaurus does not
   provide us much help. I’m not sure we have any current Ozone contributors
   who are very familiar with design to help us create a homepage, but Will
   Xiao and I are trying our best in PR 65
   <https://github.com/apache/ozone-site/pull/65>. Please submit your
   design ideas if you have them. It's always good to have options.
   -

   Docs content itself
   The progress script
   <https://github.com/apache/ozone-site/blob/HDDS-9225-website-v2/progress.sh>
   tells us that there are 199 incomplete pages stubbed out on the website
   currently. I estimate this number will grow to between 250-300 as pages are
   added and split into smaller ones. We will need some more experienced
   members of the community to help with some particularly challenging
   sections like bare metal secure installation, but there are plenty of more
   accessible pages on the website to fill in as well.

Overall I feel the project is steadily progressing and thank you all for
your support.

Ethan

On Wed, Jan 24, 2024 at 5:17 PM Ethan Rose  wrote:

> Hi all, I just wanted to provide some updates on the new website. I will
> try to send out these update emails more regularly in this thread to
> summarize progress.
>
> - We have a staging website deployed at
> https://ozone-site-v2.staged.apache.org. This is also linked from th

Re: [DISCUSS] Ozone 1.4.1 Release

2024-04-08 Thread Ethan Rose
+1 for a 1.4.1 release. Thanks Xi Chen for volunteering as release manager!

Ethan

On Sun, Apr 7, 2024 at 8:38 PM Ivan Andika
 wrote:

>  +1. Since the ETag feature has already landed on the ozone-1.4 branch, I
> also propose to include ETag related fixes / improvements in version 1.4.1:
> - HDDS-9680. Use md5 hash of multipart object part's content as ETag (#5668)
> - HDDS-10395. Fix eTag compatibility issues during MPU (#6235)
> - HDDS-10403. CopyObject should set ETag based on the key content (#6251)
> - HDDS-10521. ETag field should not be returned during GetObject if the key
>   does not contain ETag field (#6377)
> - HDDS-10587. Reset the thread-local MessageDigest instance during exception
>   (#6435)
> All these fixes have been backported to our internal cluster and currently
> they are working well.
> On Monday, April 8, 2024 at 06:52:43 AM GMT+8, Tsz Wo Sze <
> szets...@gmail.com> wrote:
>
>  +1  It is good to release Ozone 1.4.1.  Let's also update the Ratis
> version.
>
> Tsz-Wo
>
> On Sun, Apr 7, 2024 at 4:28 AM wanghongbing 
> wrote:
>
> > +1, Releasing 1.4.1 is meaningful.
> >
> >
> > > On Apr 3, 2024, at 13:15, Sammi Chen  wrote:
> > >
> > > Dear Ozone Devs,
> > >
> > > We have released 1.4.0 on Jan 19th.
> > > Now there are 51 new commits already landed on 1.4.0 branch
> > >
> >
> https://issues.apache.org/jira/issues/?jql=project%20%3D%20HDDS%20AND%20fixVersion%20%3D%201.4.1
> > >
> > > and 21 targeting 1.4.1, but not landed on the branch yet,
> > >
> >
> https://issues.apache.org/jira/issues/?jql=project%20%3D%20HDDS%20AND%20%22Target%20Version%2Fs%22%20%3D%201.4.1%20AND%20fixVersion%20!%3D%201.4.1%20ORDER%20BY%20resolved%20ASC
> > >
> > > I would like to propose starting the 1.4.1 release now.  What do you
> > think?
> > >
> > > And Xi Chen has volunteered to be the RM of 1.4.1.
> > >
> > > Regards,
> > > Sammi
> >
> >
>


Re: [DISCUSS] Ozone 1.4.1 Release

2024-05-08 Thread Ethan Rose
Doing the next release from master makes sense to me given the increased
focus on bug fixes and stability. I don't think I have a preference on
calling it 1.4.1 or 1.5.0. It might be strange to have an ozone-1.4 branch
and then an ozone-1.4.1 branch that is not based off it, but that's not a
big deal. If 1.4.1 makes more sense from a semver perspective with what is
being delivered then let's use that label.

Ethan

On Tue, May 7, 2024 at 10:38 AM Attila Doroszlai 
wrote:

> Hi Ozone developers,
>
> I would like to propose preparing the next release based on master
> instead of the ozone-1.4 branch.
>
> Unlike with previous releases, we started regularly backporting fixes
> to ozone-1.4 soon after Ozone 1.4.0 was released.  Currently 99
> commits are present on the branch on top of 1.4.0.  There are 11
> further issues targeted at 1.4.1 but not yet backported.
>
> However, there are 526 additional commits on master, not (yet)
> targeted at 1.4.1.
> (excludes 3 reverts, 2 addendums and 2 post-release commits)
>
> There were ~60 dependency version updates and ~165 changes specific to
> tests and CI.  We might consider skipping these, but that would make
> backporting the rest much more difficult due to conflicts.
>
> The majority of the remaining ~300 commits are fixes/improvements
> important at least to some users/developers (e.g. the ones who
> reported and/or worked on the issues).
>
> No big features have been introduced since 1.4.0, so I would expect a
> release from the master branch to be stable.  Such features are being
> developed on separate branches.  This also makes it possible to still
> call the next release 1.4.1.
>
> Some changes that may need special consideration (e.g. revert from the
> release branch or additional testing/validation):
> - HDDS-8113. Remove Hadoop 2.7 compatibility hack
> - HDDS-815. Rename HDDS config keys prefixed with dfs
>   + HDDS-10331. Rename Java constants of ex-DFS config keys
> - HDDS-7791. Support key ownership
> - HDDS-9648. Create API to fetch info about a single datanode
> - HDDS-9343. Shift sortDatanodes logic to OM (7 sub-tasks, 2 still to be
> done)
> - HDDS-10538. Replace GSON with Jackson (6 sub-tasks, 2 still to be done)
>
> Please let me know your thoughts.
>
> thanks,
> Attila
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@ozone.apache.org
> For additional commands, e-mail: dev-h...@ozone.apache.org
>
>


Re: [VOTE] Merge HDDS-10656-atomic-key-overwrite into master

2024-07-09 Thread Ethan Rose
Hi all, after reviewing the proposal I noticed this in the documentation
section:
>
> The new API is for developers to build upon.  Not intended for end-users
> or administrators.
>

But there is a CLI under 'ozone sh' which looks like the type of thing users
and admins can and will use. The CLI addition LGTM but I think we should add
this to the docs. Additionally the help message for the new "rewrite" command
should probably specify something about atomicity of the rewrite, similar to
the javadoc on OzoneBucket#rewriteKey.

We can add this easily as a follow up now that the merge is done.
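
As a rough illustration of what a help message about "atomicity of the
rewrite" might need to convey, here is a minimal Python sketch. It assumes the
rewrite is conditional on the key not having changed since it was read; that
assumption, and every name in it (ToyBucket, KeyChangedError, the generation
counter), is hypothetical and is not the OzoneBucket API or the CLI's
behavior.

```python
import threading


class KeyChangedError(Exception):
    """The key was modified after the caller last read it."""


class ToyBucket:
    """Hypothetical in-memory bucket; not the Ozone client."""

    def __init__(self):
        self._lock = threading.Lock()
        self._keys = {}  # key name -> (generation, data)

    def put(self, name, data):
        # Unconditional write: always succeeds and bumps the generation.
        with self._lock:
            generation = self._keys.get(name, (0, b""))[0] + 1
            self._keys[name] = (generation, data)
            return generation

    def get(self, name):
        with self._lock:
            return self._keys[name]

    def rewrite(self, name, expected_generation, data):
        # Conditional write: the check and the update happen under one lock,
        # so no other writer can slip in between them.
        with self._lock:
            generation, _ = self._keys[name]
            if generation != expected_generation:
                raise KeyChangedError(
                    f"{name}: expected generation {expected_generation}, "
                    f"found {generation}")
            self._keys[name] = (generation + 1, data)
            return generation + 1


bucket = ToyBucket()
seen = bucket.put("key1", b"v1")   # reader observes this generation
bucket.put("key1", b"v2")          # a concurrent writer changes the key
try:
    bucket.rewrite("key1", seen, b"stale rewrite")
    stale_rewrite_applied = True
except KeyChangedError:
    stale_rewrite_applied = False
```

In this toy model the stale rewrite is rejected and the concurrent writer's
data stays in place, which is the kind of guarantee the help text could
spell out for users.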

Ethan

On Tue, Jul 9, 2024 at 5:29 AM Attila Doroszlai 
wrote:

> > I would like to propose merging into master the feature branch
> > HDDS-10656-atomic-key-overwrite, which was used to develop Atomic Key
> > Overwrite.
>
> Vote passed after 2 weeks with 3 binding +1s (Ayush, Stephen, Tsz-Wo),
> 6 other +1s, no other votes.
>
> Thanks everyone for voting.
>
> -Attila
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@ozone.apache.org
> For additional commands, e-mail: dev-h...@ozone.apache.org
>
>


Re: [VOTE] Merge HDDS-7593 (HSync and lease recovery) into master

2024-07-25 Thread Ethan Rose
Thanks for all the work on this. Looks good overall, just a few
questions on compatibility:

It looks like this line in the merge checklist was not updated. Can you go
into more details about the OM compatibility for lease recovery or other
operations?

> A new OM version number was introduced to prevent new client sending
> atomic key overwrite request to old OM which does not support this feature.


Additionally, a new DataNode layout version "HBASE_SUPPORT" was added.

Can you add some details about what changes on the disk layout after this
feature is finalized? Is this related to the incremental chunk list or
something more? Perhaps a more descriptive layout feature name would help
here as well.

Ethan

On Wed, Jul 24, 2024 at 11:43 PM Wei-Chiu Chuang  wrote:

> I am +1 (binding)
> On Tue, Jul 23, 2024 at 4:05 AM Ashish kumar 
> wrote:
>
> > Hi Ozone developers,
> >
> > I would like to propose merging HDDS-7593 (HSync and lease recovery)
> > feature branch into master.
> >
> > This feature is to support HSync and lease recovery,
> > which enables HBase to run on Ozone.
> > More details about the feature are present in design documents attached
> > in the below mentioned Ozone confluence page link.
> >
> >
> > Checklist for feature branch merge:
> >
> >
> https://cwiki.apache.org/confluence/display/OZONE/Supporting+HSync+and+lease+recovery+-+HDDS-7593
> >
> > Feature Jira Link:
> > https://issues.apache.org/jira/browse/HDDS-7593
> >
> > This vote will be open for at least a week.
> >
> > Thanks,
> > Ashish Kumar
> >
>


Re: [VOTE] Merge HDDS-7593 (HSync and lease recovery) into master

2024-07-29 Thread Ethan Rose
Hi Ashish, thanks for the updates.

Regarding this:

> In the old design, lease recovery was dependent only on the client and OM,
> but now it involves datanode as well.

I'm not quite following because the master branch already has an OM layout
feature called "hsync" that is finalized for any deployments of master
since May of last year when initial hsync development was happening on
master. On the branch I don't see any new versions on the OM or datanode
related to lease recovery.

FYI messages from your gmail address were going to my spam folder; this may
have happened for others as well. Since you are a committer, can you use your
apache email to send messages?

On Mon, Jul 29, 2024 at 4:05 AM Ashish kumar 
wrote:

> Thanks Ethan for looking into this.
>
> >>It looks like this line in the merge checklist was not updated.
> Updated the checklist.
>
> >> Can you go
> into more details about the OM compatibility for lease recovery or other
> operations?
> Lease recovery is completely redesigned, so both client and server need
> to be upgraded to make it work correctly.
> In the old design, lease recovery was dependent only on the client and OM,
> but now it involves datanode as well.
> Apart from this, compatibility is related to the "Incremental chunk list",
> which is already taken care of.
>
> >> DataNode layout version "HBASE_SUPPORT"
> Yes this is only related to incremental chunk list. We will update with a
> more meaningful name as HBASE_INCREMENTAL_CHUNK_SUPPORT.
> Also will update the merge checklist in more detail about this.
>
> Thanks,
> Ashish
>
> On Fri, Jul 26, 2024 at 12:50 AM Ethan Rose  wrote:
>
> > Thanks for all the work on this. Looks good overall, just a few
> > questions on compatibility:
> >
> > It looks like this line in the merge checklist was not updated. Can you
> go
> > into more details about the OM compatibility for lease recovery or other
> > operations?
> >
> > > A new OM version number was introduced to prevent new client sending
> > > atomic key overwrite request to old OM which does not support this
> > feature.
> >
> >
> > Additionally, new DataNode layout version "HBASE_SUPPORT" was added.
> >
> > Can you add some details about what changes on the disk layout after this
> > feature is finalized? Is this related to the incremental chunk list or
> > something more? Perhaps a more descriptive layout feature name would help
> > here as well.
> >
> > Ethan
> >
> > On Wed, Jul 24, 2024 at 11:43 PM Wei-Chiu Chuang 
> > wrote:
> >
> > > I am +1 (binding)
> > > On Tue, Jul 23, 2024 at 4:05 AM Ashish kumar 
> > > wrote:
> > >
> > > > Hi Ozone developers,
> > > >
> > > > I would like to propose merging HDDS-7593 (HSync and lease recovery)
> > > > feature branch into master.
> > > >
> > > > This feature is to support HSync and lease recovery,
> > > > which enables HBase to run on Ozone.
> > > > More details about the feature are present in design documents
> attached
> > > > in the below mentioned Ozone confluence page link.
> > > >
> > > >
> > > > Checklist for feature branch merge:
> > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/OZONE/Supporting+HSync+and+lease+recovery+-+HDDS-7593
> > > >
> > > > Feature Jira Link:
> > > > https://issues.apache.org/jira/browse/HDDS-7593
> > > >
> > > > This vote will be open for at least a week.
> > > >
> > > > Thanks,
> > > > Ashish Kumar
> > > >
> > >
> >
>


Re: [VOTE] Merge HDDS-7593 (HSync and lease recovery) into master

2024-07-30 Thread Ethan Rose
Hi Wei-Chiu, I'm still unclear on the compatibility requirements of this
feature.

Feature flags are not a substitute for layout versions. Layout versions are
monotonic: they make sure once something is enabled it cannot be disabled
via a downgrade (the downgraded cluster will fail to start). Feature flags
are not monotonic. If an old version defaults to false and a new version
defaults to true, the flag will get reverted on downgrade while data
written while it was true will still be present. Whether or not this causes
problems is dependent on the implementation of the feature.
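
To make the difference concrete, here is a minimal Python sketch. It is a toy
model under stated assumptions, not Ozone's upgrade code: Disk, start,
finalize, flag_enabled, and the config key "feature.enabled" are all
hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


class DowngradeError(RuntimeError):
    """Software is older than the layout version already finalized on disk."""


@dataclass
class Disk:
    layout_version: int = 0          # persisted; only ever moves forward
    has_new_format_data: bool = False


def start(disk: Disk, software_layout_version: int) -> None:
    # Monotonic check: refuse to start on a layout this software does not know.
    if disk.layout_version > software_layout_version:
        raise DowngradeError(
            f"disk is at layout {disk.layout_version}, "
            f"software supports up to {software_layout_version}")


def finalize(disk: Disk, software_layout_version: int) -> None:
    # Finalizing never lowers the persisted version.
    disk.layout_version = max(disk.layout_version, software_layout_version)


def flag_enabled(config: dict, default: bool) -> bool:
    # A feature flag is just configuration; an unset key follows the default
    # of whichever software version is currently running.
    return config.get("feature.enabled", default)


# New software (layout 2, flag default True) finalizes and writes new data.
disk = Disk()
start(disk, software_layout_version=2)
finalize(disk, software_layout_version=2)
if flag_enabled({}, default=True):
    disk.has_new_format_data = True

# Downgrade to old software (layout 1, flag default False).
flag_after_downgrade = flag_enabled({}, default=False)
try:
    start(disk, software_layout_version=1)
    downgrade_blocked = False
except DowngradeError:
    downgrade_blocked = True
```

With only the flag, the old software would start with the feature switched
off while the new-format data is still on disk. The layout version check is
what turns that situation into a startup failure instead.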

Can you clarify what disk and protocol changes are part of this feature on
OM and Datanodes? Then we can work out what is actually required for
compatibility.

On Tue, Jul 30, 2024 at 11:43 AM Wei-Chiu Chuang  wrote:

> hi Ethan,
>
> The OM layout version change was for rejecting hsync requests to OM until
> the upgrade completes.
> The "HSYNC" layout version was shipped in Ozone 1.4.0.
> IIRC I did touch upon this in the release vote thread, that because there's
> a feature flag ozone.fs.hsync.enabled which disables the feature entirely,
> the HSYNC layout version is essentially a no-op. We enabled the feature
> flag in the feature branch. If this is a concern we can make it disabled by
> default again.
>
> On Mon, Jul 29, 2024 at 1:05 AM Ashish kumar 
> wrote:
>
> > Thanks Ethan for looking into this.
> >
> > >>It looks like this line in the merge checklist was not updated.
> > Updated the checklist.
> >
> > >> Can you go
> > into more details about the OM compatibility for lease recovery or other
> > operations?
> > Lease recovery is completely redesigned, so both client and server need
> > to be upgraded for it to work correctly.
> > In the old design, lease recovery depended only on the client and OM,
> > but now it involves the datanode as well.
> > Apart from this, compatibility is related to the "Incremental chunk list",
> > which is already taken care of.
> >
> > >> DataNode layout version "HBASE_SUPPORT"
> > Yes this is only related to incremental chunk list. We will update with a
> > more meaningful name as HBASE_INCREMENTAL_CHUNK_SUPPORT.
> > Also will update the merge checklist in more detail about this.
> >
> > Thanks,
> > Ashish
> >
> > On Fri, Jul 26, 2024 at 12:50 AM Ethan Rose  wrote:
> >
> > > Thanks for all the work on this. Looks good overall, just a few
> > > questions on compatibility:
> > >
> > > It looks like this line in the merge checklist was not updated. Can you
> > go
> > > into more details about the OM compatibility for lease recovery or
> other
> > > operations?
> > >
> > > > A new OM version number was introduced to prevent new client sending
> > > > atomic key overwrite request to old OM which does not support this
> > > feature.
> > >
> > >
> > > Additionally, new DataNode layout version "HBASE_SUPPORT" was added.
> > >
> > > Can you add some details about what changes on the disk layout after
> this
> > > feature is finalized? Is this related to the incremental chunk list or
> > > something more? Perhaps a more descriptive layout feature name would
> help
> > > here as well.
> > >
> > > Ethan
> > >
> > > On Wed, Jul 24, 2024 at 11:43 PM Wei-Chiu Chuang 
> > > wrote:
> > >
> > > > I am +1 (binding)
> > > > On Tue, Jul 23, 2024 at 4:05 AM Ashish kumar <
> ashis.kr.2...@gmail.com>
> > > > wrote:
> > > >
> > > > > Hi Ozone developers,
> > > > >
> > > > > I would like to propose merging HDDS-7593 (HSync and lease
> recovery)
> > > > > feature branch into master.
> > > > >
> > > > > This feature is to support HSync and lease recovery,
> > > > > which enables HBase to run on Ozone.
> > > > > More details about the feature are present in design documents
> > attached
> > > > > in the below mentioned Ozone confluence page link.
> > > > >
> > > > >
> > > > > Checklist for feature branch merge:
> > > > >
> > > > >
> > > > > > https://cwiki.apache.org/confluence/display/OZONE/Supporting+HSync+and+lease+recovery+-+HDDS-7593
> > > > >
> > > > > Feature Jira Link:
> > > > > https://issues.apache.org/jira/browse/HDDS-7593
> > > > >
> > > > > This vote will be open for at least a week.
> > > > >
> > > > > Thanks,
> > > > > Ashish Kumar
> > > > >
> > > >
> > >
> >
>


Re: [VOTE] Merge HDDS-7593 (HSync and lease recovery) into master

2024-07-30 Thread Ethan Rose
Thanks Wei-Chiu, I'm ok with handling these in follow up tasks. Please
share the Jira links when you have them. This was my main concern, so the
merge is +1 from me now.

Ethan

On Tue, Jul 30, 2024 at 5:34 PM Wei-Chiu Chuang  wrote:

> I had an offline discussion with Ethan regarding a couple of potential
> issues:
>
> 1. *New Client with ozone.fs.hsync.enabled*: If a new client with
>    ozone.fs.hsync.enabled sends an hsync request to an old OM, the request
>    is sent as a KeyCommit request with additional fields for hsync. The old
>    OM, not recognizing these fields, would treat the request as a normal
>    KeyCommit (i.e., closing the key). The outcome of this is uncertain.
> 2. *Downgrade Scenario*: If a cluster is upgraded from Ozone 1.4 to 1.5 and
>    the feature flag is enabled, Ozone would allow hsync operations before
>    the upgrade is finalized. This is problematic because version 1.4 does
>    not fully support this feature.
>
> Both issues can be addressed with new layout versions. I will open JIRA
> tickets to include these fixes in the next release, version 1.5.0.
>
> I hope this explanation makes sense.
>
> On Tue, Jul 30, 2024 at 10:20 AM Ethan Rose  wrote:
>
> > Hi Wei-Chiu, I'm still unclear on the compatibility requirements of this
> > feature.
> >
> > Feature flags are not a substitute for layout versions. Layout versions
> are
> > monotonic: they make sure once something is enabled it cannot be disabled
> > via a downgrade (the downgraded cluster will fail to start). Feature
> flags
> > are not monotonic. If an old version defaults to false and a new version
> > defaults to true, the flag will get reverted on downgrade while data
> > written while it was true will still be present. Whether or not this
> causes
> > problems is dependent on the implementation of the feature.
> >
> > Can you clarify what disk and protocol changes are part of this feature
> on
> > OM and Datanodes? Then we can work out what is actually required for
> > compatibility.
> >
> > On Tue, Jul 30, 2024 at 11:43 AM Wei-Chiu Chuang 
> > wrote:
> >
> > > hi Ethan,
> > >
> > > The OM layout version change was for rejecting hsync requests to OM
> until
> > > the upgrade completes.
> > > The "HSYNC" layout version was shipped in Ozone 1.4.0.
> > > IIRC I did touch upon this in the release vote thread, that because
> > there's
> > > a feature flag ozone.fs.hsync.enabled which disables the feature
> > entirely,
> > > the HSYNC layout version is essentially a no-op. We enabled the feature
> > > flag in the feature branch. If this is a concern we can make it
> disabled
> > by
> > > default again.
> > >
> > > On Mon, Jul 29, 2024 at 1:05 AM Ashish kumar 
> > > wrote:
> > >
> > > > Thanks Ethan for looking into this.
> > > >
> > > > >>It looks like this line in the merge checklist was not updated.
> > > > Updated the checklist.
> > > >
> > > > >> Can you go
> > > > into more details about the OM compatibility for lease recovery or
> > other
> > > > operations?
> > > > Lease recovery is completely redesigned, so both client and server need
> > > > to be upgraded for it to work correctly.
> > > > In the old design, lease recovery depended only on the client and OM,
> > > > but now it involves the datanode as well.
> > > > Apart from this, compatibility is related to the "Incremental chunk list",
> > > > which is already taken care of.
> > > >
> > > > >> DataNode layout version "HBASE_SUPPORT"
> > > > Yes this is only related to incremental chunk list. We will update
> > with a
> > > > more meaningful name as HBASE_INCREMENTAL_CHUNK_SUPPORT.
> > > > Also will update the merge checklist in more detail about this.
> > > >
> > > > Thanks,
> > > > Ashish
> > > >
> > > > On Fri, Jul 26, 2024 at 12:50 AM Ethan Rose 
> wrote:
> > > >
> > > > > Thanks for all the work on this. Looks good overall, just a few
> > > > > questions on compatibility:
> > > > >
> > > > > It looks like this line in the merge checklist was not updated. Can
> > you
> > > > go
> > > > > into more details about the OM compatibility for lease recovery or
> > > other
>

Re: [VOTE] Apache Ozone 1.4.1 RC0

2024-08-06 Thread Ethan Rose
Hi Xi Chen, thanks for working on this release! I just have a few questions
about the changes.
- The RC has a minor Ratis version bump (3.0.1 to 3.1.0) in a patch Ozone
release. This is unusual. What is the reason for this? Any concerns?
- I see proto lock changes in this patch release, which is also unusual.
Can you elaborate on what fixes were brought in that made these changes?

Thanks
Ethan

On Tue, Aug 6, 2024 at 4:45 AM mrchenx  wrote:

> Dear Ozone Devs,
>
> We released 1.4.0 on Jan 19th. Since then, 169 new commits have landed on
> the 1.4.1 branch, including a Ratis upgrade (to Ratis 3.1.0), some bug
> fixes, performance optimizations, and some necessary dependencies.
> I am calling for a vote on Apache Ozone 1.4.1 RC0.
>
> - The RC0 tag can be found on Github at:
>   https://github.com/apache/ozone/releases/tag/ozone-1.4.1-RC0
> - 169 Jiras were cherry-picked for ozone-1.4.1:
>   https://issues.apache.org/jira/issues/?jql=project%20%3D%20HDDS%20AND%20fixVersion%20%3D%201.4.1
> - The source and binary tarballs can be found at:
>   https://dist.apache.org/repos/dist/dev/ozone/1.4.1-rc0/
> - Maven artifacts are staged at:
>   https://repository.apache.org/content/repositories/orgapacheozone-1023/
> - The public key used to sign the artifacts can be found at:
>   https://dist.apache.org/repos/dist/dev/ozone/KEYS
> - The fingerprint of the key used to sign the artifacts is:
>   0D8C19F5514E2786007936F758C87003FF9A1A38
>
> The vote will run for 7 days, ending on Aug 13th 2024 at 16:45 UTC+8.
>
> Thanks
>
> Xi Chen


Re: [VOTE] Apache Ozone 1.4.1 RC0

2024-08-12 Thread Ethan Rose
Hi Xi Chen, sorry for not getting back to this earlier.

The proto changes going out in the patch release are ok this time since
there has not been another major release yet. In general this will not
always be the case which is why backporting proto changes should be
avoided. For example, if 1.5.0 went out with one set of changes, and we later
released 1.4.2 and 1.5.1 with a patch containing proto changes that
were not in 1.5.0, then an upgrade from 1.4.2 to 1.5.0 may no longer be compatible.
It could be seen as a “downgrade” in the proto spec that removes fields,
even though it is an Ozone version upgrade. In this specific case it should
be fine though, we just need to copy the new lockfiles back to master after
the release.
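
A small Python sketch of that scenario, assuming a proto message can be modeled as a set of field names. The field names are made up for illustration and do not come from the actual Ozone lockfiles.

```python
# Hypothetical field sets of one proto message in each release.
FIELDS_1_5_0 = {"keyName", "dataSize"}
# 1.4.2 ships later than 1.5.0 and carries a backported proto change.
FIELDS_1_4_2 = FIELDS_1_5_0 | {"backportedField"}


def fields_lost_on_upgrade(old_fields, new_fields):
    """Return fields the old release knows about that the new release lacks."""
    return sorted(old_fields - new_fields)


# Upgrading 1.4.2 -> 1.5.0 is a version upgrade but a proto "downgrade".
lost = fields_lost_on_upgrade(FIELDS_1_4_2, FIELDS_1_5_0)

# If master carries the same lockfiles as the patch release, nothing is lost.
FIELDS_MASTER = FIELDS_1_4_2 | {"futureField"}
lost_to_master = fields_lost_on_upgrade(FIELDS_1_4_2, FIELDS_MASTER)
```

Here `lost` is non-empty for the 1.4.2 to 1.5.0 path, which is the incompatibility described above, while copying the lockfiles back to master keeps `lost_to_master` empty.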

The GPG key has not been officially published to the release area, it is
still in dev. We need it in release before this release goes out so people
can verify the signatures from the official location. Yiyang’s key (the
previous release manager) is in the release KEYS file but not in the dev
KEYS file, so moving the current dev file to release would erase it. If you
add your key on top of the latest KEYS file in release as specified in the
release guide and push that to dev, I can do the final move to get it to
release. The release guide
does actually say to share the dev keys file in the vote mail which is a
mistake. I can update this as well. This doesn’t affect the content of the
current RC.

I think the current branch structure in GitHub is not quite right. There is
an ozone-1.4 branch and an ozone-1.4.1 branch, but they both point
to the same commit. Since patch releases should just be cherry picks, we
shouldn’t need a branch for ozone-1.4.1 and it could actually cause future
patch releases on the 1.4 line to diverge if cherry picks are done there
but not to ozone-1.4. The branch name will also conflict with the final
ozone-1.4.1 tag that will be applied to the release commit and cause
confusion. I suggest deleting the ozone-1.4.1 branch. This also does not
affect the current RC.

The ozone-1.4.1-RC0 tag points to a commit that is
two commits ahead of the ozone-1.4 and ozone-1.4.1 branches. This
gives the message: “This commit does not belong to any branch on this
repository, and may belong to a fork outside of the repository” when the
hash of the release is viewed in GitHub. Each patch release tag and RC
should point to the last commit on the ozone-1.4 branch at that time, so
for the next patch release on this line, more commits can be added to the
branch but the tag would remain in place. To fix this, we can fast-forward
the ozone-1.4 branch to include these commits, which should not change the
hash of the current RC.
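
The branch and tag relationship can be modeled with a short Python sketch. It uses a toy commit graph with hypothetical commit names rather than the real repository, and only shows why a fast-forward makes the tagged commit reachable from the branch without changing its hash.

```python
def is_ancestor(parents, commit, tip):
    """Return True if `commit` is reachable from `tip` via parent links."""
    seen, stack = set(), [tip]
    while stack:
        node = stack.pop()
        if node == commit:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(parents.get(node, []))
    return False


# Toy linear history: c1 <- c2 <- c3 <- c4 (child mapped to its parents).
parents = {"c2": ["c1"], "c3": ["c2"], "c4": ["c3"]}

rc_tag = "c4"          # the RC tag sits two commits ahead of the branch
branch_tip = "c2"      # where the release branch currently points

# The tagged commit is on no branch, matching the GitHub warning.
tag_on_branch_before = is_ancestor(parents, rc_tag, branch_tip)

# A fast-forward only moves the branch pointer; no commit is rewritten.
branch_tip = "c4"
tag_on_branch_after = is_ancestor(parents, rc_tag, branch_tip)
```

Because the fast-forward moves only the branch pointer, the commit the tag refers to keeps the same identity before and after.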

The RC itself looks good to me:

   - Built from source.
   - Verified signatures against dev keys file (Will check again when we
   have it in the release area)
   - Verified checksums
   - Verified docs are included in the build and work in Chrome.
   - Verified output from ozone version
   - Tested ozone freon ockrw in docker.

I’m +1 on this release candidate, thanks for working on it! I think we just
have a few minor steps to do before shipping it out.

I’m also working on a PR to update the release guide in the new website for
handling regular patch releases. The current docs for that section assume
the patch release is for a CVE or critical bug and do not exactly translate
to a normal maintenance release.

Ethan

On Mon, Aug 12, 2024 at 9:57 AM Xi Chen  wrote:

> Hi Sammi,
>
> thanks for checking.
> For your question:
> Q: Have you pushed the artifacts to the apache Nexus? If so, could you
> share the link?
> A: Yes, I have pushed the artifacts to apache Nexus, the link is provided
> in the email:
> - Maven artifacts are staged at:
> -
> https://repository.apache.org/content/repositories/orgapacheozone-1023/
>
> Thanks,
> Xi Chen
>
> On 2024/08/12 10:11:36 Sammi Chen wrote:
> > Xi,
> >
> > Thanks for leading the effort of a new release. Have you pushed the
> > artifacts to the apache Nexus? If so, could you share the link?
> >
> >
> > Thanks,
> > Sammi
> >
> > On Tue, 6 Aug 2024 at 16:45, mrchenx  wrote:
> >
> > > Dear Ozone Devs,
> > >
> > > We released 1.4.0 on Jan 19th. Since then, 169 new commits have landed
> > > on the 1.4.1 branch, including a Ratis upgrade (to Ratis 3.1.0), some bug
> > > fixes, performance optimizations, and some necessary dependencies.
> > > I am calling for a vote on Apache Ozone 1.4.1 RC0.
> > >
> > > - The RC0 tag can be found on Github at:
> > >   https://github.com/apache/ozone/releases/tag/ozone-1.4.1-RC0
> > > - 169 Jiras were cherry-picked for ozone-1.4.1
> > 

Re: [VOTE] Apache Ozone 1.4.1 RC0

2024-08-13 Thread Ethan Rose
Yes actually we should include HDDS-11040 since it fixes a compatibility
regression in 1.4.0. Thanks Attila!

> What should I do specifically, could we update this part in the document?
>
We need to cherry pick this commit back
to the master branch after the 1.4.1 release goes out to make sure that
future releases from master are compatible with these proto changes. The
current instructions in the document are to do protolock and version update
on master before cutting a minor release branch. Usually we would not have
proto changes in a patch release and the extra copy would not be necessary.

> So we only need a tag of ozone-1.4.1 instead of a branch of 1.4.1, right?
> New branches are only needed for major releases, but not for minor releases.
>
New branches would be for major and minor releases (first two semver
digits) but not patch releases like this one since they are just cherry
picks. This is something that can also be clarified in the doc.


> We can also add relevant steps before "push the release candidate tag to
> github" in our documentation.
>
Yes good point. It looks like if followed exactly the doc will instruct you
to push the tag but not the branch. This may cause them to diverge in the
remote if changes like version bumps were done to the branch locally
instead of with PRs.

> What do I need to do before continuing the ozone-1.4.1 RC0 release process?
>
Yes the steps you listed here look good.

Once the release goes out I'll update the release manager doc with all our
findings and we can review it.

Thanks
Ethan

On Tue, Aug 13, 2024 at 10:05 AM Attila Doroszlai 
wrote:

> Thanks Xi Chen for working on the release.
>
> Can we also include HDDS-11040?
>
> -Attila
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@ozone.apache.org
> For additional commands, e-mail: dev-h...@ozone.apache.org
>
>


Re: [VOTE] Apache Ozone 1.4.1 RC1

2024-08-16 Thread Ethan Rose
+1 (binding), thanks Xi Chen for updating.

- Verified checksums
- Verified signatures
- Built from source
- Checked `ozone version` output
- Checked docs are included in the artifact
- Tested `ozone freon ockrw` in docker
- Branch structure in GitHub LGTM

Ethan

On Fri, Aug 16, 2024 at 1:13 AM mrchenx  wrote:

> Dear Ozone Devs,
>
> As discussed in the last email, I am calling for a vote on Apache Ozone
> 1.4.1 RC1.
> We released 1.4.0 on Jan 19th. Since then, 177 new commits have landed on
> the 1.4.1 branch, including a Ratis upgrade (to Ratis 3.1.0), some bug
> fixes, performance optimizations, and some necessary dependencies.
>
> - The RC1 tag can be found on Github at:
>   https://github.com/apache/ozone/releases/tag/ozone-1.4.1-RC1
> - 177 Jiras were cherry-picked for ozone-1.4.1:
>   https://issues.apache.org/jira/issues/?jql=project%20%3D%20HDDS%20AND%20fixVersion%20%3D%201.4.1
> - The source and binary tarballs can be found at:
>   https://dist.apache.org/repos/dist/dev/ozone/1.4.1-rc1/
> - Maven artifacts are staged at:
>   https://repository.apache.org/content/repositories/orgapacheozone-1024
> - The public key used to sign the artifacts can be found at:
>   https://dist.apache.org/repos/dist/release/ozone/KEYS
> - The fingerprint of the key used to sign the artifacts is:
>   0D8C19F5514E2786007936F758C87003FF9A1A38
>
> The vote will run for 7 days, ending on Aug 23rd 2024 at 13:10 UTC+8.
>
> Thanks
>
> Xi Chen


Re: [DISCUSS] Apache Ozone 1.5.0 release

2024-08-22 Thread Ethan Rose
+1 to start planning for 1.5.0. Our minor releases tend to take some time
so it's good to start early. For any committers looking to gain more
experience, volunteering as release manager is a great way to do that. We
have a release guide to help out, which we keep updating based on feedback
from each release. Past RMs like myself can provide guidance as well.

Ethan

On Thu, Aug 22, 2024 at 3:09 PM Siddharth Wagle  wrote:

> +1 for a .0 release with full HBase support.
>
> Best,
> Sid
>
> On Thu, Aug 22, 2024 at 11:55 AM Wei-Chiu Chuang 
> wrote:
>
> > Hi community,
> >
> > I’d like to initiate a discussion about the 1.5.0 release. There have been
> > significant changes since 1.4.0 (January 2024), with 1,130 commits added to
> > the git history.
> >
> > I think it’s a good time to start planning for the next minor release.
> > Wouldn't it be great if we could announce the release of 1.5.0 at this
> > year's CommunityOverCode conference?
> >
> > Some of the major features I’m excited to see included are:
> > * HBase support HDDS-7593 <https://issues.apache.org/jira/browse/HDDS-7593>
> > * Snapshot support phase 2 HDDS-8544
> > * JDK17 support HDDS-8246 <https://issues.apache.org/jira/browse/HDDS-8246>
> > >
> >
> > Improvements
> > * Recon UI HDDS-11153 
> > * CVE updates
> >
> > I'm sure there's more to look forward to, and that makes me even more
> > excited!
> >
> > Best,
> > Weichiu
> >
>


Re: [DISCUSS] Apache Ozone 2.0 release was: Re: [DISCUSS] Apache Ozone 1.5.0 release

2024-08-28 Thread Ethan Rose
Thanks a lot for driving this Wei-Chiu, I'm +1 for deprecating legacy
buckets and o3fs as well. The release plan will need to specify how these
deprecations are implemented and what the impacts will be. For example,
legacy buckets will likely remain readable and writable but creation of new
ones will be blocked. This will have impacts on clients older than version
1.2.0.

Other ideas to consider for a major release:

   - Removing STAND_ALONE replication type (it should be the same as RATIS
   1): HDDS-6218
   - CLI cleanup such that all subcommands and flags use kebab-case and
   double dash for long options.
      - This is a more involved project that we might not have time for.
      Ideally all current non-compliant commands would still work but remain
      hidden in help messages.
      - We would need an enforcement plan going forward to make sure new
      CLI options remain consistent. CI can get us part of the way there, but
      some sort of manual guideline document would also need to be drafted and
      enforced by reviewers.
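
As a rough idea of what the CI part of that enforcement could look like, here is a minimal Python sketch. The option and subcommand names are hypothetical, and the check only covers long options and subcommand names; how the names would be collected from the real CLI is left open.

```python
import re

# Long options: double dash followed by lowercase kebab-case words.
LONG_OPTION = re.compile(r"^--[a-z0-9]+(-[a-z0-9]+)*$")
# Subcommands: lowercase kebab-case words with no leading dash.
SUBCOMMAND = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")


def non_compliant(names):
    """Return the names that break the kebab-case / double-dash convention."""
    bad = []
    for name in names:
        # Anything starting with a dash is treated as a long option here;
        # single-letter short options would need their own rule.
        pattern = LONG_OPTION if name.startswith("-") else SUBCOMMAND
        if not pattern.match(name):
            bad.append(name)
    return bad


# Hypothetical names, not taken from the actual Ozone CLI.
violations = non_compliant(
    ["--bucket-layout", "-bucketLayout", "create-key", "createKey", "--maxKeys"]
)
```

A CI job could fail when `violations` is non-empty for newly added names, while existing non-compliant names stay on an allow-list so they keep working.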

Regarding proto2 to proto3 migration, Pifta had some ideas in the last
community sync that this may not actually be too complicated and we could
maintain wire compatibility with proto2 clients even if the server is using
proto3. I haven't done any research in this area yet so others can fill in
details if they are aware.

Given this is a major release I would like to see us move towards proper
change logs and release notes in the published announcement instead of the
haiku/picture thing, something I also lamented about in the new website
proposal. We can consider that practice "deprecated" too ; )

A new website debut with 2.0 would be great in theory but unfortunately our
docs contributions remain almost zero despite the framework being live for
over 6 months now, so that seems unlikely. I do have plans to take on the
homepage and community sections soon but going forward I can't write 200+
pages of technical docs on my own:

> $ ./progress.sh
> Total pages: 203
> 
> Complete pages: 1
> 
> Incomplete pages: 202


Ethan

On Wed, Aug 28, 2024 at 11:42 AM Wei-Chiu Chuang  wrote:

> We will keep OBS for sure.
>
> I'm coming up with a release plan. Will share it shortly.
>
> On Tue, Aug 27, 2024 at 11:50 PM Kohei Sugihara 
> wrote:
>
> > Hi, all
> >
> > > I'd also like to propose to deprecate
> > > LEGACY bucket type
> >
> > Can we keep using the OBS bucket type after the deprecation?
> >
> > 2024年8月28日(水) 8:32 Wei-Chiu Chuang :
> >
> > > I'd also like to propose to deprecate
> > >
> > > LEGACY bucket type
> > > O3fs file system
> > > IIRC we're still testing Hadoop 3.1 and 3.2 runtime. Can we drop it
> too?
> > >
> > > Anything else?
> > >
> > > On Sun, Aug 25, 2024 at 11:12 PM Guo Hao 
> wrote:
> > >
> > > > +1 for Ozone 2.0
> > > >
> > > >
> > > > At 2024-08-24 01:48:38, "Tsz Wo Sze"  wrote:
> > > > >+1 for Ozone 2.0
> > > > >
> > > > >> (1) I want to drop Hadoop2 support.
> > > > >
> > > > >It would be great if we can also replace proto 2 with proto 3.
> > However,
> > > > it
> > > > >will take some time for the replacement.  It probably has to wait
> till
> > > > >Ozone 3.0.  Hopefully, it won't be 4 more years.
> > > > >
> > > > >@Wei-Chiu Chuang  , thanks a lot for starting
> the
> > > > >discussion!
> > > > >
> > > > >Tsz-Wo
> > > > >
> > > > >
> > > > >On Thu, Aug 22, 2024 at 7:59 PM Wei-Chiu Chuang  >
> > > > wrote:
> > > > >
> > > > >> There's also Website v2 that's been cooking for quite a while now.
> > > > >> HDDS-9225
> > > > >> 
> > > > >> Good opportunity for optics.
> > > > >>
> > > > >> On Thu, Aug 22, 2024 at 4:19 PM Yi-Sheng Lien  >
> > > > wrote:
> > > > >>
> > > > >> > +1
> > > > >> >
> > > > >> > Wei-Chiu Chuang  於 2024年8月23日 週五
> > > 上午6:31
> > > > >> 寫道:
> > > > >> >
> > > > >> > > Hi,
> > > > >> > > I want to throw out another idea: we should be calling it 2.0
> > > > instead.
> > > > >> > >
> > > > >> > > Reason being:
> > > > >> > > (1) I want to drop Hadoop2 support.
> > > > >> > > (2) Ozone 1.0 was exactly 4 years ago. In many ways, its
> > maturity
> > > > and
> > > > >> > > functionalities have changed so much.
> > > > >> > > (3) The HBase support is a significant change.
> > > > >> > > (4) While Snapshot support was added in 1.4.0, we
> significantly
> > > > >> enhanced
> > > > >> > it
> > > > >> > > in the coming release.
> > > > >> > > (5) JDK 17 support also signifies another great milestone.
> > > > >> > > (6) Recon UI went through a major revamp and looks very
> > different
> > > > now.
> > > > >> > >
> > > > >> > > On Thu, Aug 22, 2024 at 12:44 PM Wei-Chiu Chuang <
> > > > weic...@apache.org>
> > > > >> > > wrote:
> > > > >> > >
> > > > >> > > > Oh actually I forgot to mention:
> > > > >> > > > What about drop

Re: [VOTE] Apache Ozone 1.4.1 RC1

2024-10-22 Thread Ethan Rose
Hi, any updates on the current 1.4.1 progress? Ratis 3.1.1 should be in
Ozone now that HDDS-11504 is resolved. I see there’s discussion of doing a
Ratis 3.1.2 to fix RATIS-2149 and RATIS-2172, but our 1.4.1 release
has already been delayed for a while, so I think we should ship with Ratis
3.1.1 and do a 1.4.2 release with just the patch version of Ratis if
necessary.

I see some new fixes targeting the release like HDDS-11223 and HDDS-11136,
which is good. What is the overall status update? Are we ready for the next
release candidate?


Ethan

On Wed, Aug 21, 2024 at 12:33 PM Tsz Wo Sze  wrote:

> > (2) Key put fails for large files (> 20GB) due to a memory leak in Ratis
> 3.1.0
> ...
>
> Duong & Wei-chiu,
>
> Thanks for finding this problem!
>
> Agree that we should have a Ratis 3.1.1 release.
> BTW, "Memory leak" usually means that memory was allocated but not
> released; see https://en.wikipedia.org/wiki/Memory_leak . In this case, we
> are not having such a problem. Our problem is unnecessarily using too much
> memory.
>
> Tsz-Wo
>
>
> On Tue, Aug 20, 2024 at 6:20 PM Duong Nguyen 
> wrote:
>
> > I also filed https://issues.apache.org/jira/browse/RATIS-2141 to track
> the
> > memory leak issue.
> >
> > Thanks,
> > Duong
> >
> > On Tue, Aug 20, 2024 at 6:17 PM Duong Nguyen  wrote:
> >
> > > Hi all,
> > >
> > > I just started a thread to discuss releasing Ratis 3.1.1 with the fixes
> > of
> > > the mentioned issues.
> > >
> > > Duong
> > >
> > > On Tue, Aug 20, 2024 at 5:30 PM Uma Maheswara Rao Gangumalla <
> > > umaganguma...@gmail.com> wrote:
> > >
> > >> Hi Wei-Chiu,
> > >>
> > >> Thank you and Duong for the important update on RC1.
> > >>
> > >> @Duong would you be notifying this to Ratis community if they can
> make a
> > >> quick release with just above 2 fixes?
> > >>
> > >> Regards,
> > >> Uma
> > >>
> > >>
> > >> On Tue, Aug 20, 2024 at 4:51 PM Wei-Chiu Chuang 
> > >> wrote:
> > >>
> > >>> Hi thanks for the effort,
> > >>> We are testing the latest Ozone master and Ratis 3.1.0 internally,
> and
> > >>> found a few critical issues.
> > >>>
> > >>> (1) RATIS-2132, which has about a 10% performance regression.
> > >>> (2) Key put fails for large files (> 20GB) due to a memory leak in Ratis
> > >>> 3.1.0: it was a half-done feature of RATIS-1931. DataNode could crash due
> > >>> to out of memory.
> > >>>
> > >>> Both of them can only be fixed in Ratis.
> > >>> I'd suggest to not use Ratis 3.1.0 in Ozone 1.4.1 release.
> > >>>
> > >>> If we can, I'd ask the Ratis community to release Ratis 3.1.1 with
> the
> > >>> above two fixes.
> > >>>
> > >>> cc: @Duong Nguyen  who helped root cause the two
> > >>> issues.
> > >>>
> > >>>
> > >>>
> > >>>
> > >>> On Tue, Aug 20, 2024 at 3:31 PM Siyao Meng  wrote:
> > >>>
> > >>> >  +1 (binding)
> > >>> >
> > >>> >
> > >>> >- Verified signatures
> > >>> >- Verified checksums
> > >>> >- Checked ./bin/ozone version output from binary tarball
> > >>> >- Checked ./bin/ozone checknative output from binary tarball
> > >>> >   - rocks_tools_native lib check is missing, filed HDDS-11347
> > >>> >   ,
> > >>> non-blocking.
> > >>> >   - Checked source tarball content matched repo tag
> > ozone-1.4.1-RC1
> > >>> >- Built from source (without native libs support)
> > >>> >- Verified compose/ozone Docker dev cluster boots up correctly
> > with
> > >>> 3
> > >>> >Ozone datanodes.
> > >>> >- Verified basic volume, bucket, key creation and deletion works
> > in
> > >>> >Docker dev cluster.
> > >>> >   - Volume recursive deletion prompt is incorrect, filed HDDS-11346,
> > >>> >   non-blocking.
> > >>> >
> > >>> >
> > >>> > -Siyao
> > >>> >
> > >>> > On Aug 19, 2024 at 6:39:08 AM, Ayush Saxena 
> > >>> wrote:
> > >>> >
> > >>> > > +1 (Binding), some minor stuff which we should fix in next
> release
> > >>> > >
> > >>> > > * Built from source
> > >>> > > * Verified Checksums
> > >>> > > * Verified Signatures
> > >>> > > * All source files have apache header
> > >>> > > * No code diff b/w the git tag & the contents of src tar
> > >>> > > (dependency-reduced-pom only in src tar, maybe that ain't
> required
> > >>> > > there)
> > >>> > > * Verified the output of ozone version
> > >>> > > * Ran some basic shell commands
> > >>> > > * Checked the NOTICE file: The year is *wrong*, it says 2022, it
> > >>> > > should be 2024 [1], should correct in next release
> > >>> > > * The NOTICE file inside the packaged Jars is *wrong*, It
> mentions
> > >>> > > *Apache Hadoop* & Copyright since 2006, i

Re: Ozone Community Meeting(APAC, 2024 Oct 25th)

2024-10-28 Thread Ethan Rose
Thanks for sharing Sammi. For deletion issues, see HDDS-11506 and its
subtasks.

> - Proposed a key deleting service optimization, instead of iterating from
> first key of the table, save last key of last iterator as the start key of
> next iterator, to skip the tombstones of  already deleted record.
>

Are there any benchmarks for how much improvement this has? In general I
feel it is better to keep the Ozone code simple and use RocksDB through its
interface instead of coupling our code to its internal details. Changes
like this can add permanent complexity to the Ozone code and may become
irrelevant as RocksDB upgrades improve performance. If there is
significant improvement it might be worth exploring though.
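
For reference, the proposed idea can be sketched in a few lines of Python. This models the table as a sorted in-memory list and uses hypothetical names; it does not model RocksDB tombstones or the real key deleting service, and it says nothing about the actual performance gain.

```python
import bisect


def next_batch(sorted_keys, start_after, limit):
    """Return up to `limit` keys strictly greater than `start_after`.

    `start_after=None` means start from the first key of the table.
    The second return value is the cursor to pass to the next call,
    or None when the end is reached and the scan should wrap around.
    """
    if start_after is None:
        begin = 0
    else:
        begin = bisect.bisect_right(sorted_keys, start_after)
    batch = sorted_keys[begin:begin + limit]
    cursor = batch[-1] if batch else None
    return batch, cursor


keys = ["k1", "k2", "k3", "k4", "k5"]

batch1, cursor = next_batch(keys, None, 2)    # first run starts at the top
batch2, cursor = next_batch(keys, cursor, 2)  # later runs resume after cursor
batch3, cursor = next_batch(keys, cursor, 2)
batch4, cursor = next_batch(keys, cursor, 2)  # end of table reached
```

The extra state to maintain is the saved cursor, including what happens to it across restarts, which is the kind of added complexity mentioned above.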

> - Found one issue that the block delete request exceeds the SCM raft log
> max message size when there are big MPU files involved. Should control the
> block delete request size under the SCM raft log max message size.


This looks like it fits under HDDS-11508
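
One way to keep each request under a size limit is to split the work into batches. The Python sketch below is only an illustration under assumed names and sizes; it is not the approach chosen in the Jira, and the real limit would come from the SCM Ratis configuration.

```python
def split_by_size(entry_sizes, max_bytes):
    """Group entry sizes into batches whose total stays within `max_bytes`."""
    batches, current, current_size = [], [], 0
    for size in entry_sizes:
        if size > max_bytes:
            # A single entry that is too large cannot be fixed by batching.
            raise ValueError("single entry exceeds the message size limit")
        if current and current_size + size > max_bytes:
            batches.append(current)
            current, current_size = [], 0
        current.append(size)
        current_size += size
    if current:
        batches.append(current)
    return batches


# Hypothetical serialized sizes of block delete entries, with a limit of 100.
batches = split_by_size([40, 40, 40, 10], max_bytes=100)
```

Each resulting batch would be sent as its own request, so no single Raft log entry exceeds the configured maximum.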


Ethan

On Sun, Oct 27, 2024 at 11:50 PM Sammi Chen  wrote:

> Attendees: Hao, Weiming, Conway, Jianghua, Sammi
>
> Sammi: The new OM HA prototype shows a multi-fold improvement in OM throughput.
> Weiming/Weiming/Jianghua:
> - Follower reader feature is used in a production environment with a bit of
> customization.
> - Tencent Kona JDK 17 with G1 GC is used in place of OpenJDK 17 on OM,
> which solves the OM GC problem triggered by threadLocal usage well.
> GuoHao:
> - Found EC pipeline creation lock contention which causes P99 latency
> higher than expected, when there is intensive block allocation requests.
> Will investigate a) EC pipeline pre-allocation pool, and b) code refactor
> to reduce the lock scope,  two solutions next.
> - Proposed a key deleting service optimization, instead of iterating from
> first key of the table, save last key of last iterator as the start key of
> next iterator, to skip the tombstones of  already deleted record.
> - Found one issue that the block delete request exceeds the SCM raft log
> max message size when there are big MPU files involved. Should control the
> block delete request size under the SCM raft log max message size.
>


Re: [VOTE] Apache Ozone 1.4.1 RC2

2024-11-06 Thread Ethan Rose
Yes we should add a step in the guide to update the security file with the
current release when a new one goes out. Thanks Nanda for catching this. I
had been tracking updates required to the guide as part of this release
process locally but hadn't shared anything yet. I just filed HDDS-11654 so
we can keep track of the updates we should make to the guide when the
release goes out.

On Wed, Nov 6, 2024 at 10:22 AM Xi Chen  wrote:

> Hi Nanda
> Thanks for your check and suggestions
>
> Q: The code name for the release is still set to "Hot Springs" which is of
> 1.4.0, should we use a new name for 1.4.1?
>
> A: Currently, Ozone only changes the national park tag when it releases
> the master branch, The minor branch release maintains the national park tag
> unchanged. e.g. Both the national park tags for ozone-1.2 and ozone-1.2.1
> are “Glacier”.
> So ozone-1.4.1 should keep the national park tag unchanged ("Hot Springs"
> ).
>
> Q: 1.4.1 release information is missing in SECURITY.md. The SECURITY.md file
> should be updated with 1.4.1 details.
>
> A: Currently we don't seem to be updating “SECURITY.md” as part of a new
> release, I can't find any steps for updating “SECURITY.md” in the
> https://ozone-site-v2.staged.apache.org/docs/developer-guide/project/release-guide/
> Last updated SECURITY.md is
> https://issues.apache.org/jira/browse/HDDS-10214
> Do we need to update “SECURITY.md” every time we release a new version? if
> so we should add this step to the Apache Release Manager Guide
>
> Xi Chen
>
>
>
> On 2024/11/05 13:37:37 Nandakumar wrote:
> > Thanx Xi Chen for driving the release.
> >
> > - Verified checksums
> > - Verified signatures
> > - Built from source
> > - Checked docs are included in the release artifact
> > - Branch structure in GitHub looks good
> > - Checked 'ozone version' output
> > * The code name for the release is still set to "Hot Springs"
> > which is of 1.4.0, should we use a new name for 1.4.1?
> > - 1.4.1 release information is missing in SECURITY.md
> > * The SECURITY.md file should be updated with 1.4.1 details.
> >
> > -Nanda
> >
> > On Tue, Nov 5, 2024 at 3:50 PM mrchenx  wrote:
> > >
> > > > Dear Ozone Devs,
> > > >
> > > > As discussed in the last email, I am calling for a vote on Apache
> > > > Ozone 1.4.1 RC2.
> > > >
> > > > We released 1.4.0 on Jan 19th. There are now 223 new commits on the
> > > > 1.4.1 branch, including a Ratis upgrade (to Ratis 3.1.1), some bug
> > > > fixes, performance optimizations, and some necessary dependency updates.
> > > >
> > > >    - The RC2 tag can be found on Github at:
> > > >      https://github.com/apache/ozone/releases/tag/ozone-1.4.1-RC2
> > > >    - 223 Jiras were cherry-picked for ozone-1.4.1:
> > > >      https://issues.apache.org/jira/issues/?jql=project%20%3D%20HDDS%20AND%20fixVersion%20%3D%201.4.1
> > > >    - The source and binary tarballs can be found at:
> > > >      https://dist.apache.org/repos/dist/dev/ozone/1.4.1-rc2/
> > > >    - Maven artifacts are staged at:
> > > >      https://repository.apache.org/content/repositories/orgapacheozone-1025
> > > >    - The public key used to sign the artifacts can be found at:
> > > >      https://dist.apache.org/repos/dist/release/ozone/KEYS
> > > >    - The fingerprint of the key used to sign the artifacts is:
> > > >      0D8C19F5514E2786007936F758C87003FF9A1A38
> > > >
> > > > The vote will run for 7 days, ending on Nov 12th 2024.
> > >
> > > Thanks
> > >
> > > Xi Chen
> >
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@ozone.apache.org
> > For additional commands, e-mail: dev-h...@ozone.apache.org
> >
> >
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@ozone.apache.org
> For additional commands, e-mail: dev-h...@ozone.apache.org
>
>


Re: [DISCUSS] bump minimum required Java

2024-11-18 Thread Ethan Rose
+1 for keeping the Java client compatible with Java 8 and increasing the
server-side minimum Java version in Ozone 2.0. As for the specific version
requirements for language, build, and runtime on the server side, I'm not
sure I have a strong opinion/enough information to weigh in on specifics
right now.

Ethan

On Mon, Nov 18, 2024 at 4:28 PM Tsz Wo Sze  wrote:

> > ... Is there any particular reason we want Java 11?
>
> Just want to be more inclusive.  Requiring a higher Java version
> may exclude more applications.  We could be forcing the dependent projects
> such as HBase to bump their Java version.  Not sure if it is true.
>
> > So for these reasons I think making a client/server distinction in the
> > Java version requirement would be better, instead of Recon vs. everything
> > else.
>
> That's a good point!  We should require a lower Java version for
> the client-side.  It probably should stay at Java 8.
>
> Tsz-Wo
>
>
> On Mon, Nov 18, 2024 at 12:36 PM Attila Doroszlai 
> wrote:
>
> > > Could it be that only Recon requires Java 21?
> >
> > Yes, technically only Recon (its dependencies) as far as I know.  But:
> >
> > - I think it's easier to manage hosts with uniform Java version
> > - we don't know when some other dependency of OM/SCM/etc. starts
> > requiring newer Java
> >
> > So for these reasons I think making a client/server distinction in the
> > Java version requirement would be better, instead of Recon vs.
> > everything else.
> >
> > -Attila
> >
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@ozone.apache.org
> > For additional commands, e-mail: dev-h...@ozone.apache.org
> >
> >
>


Re: [VOTE] Apache Ozone 1.4.1 RC3

2024-11-20 Thread Ethan Rose
+1 (binding)

- Verified checksums
- Verified signatures
- Built from source
- Checked `ozone version` output
- Checked docs are included in the release and can be displayed
- Tested `ozone freon ockrw` in docker
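
For anyone repeating the checksum step, a minimal sketch follows. It assumes
the expected SHA-512 digest has already been copied out of the published
checksum file as a hex string; the helper names are hypothetical, and the
signature check is still done separately with gpg.

```python
import hashlib
import hmac


def sha512_of(path, chunk_size=1 << 20):
    """Return the hex SHA-512 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(path, expected_hex):
    """Compare the computed digest with the expected one, ignoring case and whitespace."""
    normalized = "".join(expected_hex.split()).lower()
    return hmac.compare_digest(sha512_of(path), normalized)
```

Reading in chunks keeps memory use flat even for a large binary tarball.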

Ethan

On Wed, Nov 20, 2024 at 4:06 PM Siyao Meng  wrote:

>  Good catch Ayush. It looks like -Psrc forgot to include ./tools subdir :D
>
> It needs to be included here:
>
> https://github.com/apache/ozone/blob/2236041f3aa4742917a98e99c54d079e0c243e8d/hadoop-ozone/dist/src/main/assemblies/ozone-src.xml#L78-L102
>
> Thanks,
> Siyao
>
> On Nov 20, 2024 at 12:49:28 PM, Ayush Saxena  wrote:
>
> > +1 (Binding)
> >
> > * Built from source
> > * Validated Checksums
> > * Validated Signatures
> > * Validated LICENSE & NOTICE files
> > * Validated code diff b/w git tag & src tar [1]
> > * Ran some basic shell commands.
> >
> > Thanx Xi Chen for driving the release. Good Luck!!!
> >
> > -Ayush
> >
> > [1]
> > It seems the Fault Injection Service
> > (https://github.com/apache/ozone/tree/ozone-1.4.1-RC3/tools/fault-injection-service)
> > is included in the Git tag but not in the source tarball. If I’m not
> > mistaken, this indicates that HDDS-2720 isn't being packaged as part
> > of the source tarball.
> >
> > Does anyone know the reasoning behind this? Notably, it wasn’t
> > included in previous releases either, so I assume this is intentional
> > and not a release blocker.
> >
> > On Thu, 21 Nov 2024 at 01:16, Tsz Wo Sze  wrote:
> >
> > +1
> >
> > - Verified all checksums and signatures.
> > - Checked LICENSE and NOTICE. Found that LICENSE-glyphicons.txt and
> > LICENSE-guava.txt are missing from the ./licenses dir; see the file
> > list at the end.  Should they be copied?  Or, should they be removed
> > from hadoop-ozone/dist?
> > - Compared the files in the src tarball with the files in git.
> > - Built from source successfully.
> >
> > Xi Chen, thanks a lot for working on the release!
> >
> > Tsz-Wo
> >
> > ozone-1.4.1-src/hadoop-ozone/dist/src/main/license/src/licenses/LICENSE-angular-nvd3.txt
> > ozone-1.4.1-src/hadoop-ozone/dist/src/main/license/src/licenses/LICENSE-angular.txt
> > ozone-1.4.1-src/hadoop-ozone/dist/src/main/license/src/licenses/LICENSE-bootstrap.txt
> > ozone-1.4.1-src/hadoop-ozone/dist/src/main/license/src/licenses/LICENSE-d3.txt
> > ozone-1.4.1-src/hadoop-ozone/dist/src/main/license/src/licenses/LICENSE-glyphicons.txt
> > ozone-1.4.1-src/hadoop-ozone/dist/src/main/license/src/licenses/LICENSE-guava.txt
> > ozone-1.4.1-src/hadoop-ozone/dist/src/main/license/src/licenses/LICENSE-jquery.txt
> > ozone-1.4.1-src/hadoop-ozone/dist/src/main/license/src/licenses/LICENSE-nvd3.txt
> >
> > ozone-1.4.1-src/licenses/LICENSE-angular-nvd3.txt
> > ozone-1.4.1-src/licenses/LICENSE-angular.txt
> > ozone-1.4.1-src/licenses/LICENSE-d3.txt
> > ozone-1.4.1-src/licenses/LICENSE-jquery.txt
> > ozone-1.4.1-src/licenses/LICENSE-nvd3.txt
> >
> >
> > On Wed, Nov 20, 2024 at 7:55 AM wanghongbing wrote:
> >
> > > +1 for 1.4.1 RC3
> > >
> > > - Build from source.
> > > - Verified the signature and checksums.
> > > - Start cluster on linux os and run simple sh cmds.
> > > - Checked WebUI display.
> > >
> > > Thanks Xi Chen.
> > >
> > > Hongbing
> > >
> > > > On Nov 20, 2024, at 20:49, Yiyang wrote:
> > > >
> > > > +1
> > > >
> > > > * Verified the signature and checksums.
> > > > * Verified tag.
> > > > * Checked docs on UI pages.
> > > > * Run freon ommg command to create keys
> > > > * Verified ozone sh commands.
> > > >
> > > > On Wed, Nov 20, 2024 at 20:26, Sadanand Shenoy wrote:
> > > >
> > > >> +1
> > > >>
> > > >>   - Verified signatures.
> > > >>   - Verified checksum.
> > > >>   - Built from source.
> > > >>   - Verified 'ozone version' output of binary.
> > > >>   - Ran basic freon commands.
> > > >>   - Verified Recon UI.
> > > >>
> > > >> Thanks Xi Chen for driving the release.
> > > >>
> > > >> - Sadanand
> > > >>
> > > >> On Wed, Nov 20, 2024 at 4:02 PM Siddhant Sangwan <
> > > >> siddhantsangwan...@gmail.com> wrote:
> > > >>
> > > >>> +1
> > > >>>
> > > >>> Built from source on an arm Macbook with rosetta.
> > > >>>
> > > >>> Wrote Ratis and EC keys using freon.
> > > >>>
> > > >>> Tested some admin commands.
> > > >>>
> > > >>> Xi Chen, thanks for driving this!
> > > >>>
> > > >>> Best,
> > > >>> Siddhant
> > > >>>
> > > >>> On Wed, 20 Nov, 2024, 15:33 Zita Dombi wrote:
> > > >>>
> > > >>>> +1 for RC3 (non-binding)
> > > >>>>
> > > >>>> * Built from source
> > > >>>> * Verified signatures and checksums
> 

Re: [VOTE] Apache Ozone 1.4.1 RC2

2024-11-08 Thread Ethan Rose
+1 for moving the supported version list to the website and out of the release
artifact. As Attila said, this is a living document so it does not really
fit into immutable releases. We can find a place for it in the old and new
website, but my first guess would be to put it on the downloads page which
already has a table of releases and their links.

Ethan

On Fri, Nov 8, 2024 at 2:36 PM Attila Doroszlai 
wrote:

> > I'm ok with either having this in the SECURITY.md file or tracking this in
> > the Ozone website. (We can even do both)
>
> - The list of releases supported is going to change in time, but
> release artifacts are immutable.  How do we invalidate the list
> included in an old release?
> - Multiple release lines may be supported at the same time, and
> releases may be interleaved (1.4.1, 2.0.0, 1.4.2, etc.).  It's not
> clear which list invalidates which other lists.
>
> I think we should simply move it to the website, where we can have a
> single list, and change it with or without making a new release.
>
> -Attila
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@ozone.apache.org
> For additional commands, e-mail: dev-h...@ozone.apache.org
>
>


Re: Reschedule Apache Ozone Community Sync to 8 AM PST

2024-11-08 Thread Ethan Rose
Sounds good, thanks Swami. It is good to sync with developers from IST in
the community sync. Can you check if there are calendar invites or other
pieces of documentation referring to the community sync that need to be
updated with this information?

Ethan

On Fri, Nov 8, 2024 at 1:05 PM Abhishek Pal 
wrote:

> This sounds good to me.
> Thanks for the initiative Swami
>
> On Fri, 8 Nov 2024 at 22:54, swaminathan balachandran <
> swamirishi...@gmail.com> wrote:
>
> > Hi,
> >
> > I hope this message finds you well. With the recent change in Daylight
> > Saving Time, our scheduled US community sync at 9 AM PST which is 10:30 pm
> > IST can become quite late for contributors joining from India. To make it
> > easier for them to participate, I would like to propose moving the sync an
> > hour earlier, to 8 AM PST. We can roll back the time once daylight saving
> > starts again on the second Sunday of March.
> >
> > If this works for everyone, we can confirm the change for our upcoming
> > sync.
> > Regards,
> > Swaminathan Balachandran
> >
>
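
The time arithmetic in the proposal above can be checked with a short
sketch. It uses fixed UTC offsets (PST as UTC-8, IST as UTC+5:30) rather
than a time zone database, and the date is just one Monday during US
standard time picked for illustration; the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone

PST = timezone(timedelta(hours=-8), "PST")
IST = timezone(timedelta(hours=5, minutes=30), "IST")


def pst_to_ist(hour, minute=0):
    """Convert a wall-clock time in PST to the matching wall-clock time in IST."""
    # 2024-11-18 is an arbitrary Monday during US standard time.
    meeting = datetime(2024, 11, 18, hour, minute, tzinfo=PST)
    return meeting.astimezone(IST).strftime("%H:%M")


print(pst_to_ist(9))  # current slot: 22:30 IST
print(pst_to_ist(8))  # proposed slot: 21:30 IST
```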


CVE-2024-45106: Apache Ozone: Improper authentication when generating S3 secrets

2024-12-02 Thread Ethan Rose
Severity: moderate

Affected versions:

- Apache Ozone 1.4.0

Description:

Improper authentication of an HTTP endpoint in the S3 Gateway of Apache Ozone 
1.4.0 allows any authenticated Kerberos user to revoke and regenerate the S3 
secrets of any other user. This is only possible if:
  *  ozone.s3g.secret.http.enabled is set to true. The default value of this 
configuration is false.
  *  The user configured in ozone.s3g.kerberos.principal is also configured in 
ozone.s3.administrators or ozone.administrators.
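
As a rough first-pass check of the two conditions above, the following
sketch reads an ozone-site.xml style document and reports whether both
appear to hold. It is an illustration under assumptions, not an official
tool: the principal-to-user mapping is simplified to "first component of
the principal", a "*" entry is treated as matching every user, and all
function names are hypothetical. A cluster may resolve administrators
differently.

```python
import xml.etree.ElementTree as ET


def load_properties(xml_text):
    """Parse Hadoop-style <configuration><property> XML into a dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def short_name(principal):
    """Assumption: the user name is the first component of the principal."""
    return principal.split("@")[0].split("/")[0]


def matches_preconditions(props):
    """Return True if both conditions listed in the advisory appear to hold."""
    enabled = props.get("ozone.s3g.secret.http.enabled", "false").lower() == "true"
    principal = props.get("ozone.s3g.kerberos.principal", "")
    admins = set()
    for key in ("ozone.s3.administrators", "ozone.administrators"):
        admins.update(a.strip() for a in props.get(key, "").split(",") if a.strip())
    # Assumption: "*" in an administrators list matches every user.
    is_admin = bool(principal) and (
        "*" in admins or principal in admins or short_name(principal) in admins
    )
    return enabled and is_admin
```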


Users are recommended to upgrade to Apache Ozone version 1.4.1 which disables 
the affected endpoint.

This issue is being tracked as HDDS-9203 

Credit:

Ethan Rose (reporter)
Ivan Zlenko (remediation developer)

References:

https://ozone.apache.org
https://www.cve.org/CVERecord?id=CVE-2024-45106
https://issues.apache.org/jira/browse/HDDS-9203


-
To unsubscribe, e-mail: dev-unsubscr...@ozone.apache.org
For additional commands, e-mail: dev-h...@ozone.apache.org



Re: Reschedule Apache Ozone Community Sync to 8 AM PST

2024-11-14 Thread Ethan Rose
Yes I think that is the place to update it. I think you can go ahead and
make the update and we can start an hour earlier this coming Monday for the
attendees in IST.

On Wed, Nov 13, 2024 at 11:55 AM swaminathan balachandran <
swamirishi...@gmail.com> wrote:

> I believe we would have to update the timings in the following confluence
> page.
> https://cwiki.apache.org/confluence/display/OZONE/Ozone+Community+Calls
> I am not sure if the timing is present somewhere else as well.
>
>
> On Tue, Nov 12, 2024 at 2:31 PM Uma Maheswara Rao Gangumalla <
> umaganguma...@gmail.com> wrote:
>
> > +1
> >
> > Thanks, Swami for bringing this up.
> >
> > Regards,
> > Uma
> >
> > On Fri, Nov 8, 2024 at 3:36 PM Ethan Rose  wrote:
> >
> > > Sounds good, thanks Swami. It is good to sync with developers from IST in
> > > the community sync. Can you check if there are calendar invites or other
> > > pieces of documentation referring to the community sync that need to be
> > > updated with this information?
> > >
> > > Ethan
> > >
> > > On Fri, Nov 8, 2024 at 1:05 PM Abhishek Pal <
> > > pal.abhishek03012...@gmail.com>
> > > wrote:
> > >
> > > > This sounds good to me.
> > > > Thanks for the initiative Swami
> > > >
> > > > On Fri, 8 Nov 2024 at 22:54, swaminathan balachandran <
> > > > swamirishi...@gmail.com> wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > I hope this message finds you well. With the recent change in Daylight
> > > > > Saving Time, our scheduled US community sync at 9 AM PST which is 10:30 pm
> > > > > IST can become quite late for contributors joining from India. To make it
> > > > > easier for them to participate, I would like to propose moving the sync an
> > > > > hour earlier, to 8 AM PST. We can roll back the time once daylight saving
> > > > > starts again on the second Sunday of March.
> > > > >
> > > > > If this works for everyone, we can confirm the change for our upcoming
> > > > > sync.
> > > > >
> > > > > Regards,
> > > > > Swaminathan Balachandran
> > > > >
> > > >
> > >
> >
>


Re: Ozone 2.0 release update

2025-01-10 Thread Ethan Rose
Thanks for driving this Wei-Chiu. The GitHub discussion has references to a
lot of items that were initially proposed to be included in 2.0. Can you add a
comment there with which ones we are still planning to land in this release
and which ones are being moved out?

On Tue, Jan 7, 2025 at 2:53 AM Wei-Chiu Chuang  wrote:

> Getting close. Dashboard:
> https://issues.apache.org/jira/secure/Dashboard.jspa?selectPageId=12337720
>
> We are down to 2 blockers. IMO HDDS-11382 (Remove the use of
> caniuse-lite) looks like a non-issue for me at this point.
>
>    1. HDDS-11754 (Drop support for non-Ratis OM and SCM) has 6 open
>    subtasks, among which 3 have PRs pending review.
>
> I moved most of the new features to 2.1.0. Only 6 are still
> targeting Ozone 2.0.0. Once the blocker issues are resolved, I'll move out
> all jiras and fork the master branch to prepare for the release.
>


Re: [VOTE] Apache Ozone 2.0.0 RC0

2025-03-21 Thread Ethan Rose
Thanks for working on this Wei-Chiu. Can you share the planned release
notes for this release to help in the voting process? I'm not entirely sure
which features did or did not make the cut, or what our
compatibility guarantees are with the 1.x line.

Ethan

On Fri, Mar 21, 2025 at 7:02 PM Wei-Chiu Chuang  wrote:

> Correction:
> I appended my PGP key here:
> https://dist.apache.org/repos/dist/release/ozone/KEYS
> I'm not sure which KEYS file is the correct one -- the one under /release/
> or the one under /dev/?
>
> On Fri, Mar 21, 2025 at 3:42 PM Wei-Chiu Chuang 
> wrote:
>
> > Hi Ozone community,
> >
> > Please try out and cast your vote for the Ozone 2.0.0 release candidate 0.
> >
> > This is a huge release, containing 1691 resolved jiras, numerous features
> > and stability improvements.
> >
> > Release process:
> > https://ozone-site-v2.staged.apache.org/docs/developer-guide/project/release-guide/
> > Git tag: https://github.com/apache/ozone/releases/tag/ozone-2.0.0-RC0
> > All resolved jiras:
> > https://issues.apache.org/jira/issues/?jql=project%20%3D%20HDDS%20AND%20status%20in%20(Resolved%2C%20Closed)%20AND%20fixVersion%20%3D%202.0.0
> > Source code and binary tarball:
> > https://dist.apache.org/repos/dist/release/ozone/2.0.0-rc0/
> > The Maven artifacts are staged at:
> > https://repository.apache.org/content/repositories/orgapacheozone-1029/
> > PGP key: https://dist.apache.org/repos/dist/dev/ozone/KEYS
> > Fingerprint: 3ED23305D7631918
> >
> > Per Apache policy, this release candidate vote will be open for 7 days until
> > the end of March 28th 2025. PMC members can cast binding votes while
> > committers and community contributors are welcome to cast non-binding
> > votes.
> >
> > Best regards,
> > Weichiu
> >
>


Re: [ASF] A note to projects with GitHub Discussions in use

2025-03-18 Thread Ethan Rose
I don't think GitHub wiki will help us. I would really like all Ozone
content to be published on the new website, which would make our confluence
page obsolete and we can then deprecate/remove it. Having a separate wiki
anywhere creates split brain and is harder to manage. There's no clear
answer for what content would go on one platform vs. the other.

Ethan

On Mon, Mar 17, 2025 at 1:48 PM Wei-Chiu Chuang  wrote:

> FYI
>
> Also do we want to enable GitHub wiki?
> We used to have
> https://cwiki.apache.org/confluence/display/OZONE/Ozone+Wiki+Home
> but for some reason its SEO is so bad, I couldn't google it. It's probably
> time to migrate stuff from the wiki to Ozone's user doc site or GitHub
> wiki.
>
> -- Forwarded message -
> From: Daniel Gruno 
> Date: Mon, Mar 17, 2025 at 10:37 AM
> Subject: [ASF] A note to projects with GitHub Discussions in use
> To: 
>
>
> Projects that are using GitHub Discussions and have suddenly found that
> feature to be missing, please see
>
> https://github.com/apache/infrastructure-asfyaml?tab=readme-ov-file#repo_features
>
> GitHub Discussions are now managed entirely through .asf.yaml, and will
> default to being disabled unless enabled in your .asf.yaml file, OR if
> you do not have a valid 'discussions' entry in your notifications
> section in the configuration.
>
>
> The TL;DR for enabling Discussions in .asf.yaml is to weave these
> settings into your .asf.yaml file in your default branch:
>
>
> notifications:
>discussions: iss...@foo.apache.org  (use your own mailing list here)
> ...
> github:
>features:
>  discussions: true
>


Re: [VOTE] Apache Ozone 2.0.0 RC1

2025-04-05 Thread Ethan Rose
Have we run this change through the upgrade/downgrade acceptance tests yet?
It would be good to know:
1. If it works with downgrade (only affects network related protos)
2. If we are supporting downgrade from the 2.x to 1.x line.
There is a lot of room for improvement on the current release
notes/changelog but one thing I cannot find that should be called out right
at the top is what this major version increase means for client/server and
upgrade/downgrade compatibility.

On Sun, Mar 30, 2025 at 5:04 PM Tsz-Wo Nicholas Sze 
wrote:

> Hi Wei-Chiu,
>
> (1) A reason for not yet being able to merge the PR is the protolock file.
> We are not sure how and when to update it for such a change.  The "Build
> and commit the proto.lock change" section in our release guideline [2] does
> not mention it.
>
> (2) For future incompatible changes, we must wait for the next major
> version.  Anyway, this PR definitely is a good improvement.
>
> (3) If we are not doing it now, I guess it may be hard to get it in a
> maintenance release such as 2.0.1.  This kind of change is usually
> unwelcome in a maintenance release.
>
> XiChen, not sure if you could work on HDDS-11351 in a timely manner.   If
> not, I could continue the work.  Since we already know what to do, we
> should be able to merge the change within a week.
>
> Tsz-Wo
> [2]
>
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=173085374#OzoneReleaseGuideline-Buildandcommittheproto.lockchange
>
>
>
> On Sun, Mar 30, 2025 at 1:12 PM Wei-Chiu Chuang 
> wrote:
>
> > Thanks for bringing it up!
> >
> > I'm not against including it but
> >
> > (1) the PR has stalled for a few months now. Do we think it can be done
> > soon?
> > (2) there's no guarantee the PR will be the final version. What if later on
> > we realize we need to change protobuf again?
> > (3) if it's a compatible change, it doesn't need to be in 2.0.0.
> >
> > On Sat, Mar 29, 2025 at 12:23 PM Tsz-Wo Nicholas Sze  >
> > wrote:
> >
> > > More info on HDDS-11351:
> > >
> > > TLDR: The change is wire compatible but requires updating protolock.
> > >
> > > XiChen pointed out that both hdds.proto and
> > > ScmServerDatanodeHeartbeatProtocol.proto have the same proto package
> > > "hadoop.hdds".  So, we could safely move StorageTypeProto from
> > > ScmServerDatanodeHeartbeatProtocol.proto to hdds.proto. The only difference
> > > is the java_outer_classname. Fortunately,
> > > ScmServerDatanodeHeartbeatProtocol.proto is a non-user facing internal
> > > protocol and the change is wire compatible.  (It is API incompatible but it
> > > is fine since the protocol is not a public API.)  So there are no
> > > compatibility issues.
> > >
> > > A problem is that we need to update the protolock file.  If we are going to
> > > do it, let's also rename ScmServerDatanodeHeartbeatProtocol.proto to
> > > StorageContainerDatanodeProtocol.proto, i.e. make it consistent with its
> > > java_outer_classname.
> > >
> > > Tsz-Wo
> > > [1] https://github.com/apache/ozone/pull/7109#discussion_r2008750162
> > >
> > >
> > > On Sat, Mar 29, 2025 at 10:04 AM Tsz-Wo Nicholas Sze <
> szets...@gmail.com
> > >
> > > wrote:
> > >
> > > > Hi Ozone dev,
> > > >
> > > > HDDS-11351 is unifying the protobuf definition of StorageType.  How about
> > > > we get it in 2.0.0?
> > > >
> > > > Sorry that I came with this idea late.
> > > >
> > > > Tsz-Wo
> > > >
> > > >
> > > >
> > > > On Wed, Mar 26, 2025 at 11:05 PM Wei-Chiu Chuang  >
> > > > wrote:
> > > >
> > > >> By the way, build environment:
> > > >>
> > > >> x86, Amazon Linux, OpenJDK8, Maven 3.9.9, gcc 11
> > > >>
> > > >> build parameters:
> > > >> mvn clean install -Dmaven.javadoc.skip=true -DskipTests -Psign,dist,src
> > > >> -Dtar -Dgpg.keyname="$CODESIGNINGKEY" -Drocks_tools_native
> > > >>
> > > >> # cat /etc/amazon-linux-release
> > > >> Amazon Linux release 2023.6.20250317 (Amazon Linux)
> > > >>
> > > >> # java -version
> > > >> openjdk version "1.8.0_442"
> > > >> OpenJDK Runtime Environment Corretto-8.442.06.1 (build 1.8.0_442-b06)
> > > >> OpenJDK 64-Bit Server VM Corretto-8.442.06.1 (build 25.442-b06, mixed
> > > >> mode)
> > > >>
> > > >> # mvn -v
> > > >> Apache Maven 3.9.9 (8e8579a9e76f7d015ee5ec7bfcdc97d260186937)
> > > >> Maven home: /root/apache-maven-3.9.9
> > > >> Java version: 1.8.0_442, vendor: Amazon.com Inc., runtime:
> > > >> /usr/lib/jvm/java-1.8.0-amazon-corretto.x86_64/jre
> > > >> Default locale: en, platform encoding: UTF-8
> > > >> OS name: "linux", version: "6.1.130-139.222.amzn2023.x86_64", arch:
> > > >> "amd64", family: "unix"
> > > >>
> > > >> # g++ -v
> > > >> Using built-in specs.
> > > >> COLLECT_GCC=g++
> > > >> COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-amazon-linux/11/lto-wrapper
> > > >> OFFLOAD_TARGET_NAMES=nvptx-none
> > > >> OFFLOAD_TARGET_DEFAULT=1
> > > >> Target: x86_64-amazon-linux
> > > >> Configured with: ../configure --enable-bootstrap --enable-host-pie
> > > >> --enable-host-bind-now --enabl

Re: [VOTE] Merge branch HDDS-10239-container-reconciliation into master

2025-06-18 Thread Ethan Rose
Javadoc for all the maps using Long.  Describe what it is for (e.g.
> > block id).
> > - When converting a checksum (long) to a string, use hexadecimal.
> > - Avoid Optional, which is slow and generates garbage.
> > - Fix conflicts with the master.
> > - Use proto 3 for new protos (but it may be hard to do.)
> >
> > Tsz-Wo
> >
> >
> > On Tue, Jun 10, 2025 at 5:57 PM Ethan Rose  wrote:
> >
> >> Based on discussion in the community sync this week I want to add some
> >> more information.
> >>
> >> Code
> >>
> >> For those interested in checking out the code, these are some of the
> >> major classes to start with:
> >>
> >>    - ReconcileContainerTask: This is the command on the datanode that is
> >>    received from SCM to reconcile a container with a datanode’s peers. It
> >>    passes through the ReplicationSupervisor just like replication and
> >>    reconstruction commands.
> >>    - ContainerProtos.ContainerChecksumInfo: This is the proto format of the
> >>    new file that is written into the containers with the merkle tree and
> >>    list of deleted blocks.
> >>    - ContainerMerkleTreeWriter: This class is used to build merkle trees
> >>    chunk by chunk and generate a protobuf representation of the tree.
> >>    - ContainerChecksumTreeManager: This class coordinates reads and writes
> >>    of ContainerChecksumInfo for containers. The diff method determines
> >>    which repairs should be done on a container based on a peer’s merkle
> >>    tree.
> >>    - KeyValueContainerCheck#scanData: This is the existing method called by
> >>    the background and on-demand container data scanners to scan a
> >>    container. It has been updated to build the merkle tree as it runs.
> >>    - KeyValueHandler#reconcileContainer: This method updates the container
> >>    based on the peer’s replica.
> >>    - Major tests for reconciliation have been added to
> >>    TestContainerCommandReconciliation (integration test) and
> >>    TestContainerReconciliationWithMockDatanodes (unit test with mocked
> >>    clients).
> >>       - There are more tasks under the reconciliation jira to expand the
> >>       types of faults being tested.
> >>
> >> Logging
> >>
> >> Logging was added on the datanodes to track reconciliation as it is
> >> happening. The datanode application log will print a summary of messages
> >> like this:
> >>
> >> 2025-06-10 20:13:14,570 [main] INFO  keyvalue.KeyValueHandler
> >> (KeyValueHandler.java:reconcileContainer(1595)) - Beginning
> >> reconciliation for container 100 with peer
> >> bbc09073-ac0d-4b2f-afe4-1de5f9dc6f43(dn3/237.6.76.4). Current data
> >> checksum is dcce847d
> >> 2025-06-10 20:13:14,589 [main] WARN  keyvalue.KeyValueHandler
> >> (KeyValueHandler.java:reconcileContainer(1681)) - Container 100
> >> reconciled with peer
> >> bbc09073-ac0d-4b2f-afe4-1de5f9dc6f43(dn3/237.6.76.4). Data checksum
> >> updated from dcce847d to 16189e0b.
> >> Missing blocks repaired: 5/5
> >> Missing chunks repaired: 0/0
> >> Corrupt chunks repaired:  10/10
> >> Time taken: 19 ms
> >> 2025-06-10 20:13:14,589 [main] WARN  keyvalue.KeyValueHandler
> >> (KeyValueHandler.java:reconcileContainer(1704)) - Completed
> >> reconciliation for container 100 with 1/1 peers. 15 blocks were
> >> updated. Data checksum updated from dcce847d to 16189e0b
> >>
> >> This shows:
> >>
> >>    - Reconciliation started between this datanode and one other peer for
> >>    container 100
> >>    - After reconciliation with the peer completed, the data checksum of our
> >>    container was updated
> >>    - Compared to this peer, we needed to ingest 5 missing blocks and repair
> >>    10 corrupt chunks. All operations were successful
> >>    - At the end we get a summary of how many changes were done to this
> >>    container after consulting all the peers in the reconcile request. In
> >>    this case there was only one peer.
> >>    By enabling debug logging we can see the individual blocks and chunks
> >>    that were repaired as well.
> >>
> >> In the dn-container.log file, dataChecksum is now included for every log
> >> line. We also get one new line in this log every time the checksum for a
> >

Re: [VOTE] Merge branch HDDS-10239-container-reconciliation into master

2025-06-10 Thread Ethan Rose
Based on discussion in the community sync this week I want to add some more
information.

Code

For those interested in checking out the code, these are some of the major
classes to start with:

   - ReconcileContainerTask: This is the command on the datanode that is
   received from SCM to reconcile a container with a datanode’s peers. It
   passes through the ReplicationSupervisor just like replication and
   reconstruction commands.
   - ContainerProtos.ContainerChecksumInfo: This is the proto format of the
   new file that is written into the containers with the merkle tree and list
   of deleted blocks.
   - ContainerMerkleTreeWriter: This class is used to build merkle trees
   chunk by chunk and generate a protobuf representation of the tree.
   - ContainerChecksumTreeManager: This class coordinates reads and writes
   of ContainerChecksumInfo for containers. The diff method determines
   which repairs should be done on a container based on a peer’s merkle tree.
   - KeyValueContainerCheck#scanData: This is the existing method called by
   the background and on-demand container data scanners to scan a container.
   It has been updated to build the merkle tree as it runs.
   - KeyValueHandler#reconcileContainer: This method updates the container
   based on the peer’s replica.
   - Major tests for reconciliation have been added to
   TestContainerCommandReconciliation (integration test) and
   TestContainerReconciliationWithMockDatanodes (unit test with mocked
   clients).
      - There are more tasks under the reconciliation jira to expand the
      types of faults being tested.
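
As a rough illustration of the diff step described above, the following
sketch compares two containers' checksum trees. It is not the Ozone
implementation: the nested-dict layout (block ID to chunk offset to chunk
checksum), the CRC32-based roll-up, and all function names are assumptions
made for the example. Deleted blocks are left out, and the peer's value is
simply assumed to be the good one.

```python
import zlib


def block_checksum(chunks):
    """Roll chunk entries (offset -> checksum) up into one block checksum."""
    crc = 0
    for offset in sorted(chunks):
        entry = offset.to_bytes(8, "big") + chunks[offset].to_bytes(8, "big")
        crc = zlib.crc32(entry, crc)
    return crc


def container_checksum(tree):
    """Roll block entries (block id -> chunks) up into one data checksum."""
    crc = 0
    for block_id in sorted(tree):
        entry = block_id.to_bytes(8, "big") + block_checksum(tree[block_id]).to_bytes(8, "big")
        crc = zlib.crc32(entry, crc)
    return crc


def diff(ours, peer):
    """Return what `ours` would need to repair to match `peer`."""
    report = {"missing_blocks": [], "missing_chunks": [], "corrupt_chunks": []}
    if container_checksum(ours) == container_checksum(peer):
        return report  # roots match, so there is no need to descend
    for block_id, peer_chunks in sorted(peer.items()):
        our_chunks = ours.get(block_id)
        if our_chunks is None:
            report["missing_blocks"].append(block_id)
            continue
        if block_checksum(our_chunks) == block_checksum(peer_chunks):
            continue  # this block's subtree matches, skip its chunks
        for offset, checksum in sorted(peer_chunks.items()):
            if offset not in our_chunks:
                report["missing_chunks"].append((block_id, offset))
            elif our_chunks[offset] != checksum:
                report["corrupt_chunks"].append((block_id, offset))
    return report
```

Comparing the roll-ups first means matching containers and matching blocks
are skipped without looking at individual chunks.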

Logging

Logging was added on the datanodes to track reconciliation as it is
happening. The datanode application log will print a summary of messages
like this:

2025-06-10 20:13:14,570 [main] INFO  keyvalue.KeyValueHandler
(KeyValueHandler.java:reconcileContainer(1595)) - Beginning
reconciliation for container 100 with peer
bbc09073-ac0d-4b2f-afe4-1de5f9dc6f43(dn3/237.6.76.4). Current data
checksum is dcce847d
2025-06-10 20:13:14,589 [main] WARN  keyvalue.KeyValueHandler
(KeyValueHandler.java:reconcileContainer(1681)) - Container 100
reconciled with peer
bbc09073-ac0d-4b2f-afe4-1de5f9dc6f43(dn3/237.6.76.4). Data checksum
updated from dcce847d to 16189e0b.
Missing blocks repaired: 5/5
Missing chunks repaired: 0/0
Corrupt chunks repaired:  10/10
Time taken: 19 ms
2025-06-10 20:13:14,589 [main] WARN  keyvalue.KeyValueHandler
(KeyValueHandler.java:reconcileContainer(1704)) - Completed
reconciliation for container 100 with 1/1 peers. 15 blocks were
updated. Data checksum updated from dcce847d to 16189e0b

This shows:

   - Reconciliation started between this datanode and one other peer for
   container 100
   - After reconciliation with the peer completed, the data checksum of our
   container was updated
   - Compared to this peer, we needed to ingest 5 missing blocks and repair
   10 corrupt chunks. All operations were successful
   - At the end we get a summary of how many changes were done to this
   container after consulting all the peers in the reconcile request. In this
   case there was only one peer.
   By enabling debug logging we can see the individual blocks and chunks
   that were repaired as well.
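
For anyone scripting checks against these summaries, a small parsing sketch
follows. The regular expression is written against the sample output above
only, reading "N/M" as repaired out of attempted; the real log format may
change, and the function name is hypothetical.

```python
import re

COUNTER = re.compile(
    r"^(Missing blocks|Missing chunks|Corrupt chunks) repaired:\s+(\d+)/(\d+)$"
)


def parse_repairs(log_text):
    """Map each repair counter in the summary to a (repaired, total) tuple."""
    result = {}
    for line in log_text.splitlines():
        match = COUNTER.match(line.strip())
        if match:
            result[match.group(1)] = (int(match.group(2)), int(match.group(3)))
    return result
```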

In the dn-container.log file, dataChecksum is now included for every log
line. We also get one new line in this log every time the checksum for a
container is updated.

In case logs roll off, a debug tool to inspect a container’s checksum
information locally on a datanode will be implemented in HDDS-13239.

Metrics

The metrics for reconciliation tasks are available as part of the
ReplicationSupervisor class, which includes:

   - numRequestedContainerReconciliations - Number of reconciliation tasks
   - numQueuedContainerReconciliations - Number of queued tasks
   - numTimeoutContainerReconciliations - Number of timed-out tasks
   - numSuccessContainerReconciliations - Number of successful tasks
   - numFailureContainerReconciliations - Number of failed tasks
   - numSkippedContainerReconciliations - Number of skipped tasks

Latency and count metrics for the tasks are exposed by CommandHandlerMetrics
for ReconcileContainerCommandHandler:

   - TotalRunTimeMs - The total runtime of the command handler in
   milliseconds
   - AvgRunTimeMs - Average run time of the command handler in milliseconds
   - QueueWaitingTaskCount - The number of queued tasks waiting for
   execution
   - InvocationCount - The number of times the command handler has been
   invoked
   - CommandReceivedCount - The number of received SCM commands for each
   command type

Other container reconciliation-related tasks are encapsulated in
ContainerMerkleTreeMetrics:

   - numMerkleTreeWriteFailure - Number of Merkle tree write failures
   - numMerkleTreeReadFailure - Number of Merkle tree read failures
   - numMerkleTreeDiffFailure - Number of Merkle tree diff failures
   - numNoRepairContainerDiff - Number of container diff that do

Re: [DISCUSS] Replace goofys

2025-07-07 Thread Ethan Rose
+1 for finding a goofys replacement. All of s3fs-fuse, mountpoint-s3, and
JuiceFS are probably worth testing since they seem to have different use
cases. mountpoint-s3 and s3fs-fuse are most similar to goofys in that the
namespace exposed matches what is stored in Ozone, but some POSIX operations
are not supported. My understanding is that s3fs-fuse is more POSIX-compliant
than mountpoint-s3 at the cost of performance. JuiceFS is fully
POSIX-compliant, but the tradeoff is that it stores metadata in a separate DB,
so data written through JuiceFS is not usable with just core Ozone.

Ethan

On Mon, Jul 7, 2025 at 5:23 AM Sammi Chen  wrote:

> +1, goofys is not stable after hours of running, especially in the
> continuous file write case.
> We'd better benchmark these new candidates.
>
> BTW, I have heard of one JuiceFS + Ozone case, which seems to be running well.
>
> Sammi
>
> On Sun, 6 Jul 2025 at 15:29, Tsz-Wo Nicholas Sze 
> wrote:
>
> > +1  If goofys is unmaintained, we should find a replacement.
> >
> > Tsz-Wo
> >
> > On Tue, Jul 1, 2025 at 11:46 AM Wei-Chiu Chuang 
> > wrote:
> >
> > > HDDS-13356
> > >
> > > Goofys is an S3 FUSE client. We rely on goofys to mount S3 locally and
> > > our Ozone CSI driver depends on goofys. Ozone-docker-runner includes
> > > goofys:
> > >
> > > However, goofys looks unmaintained now. No new releases in 4 years.
> > >
> > > In the meantime, there are better S3 FUSE clients, for example
> > > mountpoint-s3: https://github.com/awslabs/mountpoint-s3
> > > JuiceFS: https://juicefs.com/en/
> > > s3fs-fuse: https://github.com/s3fs-fuse/s3fs-fuse
> > >
> > > We should consider replacing goofys with one of these options. Thoughts?
> > >
> >
>


Re: [VOTE] Merge branch HDDS-10239-container-reconciliation into master

2025-06-26 Thread Ethan Rose
Thanks for the votes everyone. Based on the results (9 +1s, no -1s, no 0s)
the reconciliation feature branch has been merged into master. To address
minor items that were suggested to fix after the merge:

   - HDDS-13304 <https://issues.apache.org/jira/browse/HDDS-13304> has been
   filed to check how the size of the tree file changes with the layout of
   data within the container (many small blocks vs fewer large blocks)
   - HDDS-13305 <https://issues.apache.org/jira/browse/HDDS-13305> has been
   filed to create an object to wrap container checksums
      - Among other things, this will give us a toString method to make sure
      all checksums are rendered as hex in logs and output so we don’t need
      to manually invoke checksumToString.
   - HDDS-13340 <https://issues.apache.org/jira/browse/HDDS-13340> /
   pull/8705 <https://github.com/apache/ozone/pull/8705> is addressing the
   following concerns:
      - Add javadoc to methods in ContainerDiffReport
      - Implement metrics for types of repairs done. This was left in an
      old TODO comment but never implemented in the corresponding Jira.
      - In the test-only constructor of BlockDeletingService, have each
      test pass their own ChecksumTreeManager so there is no confusion
      about which instance the class is using.
   - For using ContainerID in datanodes, we have HDDS-12769
   <https://issues.apache.org/jira/browse/HDDS-12769> tracking this.
      - Based on the current progress I have not identified any issues
      where the reconciliation branch merge has undone any work already
      under this epic. We can file more Jiras under this task if this is
      not the case though.
   - HDDS-13341 <https://issues.apache.org/jira/browse/HDDS-13341> has been
   filed to rename ScanResult#isHealthy to ScanResult#hasErrors.
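
To make the checksum wrapper idea from HDDS-13305 more concrete, here is a
rough Python sketch of the concept only. It is not the class that Jira will
add: the name ContainerChecksum and the from_hex helper are hypothetical, and
masking to 64 bits is an assumption made so that a negative signed value
prints as unsigned hex.

```python
from dataclasses import dataclass

MASK_64 = (1 << 64) - 1  # treat the checksum as an unsigned 64-bit value


@dataclass(frozen=True)
class ContainerChecksum:
    """Hypothetical wrapper so a checksum always prints as lowercase hex."""
    value: int

    def __str__(self) -> str:
        # Mask first so a negative (signed) value still prints as plain hex.
        return format(self.value & MASK_64, "x")

    @classmethod
    def from_hex(cls, text: str) -> "ContainerChecksum":
        return cls(int(text, 16))


old = ContainerChecksum.from_hex("dcce847d")
new = ContainerChecksum.from_hex("16189e0b")
# No explicit conversion call is needed at the point of logging.
print(f"Data checksum updated from {old} to {new}")
```

With a wrapper like this, callers cannot accidentally log the raw decimal
value, which is the motivation given in the item above.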

Thanks,
Ethan

On Wed, Jun 18, 2025 at 1:37 PM Tsz-Wo Nicholas Sze 
wrote:

> Hi Ethan,
>
> Thanks for addressing my comments!
>
> > ... Datanode code consistently identified container IDs as long values.
> > We can shift to using ContainerID in the datanode as well, but that would
> > be a change outside of reconciliation.
>
> We have an umbrella JIRA HDDS-12769
> <https://issues.apache.org/jira/browse/HDDS-12769> "Use ContainerID
> instead of Long in datanode". For the new code/new data structures, please
> use ContainerID unless it is hard to do.
>
>
> Look forward to merging the branch.
>
> Tsz-Wo
>
> On Wed, Jun 18, 2025 at 7:47 AM Ethan Rose  wrote:
>
> > Thanks for reviewing the code Nicholas.
> >
> > For the “healthy” term, I agree it is not specific enough. In this PR we
> > are updating it to checksumMatches in the merkle tree proto. We can update
> > the term in the ScanResult in a follow-up since it is not touching the
> > proto. Cases where the block is missing or unreadable will not be able to
> > generate a checksum, and the chunk will be considered missing from the
> > merkle tree and need repair anyway, so other states like IO_EXCEPTION
> > shouldn’t be necessary.
> >
> > The BlockDeletingService is supposed to share the same
> > ContainerChecksumTreeManager as the rest of the code and it does as far as
> > I can tell. If you are referring to this line
> > <https://github.com/apache/ozone/blob/2b4708b70e5c9de133381a38ca7fa4b3cf3caa42/hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/impl/BlockDeletingService.java#L76>,
> > that is a test-only constructor.
> >
> > ContainerID and its corresponding proto are only used in SCM code right
> > now. Datanode code consistently identified container IDs as long values. We
> > can shift to using ContainerID in the datanode as well, but that would be a
> > change outside of reconciliation.
> >
> > For maps with Long keys, I’ve identified a few instances where we can
> > document it better:
> >
> >    - Return values from ContainerDiffReport
> >    - this field
> >    <https://github.com/apache/ozone/blob/a355664093c634c3d04d0601b5c0302260a44c6c/hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/checksum/ContainerMerkleTreeWriter.java#L46>
> >    in ContainerMerkleTreeWriter
> >    - Did you have other instances in mind?
> >
> > Checksum longs should be converted to hex strings in all log and json
> > output using HddsUtils#checksumToString
> > <https://github.com/apache/ozone/blob/2b4708b70e5c9de133381a38ca7fa4b3cf3caa42/hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/HddsUtils.java#L865>.
> > If there are spots we missed, let us know

Re: Do not create empty jira issues

2025-07-31 Thread Ethan Rose
+1 for mandatory description field.

I can see the pros and cons of having a template. On one hand it helps
guide users with what content is important to add to the description. On
the other hand, it can be left unmodified. I guess even a mandatory jira
description field with no template has a similar problem though,
because sometimes people just re-type the title of the Jira into the
description.

Ethan

On Thu, Jul 31, 2025 at 3:29 AM Wei-Chiu Chuang  wrote:

> I mean, especially for improvements and new features, what is changed for
> users.
>
> For example, adding a new API, adding a new metric, and new configuration
> properties.
>
> not sure what is the right place to put it, and I understand it's a burden
> for developers, but it is so important. I'd like to ask all of us to start
> thinking like a user: can someone understand this change without looking at
> the source code?
>
> I'd love to see a new convention born out of this discussion. It would be
> too much to ask everyone to follow it voluntarily, so some enforcement in
> the jira itself would be needed.
>
> It's wee hours here so sorry if I'm not making sense.
>
> On Wed, Jul 30, 2025 at 10:57 PM Attila Doroszlai 
> wrote:
>
> > > > We can file an INFRA ticket to require a description field when filing
> > > > a jira issue. If no one objects I can open an INFRA jira.
> >
> > +1 for mandatory description.
> >
> > > > I'd actually want to go one step further and to have a pre-filled
> > > > template
> > -1 for pre-filled template.  It defeats the purpose of making the
> > field mandatory.  Some folks will create issues without editing the
> > template.  We see this for PRs, too.
> >
> > -Attila