Hi Sammi and others,

Thanks for working on the Disk Balancer feature!  It is going to be very
useful.

Some comments:

- Why does SCM need a DiskBalancerManager?  Should the DiskBalancerService
just be a local service in Datanode?

- In DiskBalancerTask, it holds the container read lock.  Would it be
possible that a file gets deleted when another thread is reading it and
results in an exception?

- In DefaultContainerChoosingPolicy, it has a CACHE for caching container
iterators.  Would it cause ConcurrentModificationException if the
underlying containerMap is modified?

- In DefaultVolumeChoosingPolicy.chooseVolume(..), getCurrentUsage() is
called multiple times so it can return different values.

- Is DiskBalancerService.DISK_BALANCER_TMP_DIR the same as
StorageVolume.TMP_DIR_NAME?  If yes, use StorageVolume.TMP_DIR_NAME instead
of adding DiskBalancerService.DISK_BALANCER_TMP_DIR.

- Do not throw RuntimeException -- how to handle it when it happens?

- Since this is a new feature, how about not supporting SCHEMA_V1/V2 ?  If
we want to support them, we should add some tests.

- DatanodeDiskBalancerInfoProto should use DatanodeIDProto instead of
DatanodeDetailsProto

- Remove DatanodeDiskBalancerInfoType since it is never used.

- Remove the new Container.importContainerData(Path containerPath) method
since it is never used.

- Rename Container.copyContainerData to copyContainerDirectory.  Otherwise,
it sounds like copy/import the ContainerData class.

- In DiskBalancerService, use ContainerID instead of Long for container id.

- Do not use Optional for parameters; see
https://www.reddit.com/r/java/comments/sat1j4/opinions_on_using_optional_as_parameter/

- Use Objects.requireNonNull(..) instead of Preconditions.checkNotNull(..).

Regards,
Tsz-Wo


On Fri, Aug 8, 2025 at 9:57 AM Ethan Rose <er...@apache.org> wrote:

> Hi Sammi, thanks for working on this feature.
>
> There should not be any incompatible changes introduced with this feature.
>
> This section needs more information because it only talks about a feature
> flag, which is orthogonal to any upgrade/downgrade concerns because feature
> flags do not prevent downgrades like finalization does.
>
>    - What happens if the feature flag is enabled, and later disabled?
>    - If an in-progress container move is interrupted by a downgrade, what
>    does the old version do?
>    - Are components persisting any state about the ongoing operations, and
>    if so, what happens to that during a downgrade then upgrade flow?
>
> The disk balancer has a feature flag that currently defaults to false. In
> this
> comment <https://github.com/apache/ozone/pull/8869#issuecomment-3130009094
> >
> I explained why feature flags for features that are only triggered by
> admins are redundant. I’ve heard other devs from HDFS also express concern
> over everything in that project being behind feature flags and trying to
> avoid sending Ozone down that path as well. IMO if we want to indicate that
> this is not ready for prod use, it would be better to hide the CLI command
> <https://picocli.info/quick-guide.html#_hidden_options_and_parameters> and
> add a disclaimer to the help message indicating this. The current config
> flag approach provides no user facing information as to why it is disabled
> by default or what the implications of enabling it are. Additionally, why
> is this set tofalse by default but considered ready to merge with master?
>
> Ethan
>
> On Fri, Aug 8, 2025 at 4:53 AM Sammi Chen <sammic...@apache.org> wrote:
>
> > Hi Ozone developers,
> >
> > I would like to propose merging the HDDS-5713 Disk Balancer feature
> branch
> > into master.
> >
> > HDDS-5713 adds the support of the disk volume utilization balancer
> function
> > by selecting and moving containers from high utilized data volume to
> lower
> > utilized data volume, to achieve an all disk volumes' utilization even
> > state.
> >
> >
> > Feature Jira Link:
> > https://issues.apache.org/jira/browse/HDDS-5713
> >
> >
> > Checklist for feature branch merge:
> >
> >
> https://cwiki.apache.org/confluence/display/OZONE/Disk+Balancer+For+Datanode+-+HDDS-5713
> > <
> >
> https://cwiki.apache.org/confluence/display/OZONE/Disk+Balancer+For+Datanode+-+HDDS-5713
> > >
> >
> >
> > This vote will be open for at least a week.
> >
> > Thanks,
> > Sammi Chen
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscr...@ozone.apache.org
> > For additional commands, e-mail: dev-h...@ozone.apache.org
> >
>

Reply via email to