Hi All,

Few feedback from my side over design:
1. SCM act as proxy for CLI command like start / stop / config changes but
it has issue due to SCM async nature (API result consistency),
  a. Command results for successful apply or failure of change can not be
shown in *expected* *time* (for a happy path will take at least 2 HB time,
i.e. 1 min)
  b. Command rejection failure is high, based on SCM state (especially for
all DN cases) like DN state un-healthy or not available or restarted, SCM
restart, ...
IMO, we can remove this CLI command execution via SCM but can be direct
from Tool CLI itself as its redundant job which CLI also can do efficiently.

2. The DN side having a file as persistence of configuration is complex and
error prone. DB can be used instead to keep configuration as there for open
containers.
  a. file format versioning code for upgrade -- having additional code
  b. file consistency, recovery on failure and atomic write -- additional
code is there
  c. file path configuration
  d. amount of code is high 2-3kloc for this which can be handled in a few
100s.
  So I think we should replace the file with DB for keeping configuration
in an atomic way which DB has inbuilt features and *proto* provides
implicit *upgrade capability.*

IMO, we need to make the above 2 changes with respect to code
implementation. Others can share opinions.

Regards
Sumit

On Fri, Aug 22, 2025 at 9:33 PM Tsz-Wo Nicholas Sze <[email protected]>
wrote:

> > The SlidingWindow class is not related to disk balancer, ...
>
> Ethan, thanks for pointing it out!  Then, we should not remove it here.
>
> Tsz-Wo
>
>
> On Thu, Aug 21, 2025 at 1:18 PM Ethan Rose <[email protected]> wrote:
>
> > Thanks for working to update this for merge.
> >
> > The SlidingWindow class is not related to disk balancer, it has been
> added
> > to master to be used by the volume scanner. It was added in HDDS-12852
> > <https://issues.apache.org/jira/browse/HDDS-12852> and is being used in
> > HDDS-13108 <https://issues.apache.org/jira/browse/HDDS-13108> /
> > https://github.com/apache/ozone/pull/8843. Please don't remove it.
> >
> > Please update the wiki with the justification for why there are no
> > incompatible changes as described in this thread. Currently it still
> > says "There
> > should not be any incompatible changes introduced with this feature." but
> > provides no justification.
> >
> > Regarding the feature flag, the status communicated in this thread must
> > also be clear to end users. Until this is considered stable and enabled
> by
> > default, please:
> > - Update the user doc added in https://github.com/apache/ozone/pull/8837
> > to
> > indicate that the feature is considered experimental/beta. Additionally,
> I
> > don't see a mention of the feature flag in this doc.
> > - Update the config description added in
> > https://github.com/apache/ozone/pull/8869 to indicate the same.
> > - Update the CLI help message to indicate the same.
> > - Hide the CLI so it does not appear in help messages.
> >
> > Additionally, please make all the CLI flags kebab-case. They are using a
> > mix of kebab and camel case right now. Ideally the command itself would
> be
> > `disk-balancer` as well, but since we already have `containerbalancer` we
> > might want to leave that part as is for consistency.
> >
> > Thanks,
> > Ethan
> >
>


-- 
*Sumit Agrawal* | Senior Staff Engineer
cloudera.com <https://www.cloudera.com>
[image: Cloudera] <https://www.cloudera.com/>
[image: Cloudera on Twitter] <https://twitter.com/cloudera> [image:
Cloudera on Facebook] <https://www.facebook.com/cloudera> [image: Cloudera
on LinkedIn] <https://www.linkedin.com/company/cloudera>
------------------------------

Reply via email to