Thanks, Wei-Chiu, for the explanation. In that case, I give my +1 to the
proposed change.

73
Kihwal

On Tue, May 12, 2020 at 10:31 AM Wei-Chiu Chuang <weic...@apache.org> wrote:

> I don't think we have I/O-based balancing. That would surely make a great
> research project but it doesn't seem trivial to me.
>
> Also worth noting: the implementation doesn't try to achieve very
> fine-grained balance in space. As long as all volumes have available space
> within a threshold of each other (10GB by default), it falls back to the
> round-robin policy.
>
> On Tue, May 5, 2020 at 9:23 AM Kihwal Lee <kih...@verizonmedia.com.invalid>
> wrote:
>
> > Successfully running on 1,000 clusters over 5 years proves the feature is
> > stable. It does not, however, give me assurance that it will perform well
> > in our env.
> >
> > It would be nice to have some data on its performance. One obvious
> > concern is running into grossly unbalanced I/O load among drives. Since
> > our multi-tenant clusters have high utilization for both CPU and I/O,
> > ganging up on a drive tends to hurt job throughput and cause SLA misses.
> >
> > I would feel more comfortable if the feature took I/O balancing into
> > consideration at the same time. Sorry, I didn't look at the code, so if
> > it is already doing this, that's good news.
> >
> > Thanks
> > Kihwal
> >
> > On Thu, Apr 30, 2020 at 4:34 AM Stephen O'Donnell
> > <sodonn...@cloudera.com.invalid> wrote:
> >
> > > I am hoping Arpit Agarwal & Tsz-wo-Sze will comment here too, but I
> > > will ping them directly if they do not.
> > >
> > > 5 years ago, when they raised those concerns, the feature was new and
> > > little used. Their concerns, I think, were based on a theory that the
> > > feature might not perform well. However, since then the feature has
> > > proven stable and trouble-free. In supporting many of Cloudera's
> > > clusters over the last 5 years, and despite us having about 1000
> > > clusters using this setting, I don't recall a single issue caused by
> > > it. On the other hand, we fielded a lot of support issues around the
> > > default round robin policy, where smaller disks filled up, needing to
> > > run the disk balancer etc.
> > >
> > > As the feature seems to work well in practice, I would be inclined to
> > > leave what appears to be stable as it is, and only make changes if we
> > > see issues in real usage.
> > >
> > > On Thu, Apr 30, 2020 at 10:03 AM Ayush Saxena <ayush...@gmail.com>
> > > wrote:
> > >
> > > > Hey Stephen,
> > > > Thanx for initiating this.
> > > > Just had a look at HDFS-8538. It seems Arpit Agarwal & Tsz-wo-Sze had
> > > > a couple of concerns regarding write throughput and performance. It
> > > > concluded with a solution in the end, as mentioned here:
> > > >
> > > > https://issues.apache.org/jira/browse/HDFS-8538?focusedCommentId=14606094&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-14606094
> > > >
> > > > Do you plan to incorporate the same and then continue, or is the
> > > > concern raised then no longer there? Any pointers on those concerns
> > > > and comments? It would be great if you got a nod from them too.
> > > >
> > > > Thanx
> > > > -Ayush
> > > >
> > > > On Thu, 30 Apr 2020 at 09:32, Akira Ajisaka <aajis...@apache.org>
> > > > wrote:
> > > >
> > > >> +1 to change the default policy in Hadoop 3.4+.
> > > >>
> > > >> -Akira
> > > >>
> > > >> On Wed, Apr 29, 2020 at 1:28 AM Clay Baenziger (BLOOMBERG/ 919 3RD A)
> > > >> <cbaenzi...@bloomberg.net> wrote:
> > > >>
> > > >> > I can confirm that my group has run with Available Space for a
> > > >> > number of years on the 2.7.x line quite successfully.
> > > >> >
> > > >> > -Clay
> > > >> >
> > > >> > From: weic...@cloudera.com.INVALID At: 04/28/20 11:50:27 To:
> > > >> > sodonn...@cloudera.com.invalid
> > > >> > Cc:  hdfs-dev@hadoop.apache.org
> > > >> > Subject: Re: Changing the default Datanode Volume Choosing policy
> > > >> >
> > > >> > +1 to switch it on in Hadoop 3.4.0
> > > >> >
> > > >> > (1) It doesn't break any existing applications I am aware of.
> > > >> > (2) No noticeable performance regression has been observed in any case.
> > > >> >
> > > >> > I feel compelled to make a feature the default if it is strictly
> > > >> > better.
> > > >> > Hopefully we can make Hadoop easier to use in this way too.
> > > >> >
> > > >> > On Tue, Apr 28, 2020 at 8:36 AM Stephen O'Donnell
> > > >> > <sodonn...@cloudera.com.invalid> wrote:
> > > >> >
> > > >> > > Hi,
> > > >> > >
> > > >> > > A long time back there was a Jira raised to change the default
> > > >> > > volume choosing policy from Round Robin to Available Space:
> > > >> > >
> > > >> > > https://issues.apache.org/jira/browse/HDFS-8538
> > > >> > >
> > > >> > > At the time there were some objections / concerns about using
> > > >> > > available space.
> > > >> > >
> > > >> > > In the 5 years since then, at Cloudera we have seen about 1000
> > > >> > > clusters running with Available Space enabled, and we have not
> > > >> > > seen any issues caused by it. It feels like this policy should be
> > > >> > > the default, as we have to change it more often than not.
> > > >> > >
> > > >> > > To recap, the Available Space policy places blocks on disks with
> > > >> > > more free space with a higher probability until all disks are
> > > >> > > within a threshold of free space from each other. After that it
> > > >> > > behaves in a round robin fashion. This means if a disk is
> > > >> > > replaced, it will slowly catch up to the usage of the others, and
> > > >> > > if you have disks of different sizes, they will self-balance.
> > > >> > >
> > > >> > > I would like to ask:
> > > >> > >
> > > >> > > 1. Are there others in the community running the Available Space
> > > >> > > volume choosing policy, and if so, have you seen any issues, or
> > > >> > > does it run smoothly?
> > > >> > >
> > > >> > > 2. Does anyone have any strong objections to changing the default
> > > >> > > to Available Space from 3.4 onwards?
> > > >> > >
> > > >> > > Thanks,
> > > >> > >
> > > >> > > Stephen.
> > > >> > >
> > > >> >
> > > >> >
> > > >> >
> > > >>
> > > >
> > >
> >
>
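
For readers following the thread, the behaviour described above (prefer
disks with more free space until all disks are within a threshold of each
other, then plain round robin) can be sketched roughly as below. This is a
simplified illustration only, not the Hadoop implementation: the class
name AvailableSpaceSketch, its fields, and the 0.75 preference value are
assumptions made for the example, and the real policy has more detail than
this. Only the 10GB threshold comes from the thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical, simplified model of an "available space" volume chooser.
class AvailableSpaceSketch {
    // 10GB threshold, as mentioned in the thread.
    static final long THRESHOLD_BYTES = 10L * 1024 * 1024 * 1024;
    // Assumed probability of preferring the emptier volumes.
    static final double PREFERENCE = 0.75;

    private final Random random;
    private int balancedCursor, highCursor, lowCursor;

    AvailableSpaceSketch(Random random) {
        this.random = random;
    }

    /** Returns the index of the chosen volume, given free bytes per volume. */
    int choose(long[] freeBytes) {
        long min = Long.MAX_VALUE, max = Long.MIN_VALUE;
        for (long f : freeBytes) {
            min = Math.min(min, f);
            max = Math.max(max, f);
        }
        // All volumes within the threshold of each other: plain round robin.
        if (max - min <= THRESHOLD_BYTES) {
            return Math.floorMod(balancedCursor++, freeBytes.length);
        }
        // Otherwise split volumes into "low" and "high" free-space groups.
        List<Integer> low = new ArrayList<>();
        List<Integer> high = new ArrayList<>();
        for (int i = 0; i < freeBytes.length; i++) {
            if (freeBytes[i] <= min + THRESHOLD_BYTES) {
                low.add(i);
            } else {
                high.add(i);
            }
        }
        // Prefer the high group with the configured probability,
        // rotating round robin inside whichever group is picked.
        if (random.nextDouble() < PREFERENCE) {
            return high.get(Math.floorMod(highCursor++, high.size()));
        }
        return low.get(Math.floorMod(lowCursor++, low.size()));
    }

    public static void main(String[] args) {
        long gb = 1024L * 1024 * 1024;
        // One freshly replaced (mostly empty) disk and two fuller disks.
        long[] free = {500 * gb, 100 * gb, 105 * gb};
        AvailableSpaceSketch policy = new AvailableSpaceSketch(new Random(42));
        int[] counts = new int[free.length];
        for (int i = 0; i < 1000; i++) {
            counts[policy.choose(free)]++;
        }
        for (int i = 0; i < counts.length; i++) {
            System.out.println("volume " + i + " chosen " + counts[i] + " times");
        }
    }
}
```

In this sketch the emptier disk receives most new blocks while the
imbalance exceeds the threshold, which matches the "slowly catch up"
behaviour described in the original proposal. It says nothing about I/O
load, which is the concern raised earlier in the thread.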
