I am hoping Arpit Agarwal & Tsz-wo-Sze will comment here too, but I will
ping them directly if they do not.

Five years ago, when they raised those concerns, the feature was new and
little used. Their concerns, I think, were based on a theory that the
feature might not perform well. However, since then the feature has proven
stable and trouble-free. In supporting Cloudera's clusters over the last 5
years, with about 1000 of them using this setting, I don't recall a single
issue caused by it. On the other hand, we fielded a lot of support issues
around the default round robin policy, where smaller disks filled up and
the disk balancer had to be run, etc.

As the feature seems to work well in practice, I would be inclined to leave
what appears to be stable as it is, and only make changes if we see issues
in real usage.

On Thu, Apr 30, 2020 at 10:03 AM Ayush Saxena <ayush...@gmail.com> wrote:

> Hey Stephen,
> Thanx for initiating this.
> Just had a look at HDFS-8538. It seems Arpit Agarwal & Tsz-wo-Sze had a
> couple of concerns regarding the write throughput and performance. It
> concluded with a solution in the end, as mentioned here:
>
> https://issues.apache.org/jira/browse/HDFS-8538?focusedCommentId=14606094&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-14606094
>
> Do you plan to incorporate that solution and then continue, or does the
> concern raised then no longer apply? Any pointers on those concerns and
> comments? It would be great if you got a nod from them too.
>
> Thanx
> -Ayush
>
> On Thu, 30 Apr 2020 at 09:32, Akira Ajisaka <aajis...@apache.org> wrote:
>
>> +1 to change the default policy in Hadoop 3.4+.
>>
>> -Akira
>>
>> On Wed, Apr 29, 2020 at 1:28 AM Clay Baenziger (BLOOMBERG/ 919 3RD A) <
>> cbaenzi...@bloomberg.net> wrote:
>>
>> > I can confirm that my group has run with Available Space for a number of
>> > years on the 2.7.x line quite successfully.
>> >
>> > -Clay
>> >
>> > From: weic...@cloudera.com.INVALID At: 04/28/20 11:50:27 To:
>> > sodonn...@cloudera.com.invalid
>> > Cc:  hdfs-dev@hadoop.apache.org
>> > Subject: Re: Changing the default Datanode Volume Choosing policy
>> >
>> > +1 to switch it on in Hadoop 3.4.0
>> >
>> > (1) it doesn't break any existing applications I am aware of.
>> > (2) No noticeable performance regression in any cases observed.
>> >
>> > I feel compelled to make a feature the default if it is strictly better.
>> > Hopefully we can make Hadoop easier to use in this way too.
>> >
>> > On Tue, Apr 28, 2020 at 8:36 AM Stephen O'Donnell
>> > <sodonn...@cloudera.com.invalid> wrote:
>> >
>> > > Hi,
>> > >
>> > > A long time back there was a Jira raised to change the default volume
>> > > choosing policy from Round Robin to Available Space:
>> > >
>> > > https://issues.apache.org/jira/browse/HDFS-8538
>> > >
>> > > At the time there were some objections / concerns about using
>> > > available space.
>> > >
>> > > In the 5 years since then, at Cloudera we have seen about 1000
>> > > clusters running with Available Space enabled, and we have not seen
>> > > any issues caused by it. It feels like this policy should be the
>> > > default, as we have to change it more often than not.
>> > >
>> > > To recap, the Available Space policy places blocks on disks with more
>> > > free space with a higher probability until all disks are within a
>> > > threshold of free space from each other. After that it behaves in a
>> > > round robin fashion. This means if a disk is replaced, it will slowly
>> > > catch up to the usage of the others, and if you have disks of
>> > > different sizes, they will self balance.
>> > >
>> > > I would like to ask:
>> > >
>> > > 1. Are there others in the community running the Available Space
>> > > volume choosing policy, and if so, have you seen any issues, or does
>> > > it run smoothly?
>> > >
>> > > 2. Does anyone have any strong objections to changing the default to
>> > > Available Space from 3.4 onwards?
>> > >
>> > > Thanks,
>> > >
>> > > Stephen.
>> > >
>> >
>> >
>> >
>>
>
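
For anyone new to the thread, the recap in the quoted message above can be
sketched roughly as follows. This is a simplified illustration, not the
Hadoop implementation: the class name AvailableSpaceSketch, the constants
THRESHOLD_BYTES and PREFERENCE, and the "roll" parameter (a stand-in for a
random draw, passed in so the example is deterministic) are all assumptions
made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified illustration of the idea only; NOT the Hadoop implementation. */
final class AvailableSpaceSketch {
    // Hypothetical values chosen for illustration.
    static final long THRESHOLD_BYTES = 10L * 1024 * 1024 * 1024; // 10 GiB
    static final double PREFERENCE = 0.75; // chance of favouring emptier disks

    // Separate round robin cursors for each group of volumes.
    private int nextBalanced = 0;
    private int nextHigh = 0;
    private int nextLow = 0;

    /**
     * Returns the index of the chosen volume in freeBytes.
     * roll is a value in [0, 1) that would normally come from a random source.
     */
    int chooseVolume(long[] freeBytes, double roll) {
        if (freeBytes.length == 0) {
            throw new IllegalArgumentException("no volumes");
        }
        long min = Long.MAX_VALUE;
        long max = Long.MIN_VALUE;
        for (long f : freeBytes) {
            min = Math.min(min, f);
            max = Math.max(max, f);
        }

        // All volumes within the threshold of each other: plain round robin.
        if (max - min <= THRESHOLD_BYTES) {
            return nextBalanced++ % freeBytes.length;
        }

        // Otherwise split into volumes with much more free space ("high")
        // and the rest ("low"). Both lists are non-empty on this path.
        List<Integer> high = new ArrayList<>();
        List<Integer> low = new ArrayList<>();
        for (int i = 0; i < freeBytes.length; i++) {
            if (freeBytes[i] > min + THRESHOLD_BYTES) {
                high.add(i);
            } else {
                low.add(i);
            }
        }

        // Prefer the emptier group with probability PREFERENCE,
        // going round robin inside whichever group is picked.
        if (roll < PREFERENCE) {
            return high.get(nextHigh++ % high.size());
        }
        return low.get(nextLow++ % low.size());
    }

    public static void main(String[] args) {
        long gib = 1024L * 1024 * 1024;
        AvailableSpaceSketch policy = new AvailableSpaceSketch();
        long[] uneven = {50 * gib, 500 * gib, 55 * gib}; // one freshly replaced disk
        System.out.println(policy.chooseVolume(uneven, 0.10)); // low roll: emptier disk
        System.out.println(policy.chooseVolume(uneven, 0.90)); // high roll: fuller disks
    }
}
```

Because the emptier group is chosen more often than not, a replaced or
larger disk gradually catches up with the others, after which the sketch
falls back to ordinary round robin.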
