I can confirm that my group has run with Available Space for a number of years 
on the 2.7.x line quite successfully.

-Clay

From: weic...@cloudera.com.INVALID
At: 04/28/20 11:50:27
To: sodonn...@cloudera.com.invalid
Cc: hdfs-dev@hadoop.apache.org
Subject: Re: Changing the default Datanode Volume Choosing policy

+1 to switch it on in Hadoop 3.4.0

(1) It doesn't break any existing applications I am aware of.
(2) No noticeable performance regression has been observed in any case.

I feel compelled to make a feature the default if it is strictly better.
Hopefully we can make Hadoop easier to use in this way too.

On Tue, Apr 28, 2020 at 8:36 AM Stephen O'Donnell
<sodonn...@cloudera.com.invalid> wrote:

> Hi,
>
> A long time back there was a Jira raised to change the default volume
> choosing policy from Round Robin to Available Space:
>
> https://issues.apache.org/jira/browse/HDFS-8538
>
> At the time there were some objections / concerns about using available
> space.
>
> In the 5 years since then, at Cloudera we have seen about 1000 clusters
> running with Available Space enabled, and we have not seen any issues
> caused by it. It feels like this policy should be the default, as we have
> to change it more often than not.
>
> To recap, the Available Space policy places blocks on disks with more free
> space with a higher probability until all disks are within a threshold of
> free space from each other. After that it behaves in a round robin fashion.
> This means if a disk is replaced, it will slowly catch up to the usage of
> the others, and if you have disks of different sizes, they will self balance.
>
> I would like to ask:
>
> 1. Are there others in the community running the Available Space volume
> choosing policy, and if so, have you seen any issues, or does it run
> smoothly?
>
> 2. Does anyone have any strong objections to changing the default to
> Available Space from 3.4 onwards?
>
> Thanks,
>
> Stephen.
>
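
For anyone unfamiliar with the policy, the behaviour described in the recap
above can be sketched roughly as follows. This is a simplified illustration
in Java, not the actual Hadoop implementation: the class name
AvailableSpaceSketch, the choose() method, and the two constants (a 10 GiB
threshold and a 75% preference for emptier disks) are assumptions made up for
the example, and it assumes at least one volume is present.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Simplified illustration only; not the Hadoop source. */
class AvailableSpaceSketch {
    // Assumed values for illustration: disks count as balanced when their
    // free space differs by at most 10 GiB, and the emptier disks are
    // preferred for 75% of placements while the disks are unbalanced.
    static final long BALANCED_THRESHOLD_BYTES = 10L << 30;
    static final double HIGH_SPACE_PREFERENCE = 0.75;

    private final Random random;
    private int allCursor = 0;
    private int highCursor = 0;
    private int lowCursor = 0;

    AvailableSpaceSketch(Random random) {
        this.random = random;
    }

    /** Returns the index of the volume chosen for the next block. */
    int choose(long[] freeBytes) {
        long min = Long.MAX_VALUE;
        long max = Long.MIN_VALUE;
        for (long free : freeBytes) {
            min = Math.min(min, free);
            max = Math.max(max, free);
        }

        // All disks within the threshold of each other: plain round robin.
        if (max - min <= BALANCED_THRESHOLD_BYTES) {
            return Math.floorMod(allCursor++, freeBytes.length);
        }

        // Otherwise split the disks into "more free" and "less free" groups.
        List<Integer> high = new ArrayList<>();
        List<Integer> low = new ArrayList<>();
        for (int i = 0; i < freeBytes.length; i++) {
            if (freeBytes[i] > min + BALANCED_THRESHOLD_BYTES) {
                high.add(i);
            } else {
                low.add(i);
            }
        }

        // Prefer the emptier group with a fixed probability, and rotate
        // within whichever group was picked.
        if (random.nextDouble() < HIGH_SPACE_PREFERENCE) {
            return high.get(Math.floorMod(highCursor++, high.size()));
        }
        return low.get(Math.floorMod(lowCursor++, low.size()));
    }
}
```

In this sketch, a freshly replaced disk sitting next to three fuller disks
would receive roughly three quarters of new blocks until it comes within the
threshold of the others, after which placement falls back to round robin. The
real policy handles further details that are ignored here, such as whether a
volume has room for the block being written.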

