[ 
https://issues.apache.org/jira/browse/KUDU-3476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17903936#comment-17903936
 ] 

Alexey Serbin commented on KUDU-3476:
-------------------------------------

Hi [~derek.huang],

{noformat}
Hi, I am wondering is there a plan to include this strategy in rebalancers? It 
can really help solve the hotspotting issue. 
{noformat}

The range-aware rebalancing is already available in the [kudu cluster rebalance 
CLI 
tool|https://kudu.apache.org/docs/command_line_tools_reference.html#cluster-rebalance],
but only on a per-table basis as of now.  To rebalance a whole cluster, it'd be 
necessary to run the tool for each table with the following extra options:
{noformat}
--enable_range_rebalancing
--tables=<name_of_a_table_with_range_hotspotting_issue>
{noformat}

It's feasible to put together a shell script to range-rebalance all the tables 
in a Kudu cluster with the help of other CLI tools such as [kudu table 
list|https://kudu.apache.org/docs/command_line_tools_reference.html#table-list].
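
As a rough illustration of the per-table loop described above, here is a minimal sketch in Python rather than shell. It assumes that {{kudu table list <master_addresses>}} prints one table name per line and that the {{kudu}} binary is on the PATH; the function names and the injectable {{run}} parameter are made up for this example and are not part of Kudu.

```python
import subprocess
from typing import Callable, List


def list_tables(masters: str, run: Callable = subprocess.run) -> List[str]:
    """Return the table names reported by `kudu table list`.

    Assumption: the tool prints one table name per line.
    """
    result = run(
        ["kudu", "table", "list", masters],
        check=True, capture_output=True, text=True,
    )
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


def rebalance_command(masters: str, table: str) -> List[str]:
    """Build the range-aware rebalance command line for a single table."""
    return [
        "kudu", "cluster", "rebalance", masters,
        "--enable_range_rebalancing",
        f"--tables={table}",
    ]


def rebalance_all(masters: str, run: Callable = subprocess.run) -> List[str]:
    """Run the range-aware rebalancer once per table, sequentially.

    Returns the names of the tables that were processed. With check=True,
    the loop stops at the first failing invocation.
    """
    processed = []
    for table in list_tables(masters, run):
        run(rebalance_command(masters, table), check=True)
        processed.append(table)
    return processed
```

Calling {{rebalance_all("master-1:7051")}} (a placeholder address) would rebalance the tables one at a time. The {{run}} parameter exists only so that the loop can be exercised without a live cluster.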

Extending the functionality of the tool to loop through all the cluster's 
tables itself should be a relatively simple project.  AFAIK, nobody is working 
on this yet.  If you'd like to contribute to Kudu, please feel free to take 
this on.  The contribution guidelines for the Apache Kudu project are 
[available here|https://kudu.apache.org/docs/contributing.html].

Thank you!

> Make replica placement range and table aware
> --------------------------------------------
>
>                 Key: KUDU-3476
>                 URL: https://issues.apache.org/jira/browse/KUDU-3476
>             Project: Kudu
>          Issue Type: New Feature
>          Components: master, tserver
>            Reporter: Mahesh Reddy
>            Assignee: Mahesh Reddy
>            Priority: Major
>             Fix For: 1.17.0
>
>
> The current replica placement algorithm is based on the power-of-two-choices 
> algorithm. It randomly selects two tservers and places the replica on the 
> one with fewer replicas. This can lead to hotspotting because it doesn't 
> discriminate by range or table, so many tablets from the same range/table 
> can be disproportionately distributed.
> With this new feature, the replicas will be placed in a way that the tablets 
> per range will be equally distributed amongst the available tservers. If 
> multiple tservers have the same number of replicas per range, then the 
> tserver with fewer replicas for that table will be selected. If multiple 
> tservers have the same number of replicas for that table, the tserver with 
> fewer total replicas will be chosen.
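
The tie-breaking order in the description above (per-range count first, then per-table count, then total count) can be sketched as a lexicographic comparison. This is only an illustration in Python, not Kudu's actual implementation; the {{TserverLoad}} type and its field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class TserverLoad:
    """Hypothetical per-tserver replica counts for one candidate placement."""
    name: str
    range_replicas: int   # replicas of the target range on this tserver
    table_replicas: int   # replicas of the target table on this tserver
    total_replicas: int   # all replicas on this tserver


def pick_tserver(candidates: Iterable[TserverLoad]) -> TserverLoad:
    """Pick the tserver with the fewest replicas for the range.

    Ties are broken by the per-table count, then by the total count.
    Tuples compare element by element, which gives exactly that order.
    """
    return min(
        candidates,
        key=lambda t: (t.range_replicas, t.table_replicas, t.total_replicas),
    )
```

For example, given candidates with counts (2, 5, 10), (1, 9, 30) and (1, 4, 50), the third one is chosen: it ties with the second on the range count and wins on the table count, even though it holds the most replicas overall.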



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
