[ 
https://issues.apache.org/jira/browse/KUDU-3476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17746594#comment-17746594
 ] 

ASF subversion and git services commented on KUDU-3476:
-------------------------------------------------------

Commit ffddad8f49f33a40a34a877fd44f1bd6be64440a in kudu's branch 
refs/heads/branch-1.17.x from Mahesh Reddy
[ https://gitbox.apache.org/repos/asf?p=kudu.git;h=ffddad8f4 ]

KUDU-3476: Make replica placement range and table aware

Previously, the replica selection policy randomly selected
two tablet servers and placed the replica on the tserver
with fewer replicas. This could lead to hotspotting when
replicas from the same range land on the same set of
tservers, since the policy doesn't discriminate by range.
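
As an illustration only (not the Kudu implementation), the old
"power of two choices" behavior can be sketched as below. The names
pick_two_choices and place_many are hypothetical, and the sketch
ignores constraints such as keeping replicas of one tablet on
distinct tservers. It shows how a tserver with a low total count
attracts most replicas of a newly created range.

```python
import random
from collections import Counter


def pick_two_choices(total_replicas, rng):
    """Sample two distinct tservers; return the one with fewer total replicas.

    total_replicas maps a (hypothetical) tserver name to its replica count.
    Ties go to the first sampled tserver.
    """
    a, b = rng.sample(sorted(total_replicas), 2)
    return a if total_replicas[a] <= total_replicas[b] else b


def place_many(total_replicas, num_replicas, rng):
    """Place num_replicas replicas of one range; return per-tserver counts.

    Only the total count is consulted, so the range's own distribution
    is never taken into account.
    """
    totals = dict(total_replicas)  # work on a copy
    per_range = Counter()
    for _ in range(num_replicas):
        ts = pick_two_choices(totals, rng)
        totals[ts] += 1
        per_range[ts] += 1
    return per_range


if __name__ == "__main__":
    # ts-c was added recently and is nearly empty, so it wins every
    # comparison it takes part in and collects most of the new range.
    existing = {"ts-a": 100, "ts-b": 100, "ts-c": 0}
    print(place_many(existing, 12, random.Random(0)))
```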

With this patch, the replica selection policy ranks the
available tservers by range and table load and places the
replica accordingly. It ranks tservers by replicas per range
first, uses replicas per table as a tiebreaker, then uses
total replicas as the final tiebreaker. The range and table
load is the number of replicas that existed before placement
began plus the number of pending replicas placed on the
tserver during the current placement.
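
A minimal sketch of this ranking, under stated assumptions: the
TServerLoad class and the select_tserver and place_replica functions
are hypothetical names, not Kudu's API, and breaking a full tie by
tserver name is an assumption added only to keep the sketch
deterministic. Counts are updated after each placement so that
pending replicas influence the next choice.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class TServerLoad:
    """Hypothetical per-tserver bookkeeping: existing plus pending replicas."""
    name: str
    total: int = 0
    by_table: Counter = field(default_factory=Counter)
    by_range: Counter = field(default_factory=Counter)


def select_tserver(tservers, table_id, range_start_key):
    """Return the tserver with the fewest replicas of the range, then of
    the table, then overall. The name is an assumed last-resort tiebreak."""
    range_id = (table_id, range_start_key)
    return min(
        tservers,
        key=lambda ts: (ts.by_range[range_id], ts.by_table[table_id],
                        ts.total, ts.name),
    )


def place_replica(tservers, table_id, range_start_key):
    """Pick a tserver and record the pending replica on it."""
    ts = select_tserver(tservers, table_id, range_start_key)
    ts.by_range[(table_id, range_start_key)] += 1
    ts.by_table[table_id] += 1
    ts.total += 1
    return ts.name


if __name__ == "__main__":
    r = ("t1", b"k0")
    servers = [
        TServerLoad("ts-a", 5, Counter({"t1": 3}), Counter({r: 1})),
        TServerLoad("ts-b", 9, Counter({"t1": 1}), Counter({r: 1})),
        TServerLoad("ts-c", 2, Counter({"t1": 2}), Counter({r: 2})),
    ]
    print([place_replica(servers, "t1", b"k0") for _ in range(3)])
```

In this example ts-b is chosen first even though it holds the most
replicas overall, because range load and then table load are
compared before the total.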

The flag --enable_range_replica_placement on the master side
controls whether or not this new policy is used.
For this feature to work, both the range start key and the
table id of the table the range belongs to must be defined.
This is because multiple tables could have the same range
defined by the same range start key, so to
differentiate the ranges, the table id is required.
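
To illustrate why the table id is needed, the sketch below counts
replicas per range in two ways. The table ids, start keys, and the
range_key helper are made up for the example. Keying on the start
key alone merges ranges from different tables into one bucket, while
keying on the (table id, range start key) pair keeps them apart.

```python
from collections import Counter


def range_key(table_id, range_start_key):
    """Identify a range by its table id and its range start key together."""
    return (table_id, range_start_key)


# Hypothetical replicas on one tserver: (table id, range start key).
replicas = [
    ("table-a", b"\x00"),
    ("table-a", b"\x00"),
    ("table-b", b"\x00"),  # same start key, different table
]

# Start key alone: ranges of table-a and table-b collide.
by_start_key_only = Counter(key for _, key in replicas)

# Table id plus start key: each range is counted separately.
by_table_and_key = Counter(range_key(t, key) for t, key in replicas)
```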

The link to the design doc is here:
https://docs.google.com/document/d/1r-p0GW8lj2iLA3VGvZWAem09ykCmR5jEe8npUhJ07G8/edit?usp=sharing

Change-Id: I9caeb8d5547e946bfeb152a99e1ec034c3fa0a0f
Reviewed-on: http://gerrit.cloudera.org:8080/19931
Tested-by: Alexey Serbin <ale...@apache.org>
Reviewed-by: Alexey Serbin <ale...@apache.org>
(cherry picked from commit 10fdaf6a93a4bd3289d162b5f8351c4f0f5928c8)
Reviewed-on: http://gerrit.cloudera.org:8080/20195
Reviewed-by: Yingchun Lai <laiyingc...@apache.org>


> Make replica placement range and table aware
> --------------------------------------------
>
>                 Key: KUDU-3476
>                 URL: https://issues.apache.org/jira/browse/KUDU-3476
>             Project: Kudu
>          Issue Type: New Feature
>          Components: master, tserver
>            Reporter: Mahesh Reddy
>            Assignee: Mahesh Reddy
>            Priority: Major
>             Fix For: 1.17.0
>
>
> The current replica placement algorithm uses the power of two choices 
> algorithm. This algorithm randomly selects two tservers and places the 
> replica on the tserver with fewer replicas. This can lead to hotspotting 
> because it doesn't discriminate by range or table, so many tablets from the 
> same range/table can be distributed disproportionately.
> With this new feature, the replicas will be placed so that the tablets 
> per range are equally distributed amongst the available tservers. If 
> multiple tservers have the same number of replicas per range, then the 
> tserver with fewer replicas for that table will be selected. If multiple 
> tservers have the same number of replicas for that table, the tserver with 
> fewer total replicas will be chosen.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
