yashmayya opened a new pull request, #15930:
URL: https://github.com/apache/pinot/pull/15930

   - Currently, the minimize data movement algorithm for instance assignment 
is not very effective for realtime tables where the number of partitions per 
replica group configured in the instance assignment config is 0 or 1 (or, in 
practice, any value that is out of sync with the actual number of stream 
partitions).
   - For example, see the scenario described in 
https://github.com/apache/pinot/issues/14151. If `numPartitions` in 
`replicaGroupPartitionConfig` in the instance assignment config is set to 0 or 
1, each replica group is created with a single partition and the actual stream 
partitions are assigned uniformly across the instances using a simple 
modulo-based mechanism (see 
[here](https://github.com/apache/pinot/blob/cc3eb2e9c296761aa3db7e7e6a9fa7d66bd213bc/pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/assignment/segment/RealtimeSegmentAssignment.java#L147-L159)).
 As a result, when instances are scaled up or down, a number of partitions can 
be moved across instances even when `minimizeDataMovement` is enabled for the 
rebalance. The alternative is inconvenient for many users: they would have to 
always specify an explicit number of partitions in the instance assignment 
config and keep it in sync with any repartitioning upstream.
   - This patch introduces a new `ImplicitRealtimeTablePartitionSelector` 
strategy for instance assignment, where the number of partitions within each 
replica group is determined using the number of partitions from the source 
stream for a realtime table. It also enforces the use of 1 instance per 
partition (per replica group) for upsert tables.
   - Note that repartitioning upstream won't be automatically detected; 
instead, a rebalance (with reassign instances enabled) should be triggered, 
which will update the `INSTANCE_PARTITIONS` as well as the segment assignment.
   - A test has been added to demonstrate how this strategy works in a scenario 
where the source stream is repartitioned and then new instances are added to 
take over the new partitions (with older partitions not being moved across 
instances).
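
The data movement problem described above can be illustrated with a small, self-contained sketch. It is not the Pinot implementation: `ModuloAssignmentDemo`, `instanceFor` and `movedPartitions` are hypothetical names, and the mapping `partitionId % numInstances` is a simplified stand-in for the modulo-based mechanism linked above. It only shows why a modulo mapping tends to reshuffle partitions when the instance count changes.

```java
import java.util.ArrayList;
import java.util.List;

public class ModuloAssignmentDemo {

    // Assumption: simplified stand-in for a modulo-based mapping, where
    // stream partition p is served by the instance at index p % numInstances.
    static int instanceFor(int partitionId, int numInstances) {
        return partitionId % numInstances;
    }

    // Returns the ids of the partitions whose instance index changes when the
    // number of instances goes from oldInstances to newInstances.
    static List<Integer> movedPartitions(int numPartitions, int oldInstances, int newInstances) {
        List<Integer> moved = new ArrayList<>();
        for (int p = 0; p < numPartitions; p++) {
            if (instanceFor(p, oldInstances) != instanceFor(p, newInstances)) {
                moved.add(p);
            }
        }
        return moved;
    }

    public static void main(String[] args) {
        // 8 stream partitions, scaling one replica group from 4 to 6 instances.
        // Prints [4, 5, 6, 7]
        System.out.println(movedPartitions(8, 4, 6));
    }
}
```

In this toy setup, partitions 4 and 5 move to the two new instances, but partitions 6 and 7 also move between instances that already existed (2 to 0 and 3 to 1). That second kind of movement is what a per-stream-partition instance assignment can avoid.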


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

