yashmayya opened a new pull request, #16140:
URL: https://github.com/apache/pinot/pull/16140
- Currently, if a table rebalance results in instance reassignment but no
segment rebalance, we end up writing some incorrect rebalance progress stats to
ZK. For instance (notice `startTimeMs` and `timeToFinishInSeconds`):
```
{
"id": "/CONTROLLER_JOBS/TABLE_REBALANCE",
"simpleFields": {},
"mapFields": {
"7d45b962-c001-4eec-a54e-c0ed3a791d31": {
"jobId": "7d45b962-c001-4eec-a54e-c0ed3a791d31",
"submissionTimeMs": "1750238019928",
"jobType": "TABLE_REBALANCE",
"REBALANCE_PROGRESS_STATS":
"{\"status\":\"DONE\",\"startTimeMs\":0,\"timeToFinishInSeconds\":1750238019,\"completionStatusMsg\":\"Instance
reassigned but table is already balanced\",\"rebalanceProgressStatsOverall\":{\"totalSegmentsToBeAdded\":0,\"totalSegmentsToBeDeleted\":0,\"totalRemainingSegmentsToBeAdded\":0,\"totalRemainingSegmentsToBeDeleted\":0,\"totalRemainingSegmentsToConverge\":0,\"totalCarryOverSegmentsToBeAdded\":0,\"totalCarryOverSegmentsToBeDeleted\":0,\"totalUniqueNewUntrackedSegmentsDuringRebalance\":0,\"percentageRemainingSegmentsToBeAdded\":0.0,\"percentageRemainingSegmentsToBeDeleted\":0.0,\"estimatedTimeToCompleteAddsInSeconds\":0.0,\"estimatedTimeToCompleteDeletesInSeconds\":0.0,\"averageSegmentSizeInBytes\":0,\"totalEstimatedDataToBeMovedInBytes\":0,\"startTimeMs\":0},\"rebalanceProgressStatsCurrentStep\":{\"totalSegmentsToBeAdded\":0,\"totalSegmentsToBeDeleted\":0,\"totalRemainingSegmentsToBeAdded\":0,\"totalRemainingSegmentsToBeDeleted\":0,\"totalRemainingSegmentsToConverge\":0,\"totalCarryOverSegmentsToBeAdded\":0,\"totalCarryOverSegmentsToBeDeleted\":0,\"totalUniqueNewUntrackedSegmentsDuringRebalance\":0,\"percentageRemainingSegmentsToBeAdded\":0.0,\"percentageRemainingSegmentsToBeDeleted\":0.0,\"estimatedTimeToCompleteAddsInSeconds\":0.0,\"estimatedTimeToCompleteDeletesInSeconds\":0.0,\"averageSegmentSizeInBytes\":0,\"totalEstimatedDataToBeMovedInBytes\":0,\"startTimeMs\":0},\"initialToTargetStateConvergence\":{\"_segmentsMissing\":0,\"_segmentsToRebalance\":0,\"_percentSegmentsToRebalance\":0.0,\"_replicasToRebalance\":0},\"currentToTargetConvergence\":{\"_segmentsMissing\":0,\"_segmentsToRebalance\":0,\"_percentSegmentsToRebalance\":0.0,\"_replicasToRebalance\":0},\"externalViewToIdealStateConvergence\":{\"_segmentsMissing\":0,\"_segmentsToRebalance\":0,\"_percentSegmentsToRebalance\":0.0,\"_replicasToRebalance\":0}}",
"REBALANCE_CONTEXT":
"{\"attemptId\":1,\"jobId\":\"7d45b962-c001-4eec-a54e-c0ed3a791d31\",\"config\":{\"maxAttempts\":3,\"bestEfforts\":false,\"downtime\":false,\"bootstrap\":false,\"dryRun\":false,\"preChecks\":false,\"lowDiskMode\":false,\"includeConsuming\":true,\"updateTargetTier\":false,\"batchSizePerServer\":-1,\"reassignInstances\":true,\"externalViewStabilizationTimeoutInMs\":3600000,\"minimizeDataMovement\":\"ENABLE\",\"externalViewCheckIntervalInMs\":1000,\"minAvailableReplicas\":-1,\"heartbeatIntervalInMs\":300000,\"heartbeatTimeoutInMs\":3600000,\"retryInitialDelayInMs\":300000},\"originalJobId\":\"7d45b962-c001-4eec-a54e-c0ed3a791d31\",\"allowRetries\":true}",
"tableName": "upsertMeetupRsvp_REALTIME"
}
},
"listFields": {}
}
```
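- The reason is that we're calling `TableRebalanceObserver::onSuccess`
without ever calling `TableRebalanceObserver::onTrigger` with the
`START_TRIGGER`. `startTimeMs` is therefore left at 0, and
`timeToFinishInSeconds` (1750238019) ends up equal to the epoch time in seconds
(compare `submissionTimeMs`, 1750238019928) rather than an elapsed duration.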
- If instances are reassigned and there's no actual segment rebalance being
done, there's no reason to persist stats in ZK, and the result can simply be
returned to the user directly.
- The other cases where we're calling some `TableRebalanceObserver` method
before the start trigger are:
  - Segment assignment and instance assignment are both unchanged. In this
  case, the dry run rebalance before the actual rebalance will be a no-op and we
  won't run the actual rebalance itself at all (see
  [here](https://github.com/apache/pinot/blob/a91d6af17c651f139a4fdcc0e090de3c91eb8b8a/pinot-controller/src/main/java/org/apache/pinot/controller/api/resources/PinotTableRestletResource.java#L709-L749)).
  So we won't store any stats in ZK for this case.
  - Downtime rebalance: we don't use ZK-based progress tracking for these
  rebalances (see
  [here](https://github.com/apache/pinot/blob/a91d6af17c651f139a4fdcc0e090de3c91eb8b8a/pinot-controller/src/main/java/org/apache/pinot/controller/api/resources/PinotTableRestletResource.java#L705-L709)).
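
To illustrate how the bad values in the ZK record above can arise, here is a minimal sketch of the failure mode. It is not Pinot's actual code: `RebalanceStatsSketch`, `ProgressStats`, `Observer`, and the helper methods are hypothetical names, and the elapsed-time formula is an assumption inferred from the stats shown above (where `timeToFinishInSeconds` matches `submissionTimeMs / 1000`).

```java
import java.util.function.LongSupplier;

// Hypothetical, heavily simplified stand-in for a rebalance progress observer.
// None of these names come from the Pinot code base.
class RebalanceStatsSketch {

  static final class ProgressStats {
    long startTimeMs;            // stays 0 until the start trigger is seen
    long timeToFinishInSeconds;
    String status = "NOT_STARTED";
  }

  static final class Observer {
    private final ProgressStats stats = new ProgressStats();
    private final LongSupplier clock;

    Observer(LongSupplier clock) {
      this.clock = clock;
    }

    // Analogue of receiving the start trigger: records when the rebalance began.
    void onStartTrigger() {
      stats.startTimeMs = clock.getAsLong();
      stats.status = "IN_PROGRESS";
    }

    // Assumed formula: elapsed time is "now minus start". If the start trigger
    // never fired, startTimeMs is still 0 and the result is the epoch time in seconds.
    ProgressStats onSuccess() {
      stats.timeToFinishInSeconds = (clock.getAsLong() - stats.startTimeMs) / 1000L;
      stats.status = "DONE";
      return stats;
    }
  }

  // Success reported without a preceding start trigger (the buggy sequence).
  static long finishSecondsWithoutStart(long nowMs) {
    Observer observer = new Observer(() -> nowMs);
    return observer.onSuccess().timeToFinishInSeconds;
  }

  // Start trigger followed by success (the expected sequence).
  static long finishSecondsWithStart(long startMs, long endMs) {
    long[] now = {startMs};
    Observer observer = new Observer(() -> now[0]);
    observer.onStartTrigger();
    now[0] = endMs;
    return observer.onSuccess().timeToFinishInSeconds;
  }

  public static void main(String[] args) {
    // Uses the submissionTimeMs from the record above as the "current time".
    System.out.println(finishSecondsWithoutStart(1750238019928L)); // 1750238019
    System.out.println(finishSecondsWithStart(1750238019928L, 1750238024928L)); // 5
  }
}
```

Under these assumptions, the first call reproduces the `timeToFinishInSeconds` value from the example, which is consistent with the start trigger never having fired for the "instance reassigned but table is already balanced" path.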