Copilot commented on code in PR #17789:
URL: https://github.com/apache/pinot/pull/17789#discussion_r2899040009
##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/upsert/BasePartitionUpsertMetadataManager.java:
##########
@@ -299,6 +299,18 @@ public void addSegment(ImmutableSegment segment) {
}
}
+ /**
+ * Creates a RecordInfoReader for the given segment. When comparison columns
are configured, reads comparison values
+ * from the columns. When comparison columns are empty, uses segment
creation time as the comparison value.
+ */
+ protected UpsertUtils.RecordInfoReader createRecordInfoReader(IndexSegment
segment) {
+ if (_comparisonColumns.isEmpty()) {
+ long segmentCreationTime = getAuthoritativeCreationTime(segment);
+ return new UpsertUtils.RecordInfoReader(segment, _primaryKeyColumns,
segmentCreationTime, _deleteRecordColumn);
+ }
+ return new UpsertUtils.RecordInfoReader(segment, _primaryKeyColumns,
_comparisonColumns, _deleteRecordColumn);
Review Comment:
`createRecordInfoReader()` is a good centralization, but this class still
constructs `new UpsertUtils.RecordInfoReader(..., _comparisonColumns, ...)`
directly in other code paths (e.g. revert/retry flows). If `_comparisonColumns`
is empty (segment-creation-time mode), those paths will bypass the
constant-comparison fallback and can break or behave inconsistently. Consider
updating all `RecordInfoReader` creation sites in this class to use
`createRecordInfoReader()` (or extend the helper to cover those call sites).
##########
pinot-core/src/main/java/org/apache/pinot/core/data/manager/BaseTableDataManager.java:
##########
@@ -784,9 +790,106 @@ public List<SegmentContext>
getSegmentContexts(List<IndexSegment> selectedSegmen
Map<String, String> queryOptions) {
List<SegmentContext> segmentContexts = new
ArrayList<>(selectedSegments.size());
selectedSegments.forEach(s -> segmentContexts.add(new SegmentContext(s)));
+ if (isUpsertEnabled() && !QueryOptionsUtils.isSkipUpsert(queryOptions)) {
+ _tableUpsertMetadataManager.setSegmentContexts(segmentContexts,
queryOptions);
+ }
return segmentContexts;
}
+ @Override
+ public boolean isUpsertEnabled() {
+ return _tableUpsertMetadataManager != null;
+ }
+
+ @VisibleForTesting
+ @Override
+ public TableUpsertMetadataManager getTableUpsertMetadataManager() {
+ return _tableUpsertMetadataManager;
+ }
+
+ @Override
+ public Map<Integer, Long> getPartitionToPrimaryKeyCount() {
+ if (isUpsertEnabled()) {
+ return _tableUpsertMetadataManager.getPartitionToPrimaryKeyCount();
+ }
+ return Collections.emptyMap();
+ }
+
+ protected void handleUpsert(ImmutableSegment immutableSegment, @Nullable
SegmentZKMetadata zkMetadata) {
+ String segmentName = immutableSegment.getSegmentName();
+ _logger.info("Adding immutable segment: {} with upsert enabled",
segmentName);
+
+ setZkCreationTimeIfAvailable(immutableSegment, zkMetadata);
+
+ Integer partitionId = SegmentUtils.getSegmentPartitionId(segmentName,
_tableNameWithType, _helixManager, null);
+ Preconditions.checkNotNull(partitionId,
+ "Failed to get partition id for segment: %s (upsert-enabled table:
%s). "
+ + "Segment must use LLCSegmentName or UploadedRealtimeSegmentName
naming convention, "
+ + "or have partition metadata in ZK.", segmentName,
_tableNameWithType);
Review Comment:
`handleUpsert()` always calls
`SegmentUtils.getSegmentPartitionId(segmentName, ...)`, which can trigger an
extra ZK read even when `zkMetadata` is already provided by the caller.
Consider using `SegmentUtils.getSegmentPartitionId(zkMetadata, ...)` when
`zkMetadata != null` (and falling back to the segmentName-based method
otherwise) to avoid redundant ZK calls and reduce the chance of failures when
ZK metadata isn’t yet readable.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]