Copilot commented on code in PR #17789:
URL: https://github.com/apache/pinot/pull/17789#discussion_r2899040009


##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/upsert/BasePartitionUpsertMetadataManager.java:
##########
@@ -299,6 +299,18 @@ public void addSegment(ImmutableSegment segment) {
     }
   }
 
+  /**
+   * Creates a RecordInfoReader for the given segment. When comparison columns 
are configured, reads comparison values
+   * from the columns. When comparison columns are empty, uses segment 
creation time as the comparison value.
+   */
+  protected UpsertUtils.RecordInfoReader createRecordInfoReader(IndexSegment 
segment) {
+    if (_comparisonColumns.isEmpty()) {
+      long segmentCreationTime = getAuthoritativeCreationTime(segment);
+      return new UpsertUtils.RecordInfoReader(segment, _primaryKeyColumns, 
segmentCreationTime, _deleteRecordColumn);
+    }
+    return new UpsertUtils.RecordInfoReader(segment, _primaryKeyColumns, 
_comparisonColumns, _deleteRecordColumn);

Review Comment:
   `createRecordInfoReader()` is a good centralization, but this class still 
constructs `new UpsertUtils.RecordInfoReader(..., _comparisonColumns, ...)` 
directly in other code paths (e.g. revert/retry flows). If `_comparisonColumns` 
is empty (segment-creation-time mode), those paths will bypass the 
constant-comparison fallback and can break or behave inconsistently. Consider 
updating all `RecordInfoReader` creation sites in this class to use 
`createRecordInfoReader()` (or extend the helper to cover those call sites).



##########
pinot-core/src/main/java/org/apache/pinot/core/data/manager/BaseTableDataManager.java:
##########
@@ -784,9 +790,106 @@ public List<SegmentContext> 
getSegmentContexts(List<IndexSegment> selectedSegmen
       Map<String, String> queryOptions) {
     List<SegmentContext> segmentContexts = new 
ArrayList<>(selectedSegments.size());
     selectedSegments.forEach(s -> segmentContexts.add(new SegmentContext(s)));
+    if (isUpsertEnabled() && !QueryOptionsUtils.isSkipUpsert(queryOptions)) {
+      _tableUpsertMetadataManager.setSegmentContexts(segmentContexts, 
queryOptions);
+    }
     return segmentContexts;
   }
 
+  @Override
+  public boolean isUpsertEnabled() {
+    return _tableUpsertMetadataManager != null;
+  }
+
+  @VisibleForTesting
+  @Override
+  public TableUpsertMetadataManager getTableUpsertMetadataManager() {
+    return _tableUpsertMetadataManager;
+  }
+
+  @Override
+  public Map<Integer, Long> getPartitionToPrimaryKeyCount() {
+    if (isUpsertEnabled()) {
+      return _tableUpsertMetadataManager.getPartitionToPrimaryKeyCount();
+    }
+    return Collections.emptyMap();
+  }
+
+  protected void handleUpsert(ImmutableSegment immutableSegment, @Nullable 
SegmentZKMetadata zkMetadata) {
+    String segmentName = immutableSegment.getSegmentName();
+    _logger.info("Adding immutable segment: {} with upsert enabled", 
segmentName);
+
+    setZkCreationTimeIfAvailable(immutableSegment, zkMetadata);
+
+    Integer partitionId = SegmentUtils.getSegmentPartitionId(segmentName, 
_tableNameWithType, _helixManager, null);
+    Preconditions.checkNotNull(partitionId,
+        "Failed to get partition id for segment: %s (upsert-enabled table: 
%s). "
+            + "Segment must use LLCSegmentName or UploadedRealtimeSegmentName 
naming convention, "
+            + "or have partition metadata in ZK.", segmentName, 
_tableNameWithType);

Review Comment:
   `handleUpsert()` always calls 
`SegmentUtils.getSegmentPartitionId(segmentName, ...)`, which can trigger an 
extra ZK read even when `zkMetadata` is already provided by the caller. 
Consider using `SegmentUtils.getSegmentPartitionId(zkMetadata, ...)` when 
`zkMetadata != null` (and falling back to the segmentName-based method 
otherwise) to avoid redundant ZK calls and reduce the chance of failures when 
ZK metadata isn’t yet readable.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to