Re: [PR] Enhancing metadata API to return upsert partition to primary key count map for both controller and server APIs [pinot]

via GitHub Mon, 29 Jan 2024 15:21:17 -0800


klsince commented on code in PR #12334:
URL: https://github.com/apache/pinot/pull/12334#discussion_r1470354363



##########
pinot-integration-tests/src/test/java/org/apache/pinot/integration/tests/models/DummyTableUpsertMetadataManager.java:
##########
@@ -72,6 +73,11 @@ public PartitionUpsertMetadataManager getOrCreatePartitionManager(int partitionI
   public void stop() {
   }
 
+  @Override
+  public Map<Integer, Long> getPartitionToPrimaryKeyCount() {
+    return null;

Review Comment:
   If `null` is allowed here, then `RealtimeTableDataManager.getUpsertPartitionToPrimaryKeyCount()` can return `null` as well. In that case, annotate the method with `@Nullable`.



##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/upsert/ConcurrentMapTableUpsertMetadataManager.java:
##########
@@ -45,6 +46,15 @@ public void stop() {
     }
   }
 
+  @Override
+  public Map<Integer, Long> getPartitionToPrimaryKeyCount() {
+    Map<Integer, Long> partitionToPrimaryKeyCount = new HashMap<>();
+    for (Integer partitionID : _partitionMetadataManagerMap.keySet()) {

Review Comment:
   This can use `_partitionMetadataManagerMap.forEach()` instead of iterating over `keySet()`.
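
A minimal sketch of the `forEach` form, assuming each map value can report a per-partition key count. `PartitionManager` and `getNumPrimaryKeys()` are hypothetical stand-ins for illustration, not names taken from the PR:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ForEachSketch {
  // Hypothetical stand-in for the per-partition manager; not the Pinot interface.
  public interface PartitionManager {
    long getNumPrimaryKeys();
  }

  public static Map<Integer, Long> getPartitionToPrimaryKeyCount(
      Map<Integer, PartitionManager> managers) {
    Map<Integer, Long> partitionToPrimaryKeyCount = new HashMap<>();
    // forEach passes key and value together, so no separate get() per key is needed.
    managers.forEach((partitionId, manager) ->
        partitionToPrimaryKeyCount.put(partitionId, manager.getNumPrimaryKeys()));
    return partitionToPrimaryKeyCount;
  }

  public static void main(String[] args) {
    Map<Integer, PartitionManager> managers = new ConcurrentHashMap<>();
    managers.put(0, () -> 10L);
    managers.put(1, () -> 25L);
    System.out.println(getPartitionToPrimaryKeyCount(managers));
  }
}
```

Compared with looping over `keySet()`, this avoids a second lookup per partition and reads the key and its value in one step.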



##########
pinot-core/src/main/java/org/apache/pinot/core/data/manager/realtime/RealtimeTableDataManager.java:
##########
@@ -703,6 +704,18 @@ public TableUpsertMetadataManager getTableUpsertMetadataManager() {
     return _tableUpsertMetadataManager;
   }
 
+  /**
+   * Retrieves a mapping of partition id to the primary key count for the partition.
+   *
+   * @return A {@code Map} where keys are partition id and values are count of primary keys for that specific partition.
+   */
+  public Map<Integer, Long> getUpsertPartitionToPrimaryKeyCount() {
+    if (isUpsertEnabled()) {
+      return _tableUpsertMetadataManager.getPartitionToPrimaryKeyCount();
+    }
+    return new HashMap<>();

Review Comment:
   Use `Collections.emptyMap()` here instead of allocating a new `HashMap`.
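
A short sketch of what this changes, under the assumption that callers only read the result. `Collections.emptyMap()` returns an immutable empty map, so nothing is allocated per call, but any attempt to mutate the result throws. `getCounts`, `upsertEnabled` and `counts` are illustrative names, not the PR's fields:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class EmptyMapSketch {
  // Mirrors the shape of the reviewed method: real counts when enabled, empty otherwise.
  public static Map<Integer, Long> getCounts(boolean upsertEnabled, Map<Integer, Long> counts) {
    if (upsertEnabled) {
      return counts;
    }
    // Immutable empty map; avoids creating a fresh HashMap on every call.
    return Collections.emptyMap();
  }

  public static void main(String[] args) {
    Map<Integer, Long> empty = getCounts(false, new HashMap<>());
    System.out.println(empty.isEmpty());
    try {
      empty.put(0, 1L);
    } catch (UnsupportedOperationException e) {
      // Callers must treat the returned map as read-only.
      System.out.println("empty map is immutable");
    }
  }
}
```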



##########
pinot-controller/src/main/java/org/apache/pinot/controller/util/ServerSegmentMetadataReader.java:
##########
@@ -140,16 +143,17 @@ public TableMetadataInfo getAggregatedTableMetadataFromServer(String tableNameWi
       return v;
     });
 
-    // Since table segments may have multiple replicas, divide diskSizeInBytes, numRows and numSegments by numReplica
-    // to avoid double counting, for columnAvgLengthMap, columnAvgCardinalityMap and maxNumMultiValuesMap, dividing by
-    // numReplica is not needed since totalNumSegments already contains replicas.
+    // Since table segments may have multiple replicas, divide diskSizeInBytes, numRows, numSegments and primary key
+    // count by numReplica to avoid double counting, for columnAvgLengthMap, columnAvgCardinalityMap and
+    // maxNumMultiValuesMap, dividing by numReplica is not needed since totalNumSegments already contains replicas.

Review Comment:
   How about keeping every partition replica's count in the map instead of dividing by numReplica? Then this API could be used to spot any discrepancy between replicas.
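
One way to picture this suggestion, as a sketch only: keep the counts keyed by partition and then by server, and flag partitions whose replicas disagree. The nested map shape, the server names and `findDiscrepancies` are assumptions for illustration, not the PR's response format:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class ReplicaCountSketch {
  // Input shape (assumed): partitionId -> (server name -> primary key count).
  public static Set<Integer> findDiscrepancies(Map<Integer, Map<String, Long>> countsByPartition) {
    Set<Integer> mismatched = new TreeSet<>();
    countsByPartition.forEach((partitionId, countsByServer) -> {
      // More than one distinct count means the replicas of this partition disagree.
      if (new HashSet<>(countsByServer.values()).size() > 1) {
        mismatched.add(partitionId);
      }
    });
    return mismatched;
  }

  public static void main(String[] args) {
    Map<Integer, Map<String, Long>> counts = new HashMap<>();
    Map<String, Long> partition0 = new HashMap<>();
    partition0.put("server-a", 100L);
    partition0.put("server-b", 100L);
    Map<String, Long> partition1 = new HashMap<>();
    partition1.put("server-a", 100L);
    partition1.put("server-b", 97L);
    counts.put(0, partition0);
    counts.put(1, partition1);
    System.out.println(findDiscrepancies(counts));
  }
}
```

Dividing a summed count by numReplica would hide this kind of mismatch, since only the average survives aggregation.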




