klsince commented on code in PR #12334:
URL: https://github.com/apache/pinot/pull/12334#discussion_r1470354363
##########
pinot-integration-tests/src/test/java/org/apache/pinot/integration/tests/models/DummyTableUpsertMetadataManager.java:
##########
@@ -72,6 +73,11 @@ public PartitionUpsertMetadataManager
getOrCreatePartitionManager(int partitionI
public void stop() {
}
+ @Override
+ public Map<Integer, Long> getPartitionToPrimaryKeyCount() {
+ return null;
Review Comment:
if `null` is allowed, then can return null for
RealtimeTableDataManager.getUpsertPartitionToPrimaryKeyCount() as well. And
annotate the method with `@Nullable`
##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/upsert/ConcurrentMapTableUpsertMetadataManager.java:
##########
@@ -45,6 +46,15 @@ public void stop() {
}
}
+ @Override
+ public Map<Integer, Long> getPartitionToPrimaryKeyCount() {
+ Map<Integer, Long> partitionToPrimaryKeyCount = new HashMap<>();
+ for (Integer partitionID : _partitionMetadataManagerMap.keySet()) {
Review Comment:
can do _partitionMetadataManagerMap.forEach()
##########
pinot-core/src/main/java/org/apache/pinot/core/data/manager/realtime/RealtimeTableDataManager.java:
##########
@@ -703,6 +704,18 @@ public TableUpsertMetadataManager
getTableUpsertMetadataManager() {
return _tableUpsertMetadataManager;
}
+ /**
+ * Retrieves a mapping of partition id to the primary key count for the
partition.
+ *
+ * @return A {@code Map} where keys are partition id and values are count of
primary keys for that specific partition.
+ */
+ public Map<Integer, Long> getUpsertPartitionToPrimaryKeyCount() {
+ if (isUpsertEnabled()) {
+ return _tableUpsertMetadataManager.getPartitionToPrimaryKeyCount();
+ }
+ return new HashMap<>();
Review Comment:
Collections.emptyMap()
##########
pinot-controller/src/main/java/org/apache/pinot/controller/util/ServerSegmentMetadataReader.java:
##########
@@ -140,16 +143,17 @@ public TableMetadataInfo
getAggregatedTableMetadataFromServer(String tableNameWi
return v;
});
- // Since table segments may have multiple replicas, divide
diskSizeInBytes, numRows and numSegments by numReplica
- // to avoid double counting, for columnAvgLengthMap,
columnAvgCardinalityMap and maxNumMultiValuesMap, dividing by
- // numReplica is not needed since totalNumSegments already contains
replicas.
+ // Since table segments may have multiple replicas, divide
diskSizeInBytes, numRows, numSegments and primary key
+ // count by numReplica to avoid double counting, for columnAvgLengthMap,
columnAvgCardinalityMap and
+ // maxNumMultiValuesMap, dividing by numReplica is not needed since
totalNumSegments already contains replicas.
Review Comment:
how about keeping all partition replicas' count in the map? then, we can
find out any discrepancy with this API
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]