Copilot commented on code in PR #19390:
URL: https://github.com/apache/druid/pull/19390#discussion_r3176491588
##########
processing/src/main/java/org/apache/druid/query/aggregation/SimpleLongAggregatorFactory.java:
##########
@@ -130,6 +130,13 @@ protected boolean useGetObject(ColumnSelectorFactory
columnSelectorFactory)
return AggregatorUtil.shouldUseObjectColumnAggregatorWrapper(fieldName,
columnSelectorFactory);
}
+ @Override
+ @Nullable
+ protected String getInputColumn()
+ {
+ return expression == null ? fieldName : null;
+ }
Review Comment:
This hook enables the new Pooled TopN nullability shortcut for long
aggregators too, but the added coverage only exercises
`DoubleSumAggregatorFactory`. Since `LongSumAggregatorTest` already exists, the
long path should get the same hasNulls=false / unknown-nullability assertions;
otherwise a long-specific regression here would go unnoticed.
##########
processing/src/main/java/org/apache/druid/query/aggregation/SimpleFloatAggregatorFactory.java:
##########
@@ -124,6 +124,13 @@ protected boolean useGetObject(ColumnSelectorFactory
columnSelectorFactory)
return AggregatorUtil.shouldUseObjectColumnAggregatorWrapper(fieldName,
columnSelectorFactory);
}
+ @Override
+ @Nullable
+ protected String getInputColumn()
+ {
+ return expression == null ? fieldName : null;
+ }
Review Comment:
This opt-in hook now affects all float aggregators in Pooled TopN, but the
PR doesn't add any float-based coverage for `factorizeBufferedForPooledTopN` or
the pooled intermediate-size calculation. Without one concrete float-aggregator
test, a regression in this path won't be caught even though similar float
aggregators already have unit tests in this module.
##########
processing/src/main/java/org/apache/druid/query/topn/TopNQueryEngine.java:
##########
@@ -172,7 +172,10 @@ private TopNMapFn getMapFn(
int numBytesPerRecord = 0;
for (AggregatorFactory aggregatorFactory : query.getAggregatorSpecs()) {
- numBytesPerRecord += aggregatorFactory.getMaxIntermediateSizeWithNulls();
+ numBytesPerRecord +=
PooledTopNAlgorithm.getMaxIntermediateSizeWithNullsForPooledTopN(
+ cursorInspector.getColumnInspector(),
+ aggregatorFactory
+ );
Review Comment:
This changes pooled-vs-heap TopN selection to use the reduced Pooled TopN
intermediate size, but the PR doesn't add any TopN test that exercises a query
near the pooled-buffer or aggregate-metric-first thresholds.
`PooledTopNAlgorithmTest` currently only covers cleanup, so a regression here
could silently disable the optimization without failing tests.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]