github-actions[bot] commented on code in PR #64633:
URL: https://github.com/apache/doris/pull/64633#discussion_r3465721367
##########
fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/MergeLimits.java:
##########
@@ -54,7 +56,16 @@ public Rule build() {
}).toRule(RuleType.MERGE_LIMITS);
}
+ /**
+ * Merge two consecutive limits' offsets into a single offset. Consecutive
limits must be merged,
+ * and an overflowing combined offset cannot be represented as a single
offset, so fail fast
+ * instead of wrapping to a negative offset.
+ */
public static long mergeOffset(long upperOffset, long bottomOffset) {
+ if (Utils.addOverflows(upperOffset, bottomOffset)) {
Review Comment:
This makes valid nested limits fail during rewrite. For example:
```text
LogicalLimit(limit=10, offset=1)
LogicalLimit(limit=Long.MAX_VALUE, offset=Long.MAX_VALUE)
Scan
```
Both offsets are individually valid, but `MergeLimits` calls `mergeOffset(1,
Long.MAX_VALUE)` and now throws. The unmerged plan has well-defined semantics:
the inner offset already consumes every row a Doris plan can represent, so the
result is empty. The same applies whenever `bottomOffset + upperOffset` exceeds
`Long.MAX_VALUE`: no relation can have enough rows to survive both offsets.
Please do not turn this into a user-visible analysis error; skip the merge on
overflow, or merge to an empty limit such as `LIMIT 0 OFFSET 0`, instead of
throwing.
##########
fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/PushDownLimitDistinctThroughJoin.java:
##########
@@ -64,6 +66,12 @@ public List<Rule> buildRules() {
logicalLimit(logicalAggregate(logicalProject(logicalJoin()).when(LogicalProject::isAllSlots))
.when(LogicalAggregate::isDistinct))
.then(limit -> {
+ // limit + offset overflowing means no child can
hold that many rows, so the
+ // push-down cannot reduce anything; skip it. (The
direct branch is gated the
+ // same way via topn_opt_limit_threshold.)
+ if (Utils.addOverflows(limit.getLimit(),
limit.getOffset())) {
Review Comment:
This branch still bypasses the session threshold used by the direct
limit-distinct join rewrite. The direct `Limit -> Aggregate(distinct) -> Join`
pattern checks `ConnectContext.get()`, `!addOverflows(...)`, and
`topnOptLimitThreshold >= limit + offset` before calling
`pushLimitThroughJoin`, but the `Limit -> Aggregate(distinct) ->
Project(allSlots) -> Join` branch added here only checks overflow and then
reaches the same helper.
For example, with the default `topn_opt_limit_threshold=1024`:
```text
LogicalLimit(limit=2048, offset=0)
LogicalAggregate(distinct)
LogicalProject(allSlots)
LeftOuterJoin(...)
```
can still push `LogicalLimit(limit=2048, offset=0)` plus the distinct
aggregate below the join, while the equivalent plan without the wrapper project
is rejected by the direct branch. Please apply the same `ConnectContext` and
threshold predicate here, or centralize that gate inside `pushLimitThroughJoin`
before constructing child limits.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]