924060929 opened a new pull request, #64633:
URL: https://github.com/apache/doris/pull/64633

   ## Proposed changes
   
   When pushing a `TopN`/`Limit` down through `Union`/`Join`/`Window`, the
   child operator's limit is computed as `limit + offset`. Both are
   non-negative `long`s, so when they are close to `BIGINT_MAX` (e.g.
   `LIMIT 9223372036854775807 OFFSET 9223372036854775807`) the addition
   overflows the `long` range and wraps to a negative value.
   
   A negative limit produces an illegal plan. On the BE side it is
   reinterpreted as a huge unsigned value (`uint64_t limit = _offset + _limit`
   in the sorter), so a trivial query that should immediately return an empty
   set instead runs until it hits the query timeout.
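
For illustration only (this is not code from the patch), the wraparound can be reproduced in plain Java. The class and method names below are hypothetical, and `Long.toUnsignedString` merely mimics how a `uint64_t` would read the same bit pattern:

```java
public class LimitOverflowDemo {
    // Hypothetical stand-in for the pre-fix child-limit computation:
    // plain long addition, which silently wraps on overflow.
    public static long naiveChildLimit(long limit, long offset) {
        return limit + offset;
    }

    public static void main(String[] args) {
        long wrapped = naiveChildLimit(Long.MAX_VALUE, Long.MAX_VALUE);
        // The signed view is negative.
        System.out.println(wrapped);
        // The same 64 bits read as an unsigned value are close to 2^64.
        System.out.println(Long.toUnsignedString(wrapped));
    }
}
```

With both operands at `Long.MAX_VALUE` the signed result is `-2`, and the same bits read as unsigned are `18446744073709551614`.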
   
   ### Minimal reproducer (no table required)
   
   ```sql
   select count(*) as c from (
       select id from (
           select 1 as id union all select 2 as id union all select 3 as id
       ) t
       order by id limit 9223372036854775807 offset 9223372036854775807
   ) s;
   ```
   
   - Original planner, or Nereids with `PUSH_DOWN_TOP_N_THROUGH_UNION`
     disabled: returns `0` immediately (correct, since the offset is far
     beyond the 3 input rows).
   - Nereids with the rule enabled: times out.
   
   ### Fix
   
   Add `Utils.saturatedAdd(long, long)`, which clamps to `Long.MAX_VALUE` on
   positive overflow instead of wrapping, and use it everywhere a child limit
   is derived from `limit + offset`:
   
   - `PushDownTopNThroughUnion` / `PushDownTopNDistinctThroughUnion`
   - `PushDownTopNThroughJoin` / `PushDownTopNDistinctThroughJoin`
   - `PushDownTopNThroughWindow`
   - `SplitLimit`
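
A minimal sketch of what such a saturating add could look like follows. It is not the code from this PR: the class name is hypothetical, and clamping to `Long.MIN_VALUE` on negative overflow is an assumption, since the description above only specifies the positive-overflow behavior.

```java
public final class SaturatedAddSketch {
    private SaturatedAddSketch() {
    }

    // Adds two longs, clamping instead of wrapping on overflow.
    public static long saturatedAdd(long a, long b) {
        long sum = a + b;
        // Overflow happened iff both operands share a sign and the sum's
        // sign differs from it (the same check Math.addExact performs).
        if (((a ^ sum) & (b ^ sum)) < 0) {
            // Assumption: negative overflow clamps to Long.MIN_VALUE.
            return a < 0 ? Long.MIN_VALUE : Long.MAX_VALUE;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(saturatedAdd(10L, 5L));
        System.out.println(saturatedAdd(Long.MAX_VALUE, Long.MAX_VALUE));
    }
}
```

A call site in a push-down rule would then presumably derive the child limit as `saturatedAdd(limit, offset)` rather than `limit + offset`.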
   
   `Long.MAX_VALUE` ("all rows") is the semantically correct upper bound: no
   relation can hold more than `Long.MAX_VALUE` rows, so the pushed-down limit
   never drops rows the parent may need, and the parent operator still applies
   the real `limit`/`offset`. For non-overflowing inputs the behavior is
   unchanged.
   
   ### Tests
   
   - `UtilsTest#testSaturatedAdd` covers normal, positive-overflow and
     negative-overflow cases.
   - A regression case in `push_down_top_n_through_union` asserts the
     reproducer returns an empty result (count `0`) without timing out.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

