github-actions[bot] commented on PR #61224:
URL: https://github.com/apache/doris/pull/61224#issuecomment-4044667969
## Code Review Summary — No Issues Found
### PR Goal
Strip redundant widening integer casts (e.g., `CAST(smallint_col AS INT)`)
in `SumLiteralRewrite.extractSumLiteral()` so that `SUM`/`COUNT` aggregates
operate on the original narrow column directly. This enables reuse of existing
`SUM(col)` when `SUM(col+1)`, `SUM(col+2)` etc. share the same base column
(ClickBench Q29 pattern).
### Critical Checkpoint Conclusions
1. **Does the code accomplish the goal? Is there a test that proves it?** —
Yes. The `stripWideningIntegerCast()` method correctly identifies and strips
implicit widening integer casts. Unit tests (`testStripWideningIntegerCast`,
`testStripWideningCastWithExistingSum`) and the updated regression test
(`sum_null_and_not_null_shape`) prove the optimization works. Result values in
`.out` are unchanged, confirming correctness.
2. **Is the modification minimal, clear, and focused?** — Yes. The change is
a single 30-line helper method + 1-line call site in `extractSumLiteral`. Scope
is well-contained.
3. **Correctness of the optimization:**
- **SUM return type is unaffected**: `Sum` always returns `BigIntType`
for all integer-like inputs (TinyInt/SmallInt/Int/BigInt) per its SIGNATURES
list. Stripping the cast does not change the aggregate result type.
- **COUNT is unaffected**: `Count` always returns BigInt and just counts
non-null values. A widening integer cast cannot introduce or remove NULLs, so
`COUNT(CAST(x AS INT))` ≡ `COUNT(x)` for widening casts.
- **Nullability is preserved**: `Cast.castNullable()` only introduces
nullability for *narrowing* integer casts (potential overflow). Widening casts
propagate nullability unchanged. So stripping the cast preserves the
nullability contract.
4. **Guard conditions are correct:**
- `isExplicitType()` check: User-written `CAST(...)` expressions (set to
`true` in `LogicalPlanBuilder.processCast`) are preserved. Only type-coercion
casts (always `false` via 2-arg Cast constructor) are stripped.
- `isIntegerLikeType()` check: Only strips casts between `IntegralType`
types excluding `LargeIntType` (128-bit). This is conservative and safe.
- `width() <=` check: Only strips widening (or same-width) casts, never
narrowing ones. Correct.
5. **Concurrency** — Not applicable. This is a pure rewrite rule operating
on immutable plan trees.
6. **Are there parallel code paths that need the same change?** — No.
`stripWideningIntegerCast` is called at the single extraction point where
`left` is obtained from the `Add`/`Subtract` decomposition. No other parallel
path exists in `SumLiteralRewrite`.
7. **Test coverage:**
- Unit test: `testStripWideningIntegerCast` covers implicit cast
stripping and explicit cast preservation.
`testStripWideningCastWithExistingSum` covers the ClickBench Q29 pattern with
pre-existing `SUM(col)` reuse.
- Regression test: `sum_null_and_not_null_shape` verifies the optimized
plan shape, `sum_null_and_not_null_result` confirms the numeric results are
unchanged.
- The tests cover both nullable (`id`) and non-nullable (`not_null_id`)
columns, which is important.
8. **No configuration, compatibility, or persistence concerns** — This is a
pure optimizer rewrite rule change with no new configs, no storage format
changes, and no FE-BE protocol changes.
9. **Performance** — This is strictly a performance improvement: narrower
column reads, fewer projections in the plan, and better aggregate reuse.
LGTM — clean, minimal, and well-tested optimization.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]