github-actions[bot] commented on code in PR #64335:
URL: https://github.com/apache/doris/pull/64335#discussion_r3503025648


##########
fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/SimplifyAggGroupBy.java:
##########
@@ -81,7 +81,56 @@ protected static boolean 
isBinaryArithmeticSlot(TreeNode<Expression> expr) {
         if (!supportedFunctions.contains(expr.getClass())) {
             return false;
         }
-        return ExpressionUtils.isSlotOrCastOnSlot(expr.child(0)).isPresent() 
&& expr.child(1) instanceof Literal
-                || 
ExpressionUtils.isSlotOrCastOnSlot(expr.child(1)).isPresent() && expr.child(0) 
instanceof Literal;
+
+        // Float/double arithmetic: precision loss for all operations
+        if (expr.child(0).getDataType().isFloatLikeType()
+                || expr.child(1).getDataType().isFloatLikeType()) {
+            return false;
+        }
+
+        Expression slotExpr;
+        Literal literal;
+        if (expr.child(0) instanceof Literal) {
+            literal = (Literal) expr.child(0);
+            slotExpr = expr.child(1);
+        } else if (expr.child(1) instanceof Literal) {
+            literal = (Literal) expr.child(1);
+            slotExpr = expr.child(0);
+        } else {
+            return false;
+        }
+
+        if (!canExtractSlot(slotExpr)) {
+            return false;
+        }
+
+        return checkLiteral(expr, literal);
     }
+
+    @VisibleForTesting
+    protected static boolean checkLiteral(Expression expr, Literal literal) {
+        if (literal.isNullLiteral()) {
+            return false;
+        }

Review Comment:
   Checking only `literal.isZero()` still assumes every non-zero 
multiply/divide literal is injective, but both arithmetic families can collapse 
keys after analysis. For a `BIGINT k`, `processBinaryArithmetic()` keeps `k * 
2` as `BIGINT` because `DataType.promotion()` has no `BigIntType` entry, and 
the BE integral multiply returns the same primitive type and computes in that 
type, so boundary values such as `0` and `Long.MIN_VALUE` can produce the same 
`(k * 2, k * 4)` key tuple while this rewrite groups by raw `k`. Decimal 
division has the same problem: `DECIMAL(10,0) d / 100000.0` is analyzed as an 
injective cast of `d` to a wider decimal plus a non-zero decimal literal, but 
the result is stored at finite scale, so adjacent values like `d = 0` and `d = 
1` can both evaluate to `0.0000` for large divisors and then get split after 
the rewrite. Please restrict multiply/divide to operations proven injective in 
the executed type instead of accepting every non-zero literal.



##########
fe/fe-core/src/test/java/org/apache/doris/nereids/rules/rewrite/SimplifyAggGroupByTest.java:
##########
@@ -156,4 +169,231 @@ void testisBinaryArithmeticSlot() {
         Divide divide = new Divide(id, Literal.of(2));
         
Assertions.assertTrue(SimplifyAggGroupBy.isBinaryArithmeticSlot(divide));
     }
+
+    // ========== new tests for injectivity checks ==========
+
+    @Test
+    void testMultiplyByZero() {
+        Slot id = scan1.getOutput().get(0);
+        Assertions.assertFalse(SimplifyAggGroupBy.isBinaryArithmeticSlot(
+                new Multiply(id, Literal.of(0))));
+        Assertions.assertFalse(SimplifyAggGroupBy.isBinaryArithmeticSlot(
+                new Multiply(Literal.of(0), id)));
+    }
+
+    @Test
+    void testDivideZeroNumerator() {
+        Slot id = scan1.getOutput().get(0);
+        Assertions.assertFalse(SimplifyAggGroupBy.isBinaryArithmeticSlot(
+                new Divide(Literal.of(0), id)));
+    }
+
+    @Test
+    void testDivideByZero() {
+        Slot id = scan1.getOutput().get(0);
+        Assertions.assertFalse(SimplifyAggGroupBy.isBinaryArithmeticSlot(
+                new Divide(id, Literal.of(0))));
+    }
+
+    @Test
+    void testNullLiteral() {
+        Slot id = scan1.getOutput().get(0);
+        Assertions.assertFalse(SimplifyAggGroupBy.isBinaryArithmeticSlot(
+                new Add(id, NullLiteral.INSTANCE)));
+        Assertions.assertFalse(SimplifyAggGroupBy.isBinaryArithmeticSlot(
+                new Multiply(id, NullLiteral.INSTANCE)));
+    }
+
+    @Test
+    void testMultiplyWithDoubleLiteral() {
+        Slot id = scan1.getOutput().get(0);
+        Assertions.assertFalse(SimplifyAggGroupBy.isBinaryArithmeticSlot(
+                new Multiply(id, new DoubleLiteral(0.1))));
+    }
+
+    @Test
+    void testDivideWithDoubleLiteral() {
+        Slot id = scan1.getOutput().get(0);
+        Assertions.assertFalse(SimplifyAggGroupBy.isBinaryArithmeticSlot(
+                new Divide(id, new DoubleLiteral(2.0))));
+    }
+
+    @Test
+    void testMultiplyWithFloatSlot() {
+        Slot floatSlot = new SlotReference("f", FloatType.INSTANCE);
+        Assertions.assertFalse(SimplifyAggGroupBy.isBinaryArithmeticSlot(
+                new Multiply(floatSlot, Literal.of(2))));
+    }
+
+    @Test
+    void testMultiplyDoubleSlotWithIntLiteral() {
+        Slot doubleSlot = new SlotReference("d", DoubleType.INSTANCE);
+        Assertions.assertFalse(SimplifyAggGroupBy.isBinaryArithmeticSlot(
+                new Multiply(doubleSlot, Literal.of(2))));
+    }
+
+    @Test
+    void testAddWithDoubleLiteral() {
+        // Float/double arithmetic may be imprecise, reject for all ops
+        Slot id = scan1.getOutput().get(0);
+        Assertions.assertFalse(SimplifyAggGroupBy.isBinaryArithmeticSlot(
+                new Add(id, new DoubleLiteral(1.0))));
+    }
+
+    @Test
+    void testAddWithFloatLiteral() {
+        Slot id = scan1.getOutput().get(0);
+        Assertions.assertFalse(SimplifyAggGroupBy.isBinaryArithmeticSlot(
+                new Add(id, new FloatLiteral(1.0f))));
+    }
+
+    @Test
+    void testSubtractWithDoubleLiteral() {
+        Slot id = scan1.getOutput().get(0);
+        Assertions.assertFalse(SimplifyAggGroupBy.isBinaryArithmeticSlot(
+                new Subtract(id, new DoubleLiteral(1.0))));
+    }
+
+    @Test
+    void testMultiplyWithDecimalLiteral() {
+        // Small decimal multiply should pass (precision fits)
+        Slot id = scan1.getOutput().get(0);
+        Assertions.assertTrue(SimplifyAggGroupBy.isBinaryArithmeticSlot(
+                new Multiply(id, new DecimalLiteral(new BigDecimal("2.0")))));
+    }
+
+    @Test
+    void testDivideWithDecimalLiteral() {
+        // Divide with decimal: precision overflow too extreme to worry about
+        Slot id = scan1.getOutput().get(0);
+        Assertions.assertTrue(SimplifyAggGroupBy.isBinaryArithmeticSlot(
+                new Divide(id, new DecimalLiteral(new BigDecimal("2.0")))));
+    }
+
+    @Test
+    void testAddWithDecimalLiteral() {
+        // Add/Subtract with decimal are exact, should pass
+        Slot id = scan1.getOutput().get(0);
+        Assertions.assertTrue(SimplifyAggGroupBy.isBinaryArithmeticSlot(
+                new Add(id, new DecimalLiteral(new BigDecimal("1.0")))));
+    }
+
+    // ========== tests for isInjectiveCastTo ==========
+
+    @Test
+    void testIntegerWidening() {
+        
Assertions.assertTrue(TinyIntType.INSTANCE.isInjectiveCastTo(IntegerType.INSTANCE));
+        
Assertions.assertTrue(IntegerType.INSTANCE.isInjectiveCastTo(BigIntType.INSTANCE));
+        
Assertions.assertFalse(IntegerType.INSTANCE.isInjectiveCastTo(TinyIntType.INSTANCE));
+        
Assertions.assertFalse(BigIntType.INSTANCE.isInjectiveCastTo(IntegerType.INSTANCE));
+    }
+
+    @Test
+    void testDecimalWidening() {
+        Assertions.assertTrue(DecimalV3Type.createDecimalV3Type(5, 2)
+                .isInjectiveCastTo(DecimalV3Type.createDecimalV3Type(10, 4)));
+        Assertions.assertFalse(DecimalV3Type.createDecimalV3Type(10, 4)
+                .isInjectiveCastTo(DecimalV3Type.createDecimalV3Type(5, 2)));
+    }
+
+    @Test
+    void testIntegralToDecimalWidening() {
+        Assertions.assertTrue(TinyIntType.INSTANCE
+                .isInjectiveCastTo(DecimalV3Type.createDecimalV3Type(10, 0)));
+        // BigInt has 19 digits, DECIMAL(5,0) only has 5 integer digits
+        Assertions.assertFalse(BigIntType.INSTANCE
+                .isInjectiveCastTo(DecimalV3Type.createDecimalV3Type(5, 0)));
+    }
+
+    @Test
+    void testCrossFamilyRejected() {
+        
Assertions.assertFalse(IntegerType.INSTANCE.isInjectiveCastTo(FloatType.INSTANCE));
+        
Assertions.assertFalse(FloatType.INSTANCE.isInjectiveCastTo(IntegerType.INSTANCE));
+        
Assertions.assertFalse(IntegerType.INSTANCE.isInjectiveCastTo(DoubleType.INSTANCE));
+    }
+
+    // ========== tests for canExtractSlot ==========
+
+    @Test
+    void testCanExtractSlotBare() {
+        Slot id = scan1.getOutput().get(0);
+        Assertions.assertTrue(SimplifyAggGroupBy.canExtractSlot(id));
+    }
+

Review Comment:
   These cast cases only call `canExtractSlot()` directly, so they do not prove 
the rule still behaves correctly on the analyzed aggregate tree where casts and 
arithmetic coercions have already been inserted. Since the PR relies on 
`isInjectiveCastTo()` to decide whether casted group keys can be collapsed, 
please add an analyzer/PlanChecker or regression case that exercises `GROUP BY 
cast(slot AS wider_type) + literal` through the real rewrite path, and a 
narrowing/non-injective cast case that remains unsimplified.



##########
fe/fe-core/src/main/java/org/apache/doris/nereids/util/ExpressionUtils.java:
##########
@@ -422,40 +421,6 @@ public static <S extends NamedExpression> S 
selectMinimumColumn(Collection<S> sl
     }
 

Review Comment:
   The deletion left the old Javadoc opener behind, so this method now has two 
consecutive `/**` lines and the actual generated Javadoc starts with a literal 
`/**`. Please remove the stale opener when deleting the old helper comment; 
otherwise the Java style/Javadoc checks can fail or produce malformed generated 
docs.



##########
fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/SimplifyAggGroupBy.java:
##########
@@ -81,7 +81,56 @@ protected static boolean 
isBinaryArithmeticSlot(TreeNode<Expression> expr) {
         if (!supportedFunctions.contains(expr.getClass())) {
             return false;
         }
-        return ExpressionUtils.isSlotOrCastOnSlot(expr.child(0)).isPresent() 
&& expr.child(1) instanceof Literal
-                || 
ExpressionUtils.isSlotOrCastOnSlot(expr.child(1)).isPresent() && expr.child(0) 
instanceof Literal;
+
+        // Float/double arithmetic: precision loss for all operations
+        if (expr.child(0).getDataType().isFloatLikeType()
+                || expr.child(1).getDataType().isFloatLikeType()) {
+            return false;
+        }
+
+        Expression slotExpr;
+        Literal literal;
+        if (expr.child(0) instanceof Literal) {
+            literal = (Literal) expr.child(0);
+            slotExpr = expr.child(1);
+        } else if (expr.child(1) instanceof Literal) {
+            literal = (Literal) expr.child(1);
+            slotExpr = expr.child(0);
+        } else {
+            return false;
+        }
+
+        if (!canExtractSlot(slotExpr)) {
+            return false;
+        }
+
+        return checkLiteral(expr, literal);
     }
+
+    @VisibleForTesting
+    protected static boolean checkLiteral(Expression expr, Literal literal) {
+        if (literal.isNullLiteral()) {
+            return false;
+        }
+        if (expr instanceof Multiply || expr instanceof Divide) {
+            if (literal.isZero()) {
+                return false;
+            }

Review Comment:
   Returning `true` for every non-null add/subtract literal also assumes the 
removed expression is total, but decimal arithmetic can still overflow. For 
example, with `d DECIMAL(38,0)`, `d + 99999999999999999999999999999999999999` 
keeps both operands as `DECIMAL(38,0)` after 
`processDecimalV3BinaryArithmetic()`, so this rule can replace `GROUP BY d, d + 
<literal>` with just `GROUP BY d`. The original group key evaluation raises 
`ARITHMETIC_OVERFLOW` under the default decimal overflow checking when `d` 
pushes the result past `DECIMAL(38,0)`, while the rewritten aggregate can avoid 
evaluating the removed key and return rows instead. Please only drop 
add/subtract keys when the arithmetic is proven not to overflow over the slot 
domain, or keep the removed key evaluated.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to