morrySnow commented on code in PR #64820:
URL: https://github.com/apache/doris/pull/64820#discussion_r3519122664


##########
fe/fe-core/src/main/java/org/apache/doris/nereids/trees/expressions/functions/scalar/Ipv4StringToNumOrDefault.java:
##########
@@ -36,7 +37,7 @@
  * scalar function ipv4_string_to_num_or_default
  */
 public class Ipv4StringToNumOrDefault extends ScalarFunction

Review Comment:
   **Redundant `NullToNonNullFunction`:** `Ipv4StringToNumOrDefault` already 
implements `AlwaysNotNullable` and has input slots. The 
`canConvertNullToNonNull()` fallback branch (`e instanceof AlwaysNotNullable && 
!e.getInputSlots().isEmpty()`) already catches this class. Adding 
`NullToNonNullFunction` here is redundant: it does not change the result of the 
check.
   
   Consider either: (1) dropping `NullToNonNullFunction` from all 
`AlwaysNotNullable`-with-input classes and relying on the fallback alone, or 
(2) consistently marking ALL `AlwaysNotNullable`-with-input classes (including 
`IsTrue`, `IsFalse`, `NonNullable`, `Array`). The current selective marking 
creates confusion about which classes need the marker.
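
   A minimal sketch of the redundancy, assuming the check has the two-branch 
shape quoted above. The interfaces and classes below are hypothetical stand-ins, 
not the real Doris types, and `fallbackOnly` is a name introduced only for 
illustration.

```java
import java.util.List;

// Hypothetical stand-ins for the Doris types. The names mirror the PR, but
// nothing here is the real API; it only models the two-branch check.
final class MarkerRedundancyDemo {
    interface Expression {
        List<String> getInputSlots();
    }

    interface AlwaysNotNullable {
    }

    interface NullToNonNullFunction {
    }

    // Carries both interfaces, like Ipv4StringToNumOrDefault after this PR.
    static final class MarkedAndNotNullable
            implements Expression, AlwaysNotNullable, NullToNonNullFunction {
        @Override
        public List<String> getInputSlots() {
            return List.of("ip_str");
        }
    }

    // Carries only the marker; its own result may still be nullable.
    static final class MarkedOnly implements Expression, NullToNonNullFunction {
        @Override
        public List<String> getInputSlots() {
            return List.of("a", "b");
        }
    }

    // Assumed shape of the check: marker branch first, fallback second.
    static boolean canConvertNullToNonNull(Expression e) {
        if (e instanceof NullToNonNullFunction) {
            return true;
        }
        return fallbackOnly(e);
    }

    // The fallback branch on its own, as quoted in the comment above.
    static boolean fallbackOnly(Expression e) {
        return e instanceof AlwaysNotNullable && !e.getInputSlots().isEmpty();
    }

    public static void main(String[] args) {
        Expression both = new MarkedAndNotNullable();
        Expression markerOnly = new MarkedOnly();
        // Same answer with or without the marker: the marker is redundant here.
        System.out.println(canConvertNullToNonNull(both) + " " + fallbackOnly(both));
        // Different answers: here the marker is what makes the check fire.
        System.out.println(canConvertNullToNonNull(markerOnly) + " " + fallbackOnly(markerOnly));
    }
}
```

   In this model the marker only changes the outcome for classes that are not 
`AlwaysNotNullable`, which is presumably where it is actually needed.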



##########
fe/fe-core/src/main/java/org/apache/doris/nereids/trees/expressions/functions/scalar/NotNullOrEmpty.java:
##########
@@ -36,7 +37,7 @@
  * ScalarFunction 'not_null_or_empty'. This class is generated by GenerateFunction.
  */
 public class NotNullOrEmpty extends ScalarFunction

Review Comment:
   Same redundancy as the `OrDefault` classes: `NotNullOrEmpty` implements both 
`AlwaysNotNullable` and `NullToNonNullFunction`, but the 
`canConvertNullToNonNull()` fallback already covers it. If the intent is 
documentation, consider using Javadoc instead of a marker interface.



##########
fe/fe-core/src/main/java/org/apache/doris/nereids/trees/expressions/NullToNonNullFunction.java:
##########
@@ -0,0 +1,65 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+package org.apache.doris.nereids.trees.expressions;
+
+import org.apache.doris.nereids.trees.expressions.functions.AlwaysNotNullable;
+import org.apache.doris.nereids.trees.expressions.functions.agg.AggregateFunction;
+
+/**
+ * Marker interface for expressions that can convert NULL input into a non-NULL output.
+ *
+ * For example: Coalesce(NULL, 2) → 2, Nvl(NULL, 0) → 0, NullOrEmpty(NULL) → true.
+ *
+ * This is significant for outer-join push-down safety: when an aggregate function contains
+ * a NullToNonNull expression wrapping a column from the nullable side of an outer join,
+ * the aggregation must NOT be pushed down. Null-extended rows (produced by the join for
+ * unmatched rows) have NULL for all nullable-side columns. The NullToNonNull expression
+ * would convert those NULLs to non-NULL values, and the pre-aggregation would miss those
+ * contributions because null-extended rows do not exist in the base table.
+ *
+ * <p>Note: {@link AlwaysNotNullable} expressions with input slots (e.g. Array, JsonArray,
+ * JsonObject, CreateStruct, CreateMap) are also blocked from being pushed to the nullable
+ * side of outer joins via a separate check in {@link #canConvertNullToNonNull(Expression)}.
+ */
+public interface NullToNonNullFunction {
+
+    /**
+     * Check whether an expression can convert NULL input to non-NULL output.

Review Comment:
   **Fragile `AggregateFunction` exclusion:** The `if (e instanceof 
AggregateFunction) return false` guard exists because `anyMatch` also tests the 
aggregate function node itself (e.g., `Count extends 
NotNullableAggregateFunction extends AggregateFunction implements 
AlwaysNotNullable`), which would incorrectly match. This is a band-aid for 
starting the traversal one level too high. A cleaner fix would be to start at 
the aggregate's children at the call site, e.g. 
`aggregateFunc.children().stream().anyMatch(c -> c.anyMatch(e -> 
canConvertNullToNonNull((Expression) e)))` instead of 
`aggregateFunc.anyMatch(...)`, which removes the need for type-based exclusion 
entirely.
   
   This matters because: (a) if a new `AggregateFunction` subclass also 
legitimately implements `NullToNonNullFunction` in the future, it would be 
silently skipped; (b) the exclusion hardcodes knowledge about which expression 
types are "containers" into the utility method.
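
   To illustrate the traversal-depth point, here is a small sketch on a toy 
tree. `Node`, `checkFromRoot` and `checkFromChildren` are hypothetical names, 
and the `convertsNull` flag stands in for whatever `canConvertNullToNonNull` 
would return for that node; none of this is the Doris tree API.

```java
import java.util.List;
import java.util.function.Predicate;

// Toy expression tree, not the Doris classes. anyMatch is assumed to test the
// receiver first and then its descendants, as the comment above describes.
final class TraversalDepthDemo {
    static final class Node {
        final String name;
        final boolean convertsNull; // stands in for canConvertNullToNonNull(node)
        final List<Node> children;

        Node(String name, boolean convertsNull, Node... children) {
            this.name = name;
            this.convertsNull = convertsNull;
            this.children = List.of(children);
        }

        boolean anyMatch(Predicate<Node> predicate) {
            if (predicate.test(this)) {
                return true;
            }
            for (Node child : children) {
                if (child.anyMatch(predicate)) {
                    return true;
                }
            }
            return false;
        }
    }

    // Traversal that includes the aggregate node itself.
    static boolean checkFromRoot(Node aggregate) {
        return aggregate.anyMatch(n -> n.convertsNull);
    }

    // Traversal that starts one level lower, at the aggregate's arguments.
    static boolean checkFromChildren(Node aggregate) {
        return aggregate.children.stream().anyMatch(c -> c.anyMatch(n -> n.convertsNull));
    }

    public static void main(String[] args) {
        Node slot = new Node("col", false);
        // count(col): the aggregate is always-not-null, its argument is a plain slot.
        Node count = new Node("count", true, slot);
        // sum(coalesce(col, 0)): the argument really does convert NULL.
        Node sum = new Node("sum", false, new Node("coalesce", true, slot));

        System.out.println(checkFromRoot(count));     // true: matches the aggregate itself
        System.out.println(checkFromChildren(count)); // false: only arguments are inspected
        System.out.println(checkFromChildren(sum));   // true: coalesce is found in the arguments
    }
}
```

   Starting at the children gives the intended answer for both shapes without 
any `instanceof` exclusion inside the predicate.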



##########
fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/eageraggregation/PushDownAggregation.java:
##########
@@ -265,9 +266,7 @@ public Plan visitLogicalAggregate(LogicalAggregate<? extends Plan> agg, JobConte
                 }
                 LogicalAggregate<Plan> eagerAgg =
                         agg.withAggOutputChild(newOutputExpressions, child);

Review Comment:
   **`normalizeAgg` removal and `normalized` flag propagation:** The eager 
aggregate `eagerAgg = agg.withAggOutputChild(newOutputExpressions, child)` 
inherits the `normalized` flag from `agg` (typically `true` at this point in 
the pipeline). The global `NormalizeAggregate` rule at 
`NormalizeAggregate.java:124` only fires 
`whenNot(LogicalAggregate::isNormalized)`, so it will **not** re-normalize this 
aggregate. The new aggregate has different output expressions and a different 
child — if any of these require normalization (e.g., complex expressions in 
aggregate function arguments that should be projected out), they will remain 
unnormalized.
   
   Please verify that the new output expressions are always simple enough to 
not need normalization, or add a re-normalization step specifically for the 
newly constructed aggregate.
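
   A toy model of the concern, under the assumption that the copy method 
carries the flag over unchanged and that the rule is guarded on that flag. 
`Aggregate`, `withOutputs`, `markUnnormalized` and `normalizeRule` are all 
hypothetical names and do not claim to match `LogicalAggregate` or 
`NormalizeAggregate`.

```java
import java.util.List;
import java.util.Optional;

// Toy model of the flag propagation; class and method names are illustrative
// and do not claim to match LogicalAggregate or NormalizeAggregate.
final class NormalizedFlagDemo {
    static final class Aggregate {
        final List<String> outputs;
        final boolean normalized;

        Aggregate(List<String> outputs, boolean normalized) {
            this.outputs = outputs;
            this.normalized = normalized;
        }

        // Copy with new outputs that carries the flag over unchanged.
        Aggregate withOutputs(List<String> newOutputs) {
            return new Aggregate(newOutputs, normalized);
        }

        // Hypothetical helper: the same node with the flag cleared.
        Aggregate markUnnormalized() {
            return new Aggregate(outputs, false);
        }
    }

    // Rule guarded like whenNot(isNormalized): it skips normalized nodes.
    static Optional<Aggregate> normalizeRule(Aggregate agg) {
        if (agg.normalized) {
            return Optional.empty();
        }
        return Optional.of(new Aggregate(agg.outputs, true));
    }

    public static void main(String[] args) {
        Aggregate original = new Aggregate(List.of("sum(a)"), true);
        // New outputs, but the inherited flag still says "normalized".
        Aggregate eager = original.withOutputs(List.of("sum(a + b)"));

        System.out.println(normalizeRule(eager).isPresent());                    // false: rule skipped
        System.out.println(normalizeRule(eager.markUnnormalized()).isPresent()); // true: rule fires
    }
}
```

   Whether clearing the flag or normalizing in place is the better fit depends 
on the real rule pipeline; the sketch only shows why the inherited flag hides 
the new aggregate from the rule.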



##########
fe/fe-core/src/main/java/org/apache/doris/nereids/trees/expressions/functions/scalar/NullOrEmpty.java:
##########
@@ -36,7 +37,7 @@
  * ScalarFunction 'null_or_empty'. This class is generated by GenerateFunction.
  */
 public class NullOrEmpty extends ScalarFunction

Review Comment:
   Same redundancy: `NullOrEmpty` implements `AlwaysNotNullable`, so the second 
branch of `canConvertNullToNonNull()` already catches it. The 
`NullToNonNullFunction` marker makes no behavioral difference here.



##########
fe/fe-core/src/main/java/org/apache/doris/nereids/trees/expressions/NullToNonNullFunction.java:
##########
@@ -0,0 +1,65 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+package org.apache.doris.nereids.trees.expressions;
+
+import org.apache.doris.nereids.trees.expressions.functions.AlwaysNotNullable;
+import org.apache.doris.nereids.trees.expressions.functions.agg.AggregateFunction;
+
+/**
+ * Marker interface for expressions that can convert NULL input into a non-NULL output.
+ *
+ * For example: Coalesce(NULL, 2) → 2, Nvl(NULL, 0) → 0, NullOrEmpty(NULL) → true.
+ *
+ * This is significant for outer-join push-down safety: when an aggregate function contains
+ * a NullToNonNull expression wrapping a column from the nullable side of an outer join,
+ * the aggregation must NOT be pushed down. Null-extended rows (produced by the join for
+ * unmatched rows) have NULL for all nullable-side columns. The NullToNonNull expression
+ * would convert those NULLs to non-NULL values, and the pre-aggregation would miss those
+ * contributions because null-extended rows do not exist in the base table.
+ *
+ * <p>Note: {@link AlwaysNotNullable} expressions with input slots (e.g. Array, JsonArray,
+ * JsonObject, CreateStruct, CreateMap) are also blocked from being pushed to the nullable
+ * side of outer joins via a separate check in {@link #canConvertNullToNonNull(Expression)}.
+ */
+public interface NullToNonNullFunction {
+
+    /**
+     * Check whether an expression can convert NULL input to non-NULL output.
+     * This covers both {@link NullToNonNullFunction} (e.g. Coalesce, Nvl, NullOrEmpty)
+     * and {@link AlwaysNotNullable} expressions with input slots (e.g. Array, JsonArray,

Review Comment:
   **Missing `IsTrue` and `IsFalse` from the set of marked classes:** `IsNull` 
is marked `NullToNonNullFunction` in this PR, but `IsTrue` and `IsFalse` (which 
also implement `AlwaysNotNullable` and convert NULL input to non-NULL output: 
`NULL IS TRUE → FALSE`, `NULL IS FALSE → FALSE`) are not. All three have 
identical null-to-non-null semantics. The `AlwaysNotNullable` fallback still 
catches them, but marking `IsNull` explicitly while leaving the other two 
unmarked makes the selection look arbitrary.
   
   Also: `NonNullable` (Javadoc: "change nullable input col to non_nullable 
col") is arguably the canonical example of a `NullToNonNullFunction`, yet it is 
not marked either.
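
   For reference, the three predicates can be modelled as below, with a Java 
`null` standing in for SQL NULL. The helper names are made up for this sketch; 
the truth table itself is standard SQL three-valued logic.

```java
// SQL boolean modelled as a nullable Boolean: null stands for SQL NULL.
// Each predicate returns a primitive boolean, so it can never yield NULL.
final class NullPredicateDemo {
    static boolean isTrue(Boolean v) {
        return v != null && v;
    }

    static boolean isFalse(Boolean v) {
        return v != null && !v;
    }

    static boolean isNull(Boolean v) {
        return v == null;
    }

    public static void main(String[] args) {
        Boolean sqlNull = null;
        System.out.println(isTrue(sqlNull));  // false
        System.out.println(isFalse(sqlNull)); // false
        System.out.println(isNull(sqlNull));  // true
    }
}
```

   All three map a NULL input to a definite boolean, which is why treating only 
one of them as a marked class looks inconsistent.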



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

