github-actions[bot] commented on code in PR #64563:
URL: https://github.com/apache/doris/pull/64563#discussion_r3473847678


##########
be/src/exprs/lambda_function/varray_sort_function.cpp:
##########
@@ -221,6 +250,146 @@ class ArraySortFunction : public LambdaFunction {
         return Status::OK();
     }
 
+private:
+    Status _set_comparator_argument_gap(const VExprSPtr& expr,
+                                        const std::vector<std::string>* 
argument_names) const {
+        if (expr->is_column_ref()) {
+            auto* ref = static_cast<VColumnRef*>(expr.get());
+            RETURN_IF_ERROR(_validate_comparator_argument_ref(*ref, 
argument_names));
+            ref->set_gap(0);
+            return Status::OK();
+        }
+
+        if (expr->is_slot_ref() || expr->is_virtual_slot_ref()) {
+            return Status::NotSupported(
+                    "array_sort comparator only supports its own lambda 
arguments, but found "
+                    "captured slot ref '{}'",
+                    expr->expr_name());
+        }
+
+        if (_is_lambda_call_with_lambda_expr(expr)) {
+            // array_sort comparator arguments live in a position-based, 
comparator-local frame
+            // that is invisible to nested lambda frames. Reject unsupported 
nested captures during
+            // prepare, otherwise execution would later fail with an internal 
missing-column error.
+            // For example, array_sort((x, y) -> array_map(z -> z + x, 
nested_arr), arr) is
+            // rejected because the inner array_map lambda captures the 
comparator-local x; while
+            // array_sort((x, y) -> array_map(x -> x + 1, nested_arr), arr) is 
still valid because
+            // the inner x is array_map's own argument and shadows the 
comparator argument.
+            
RETURN_IF_ERROR(_reject_nested_lambda_capture_of_comparator_argument(
+                    assert_cast<const 
VLambdaFunctionExpr*>(expr->children()[0].get()),
+                    argument_names));
+            for (int i = 1; i < expr->children().size(); ++i) {
+                
RETURN_IF_ERROR(_set_comparator_argument_gap(expr->children()[i], 
argument_names));
+            }
+            return Status::OK();
+        }
+
+        for (const auto& child : expr->children()) {
+            RETURN_IF_ERROR(_set_comparator_argument_gap(child, 
argument_names));
+        }
+        return Status::OK();
+    }
+
+    Status _reject_nested_lambda_capture_of_comparator_argument(
+            const VLambdaFunctionExpr* lambda_expr,
+            const std::vector<std::string>* comparator_argument_names) const {
+        if (!lambda_expr->has_argument_names()) {
+            return Status::InternalError(
+                    "Cannot validate nested lambda capture in array_sort 
comparator without lambda "
+                    "metadata");
+        }
+        return 
_reject_nested_lambda_capture_of_comparator_argument(lambda_expr->get_child(0),
+                                                                    
comparator_argument_names,
+                                                                    
lambda_expr->argument_names());
+    }
+
+    Status _reject_nested_lambda_capture_of_comparator_argument(
+            const VExprSPtr& expr, const std::vector<std::string>* 
comparator_argument_names,
+            const std::vector<std::string>& in_scope_lambda_argument_names) 
const {
+        // Names in in_scope_lambda_argument_names are declared by the nested 
lambda scopes that
+        // enclose expr. They can legally shadow array_sort comparator 
argument names, so a
+        // ColumnRef matching one of these names should be treated as a local 
nested-lambda
+        // argument instead of an unsupported capture from the array_sort 
comparator.
+        if (expr->is_column_ref()) {
+            if (std::ranges::find(in_scope_lambda_argument_names, 
expr->expr_name()) !=
+                in_scope_lambda_argument_names.end()) {

Review Comment:
   This local-name check still misses captures of the second comparator 
argument. `visitArraySort()` builds the second comparator ref by cloning the 
first one and changing only `column_id` to 1, so a body reference to comparator 
`y` can arrive here as `ColumnRef(name="x", column_id=1)`. If an inner lambda 
also declares `x`, this branch returns `OK` before looking at the column id, 
and runtime name resolution binds the ref to the inner `x` frame. A shape like 
`array_sort((x, y) -> array_map(x -> x + y, [1, 2]), arr)` is then accepted and 
evaluates the capture as `x + x` instead of rejecting the unsupported 
comparator capture. Please either make this validation consider comparator 
column ids, or serialize the second comparator ref with its own name before 
relying on name-based shadowing.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to