zclllyybb commented on issue #63824:
URL: https://github.com/apache/doris/issues/63824#issuecomment-4562380444

   I checked the live issue metadata and the Doris 4.1.1 source matching the 
reported Docker BE commit `b10073ad9ca17cd5685c4dd3b3ef650f256376d0`. There are 
no issue comments yet, and no labels are currently attached.
   
   Initial judgment: this should be treated as a valid, high-severity BE 
correctness and stability bug in the execution of nested higher-order array 
functions (lambdas). The report is not just an `array_agg` crash: the 
non-crashing examples show wrong lexical binding for nested lambdas, and the 
`array_agg` case appears to turn the same binding defect into an 
out-of-range/invalid column access.
   
   Code evidence from the affected tag:
   
   - In Nereids, `array_count(lambda, ...)` is rewritten as 
`array_count(array_map(lambda, ...))` (`ArrayCount.java`), so the reported 
query reaches BE as nested `array_map` execution.
   - `ArrayMapFunction::execute()` collects slot refs from `children[0]`, 
computes a `gap`, then recursively calls 
`_set_column_ref_column_id(children[0], gap)`.
   - `_collect_slot_ref_column_id()` and `_set_column_ref_column_id()` both 
recurse through all children. I do not see a scope boundary check for nested 
`VLambdaFunctionExpr`/nested lambda bodies.
   - `VColumnRef::set_gap()` only writes `_gap` when the existing value is 
zero. If an outer `array_map` traversal sets the gap on `VColumnRef` nodes that 
belong to an inner lambda, the inner lambda cannot reliably rebind them later.
   - `VColumnRef::execute_column()` then reads 
`block->get_by_position(_column_id + _gap)`. In 4.1.1 the const overload of 
`Block::get_by_position()` is unchecked, while `safe_get_by_position()` exists 
separately. This makes a wrong lambda gap able to propagate into an invalid 
column dereference, matching the reported stack through 
`PreparedFunctionImpl::default_implementation_for_constant_arguments()`, 
`VectorizedUtils::all_arguments_are_constant()`, and `is_column_const()`.
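
   The interaction between the recursive traversal and the write-once 
`set_gap()` can be illustrated with a small toy model. This is a Python sketch, 
not Doris code: the `ColumnRef` and `Node` classes, the helper names, and the 
gap values 1 and 3 are all hypothetical, chosen only to mirror the behaviour 
described in the bullets above.

```python
from dataclasses import dataclass, field


@dataclass
class ColumnRef:
    """Toy stand-in for a lambda-argument reference (hypothetical)."""
    name: str
    column_id: int
    gap: int = 0

    def set_gap(self, gap):
        # Write-once, as described for VColumnRef::set_gap() above.
        if self.gap == 0:
            self.gap = gap

    def position(self):
        # Mirrors the `_column_id + _gap` lookup described above.
        return self.column_id + self.gap


@dataclass
class Node:
    """Generic expression node; children are Node or ColumnRef objects."""
    label: str
    children: list = field(default_factory=list)


def set_gap_recursive(expr, gap):
    """Scope-unaware traversal: it also descends into nested lambdas."""
    if isinstance(expr, ColumnRef):
        expr.set_gap(gap)
        return
    for child in expr.children:
        set_gap_recursive(child, gap)


def simulate_unscoped_binding(outer_gap=1, inner_gap=3):
    """Bind x -> array_count(y -> y = x, ...) outer-first, then inner."""
    x = ColumnRef("x", 0)
    y = ColumnRef("y", 0)
    inner_lambda = Node("lambda y", [Node("eq", [y, x])])
    outer_lambda = Node("lambda x", [Node("array_count", [inner_lambda])])

    set_gap_recursive(outer_lambda, outer_gap)  # also touches y
    set_gap_recursive(inner_lambda, inner_gap)  # no effect: gaps already set
    return x, y
```

   In this toy, `y` keeps the outer gap and resolves to the same position as 
`x`, so a comparison like `y = x` would read one column twice. Whether 4.1.1 
fails in exactly this way is not confirmed here; the sketch only shows that the 
described code shape permits it.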
   
   So the fix most likely belongs in lambda scoping/binding, not only at the 
crash site. A bounds guard in `VColumnRef` would be useful as a defensive 
safety net, but it would not fix the wrong-result case:
   
   ```sql
   SELECT array_map(x -> array_count(y -> y = x, ['a']), ['b']);
   ```
   
   That query should return `[0]`; returning `[1]` indicates the inner lambda 
is not resolving the captured outer variable according to lexical scope.
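
   As a reference for the expected semantics, the query can be modelled with 
ordinary Python closures, which are lexically scoped. This is only an 
illustrative model: `array_map` and `array_count` here are hypothetical Python 
helpers, with `array_count` written as a count over `array_map` results to 
follow the rewrite noted earlier.

```python
def array_map(fn, arr):
    """Apply fn to every element, like a one-argument array_map."""
    return [fn(e) for e in arr]


def array_count(fn, arr):
    """Count elements for which fn is truthy, via array_map."""
    return sum(1 for v in array_map(fn, arr) if v)


def nested_query():
    # array_map(x -> array_count(y -> y = x, ['a']), ['b'])
    return array_map(lambda x: array_count(lambda y: y == x, ["a"]), ["b"])


def misbound_query():
    # Same shape, but the inner comparison ignores the captured x and
    # compares y with itself, as if both names resolved to one column.
    return array_map(lambda x: array_count(lambda y: y == y, ["a"]), ["b"])
```

   `nested_query()` evaluates to `[0]` because `'a' == 'b'` is false, while 
`misbound_query()` evaluates to `[1]`, which is consistent with the reported 
wrong result.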
   
   Suggested next steps:
   
   1. Reproduce on current `master` and the relevant 4.1 maintenance branch, 
then decide whether this needs a 4.1 backport.
   2. Make the lambda traversal in `ArrayMapFunction` scope-aware. The 
traversal for one lambda should not collect or mutate `VColumnRef` state under 
a nested lambda as if it belonged to the same lambda scope.
   3. Consider removing or isolating mutable execution-specific gap state from 
shared `VColumnRef` nodes, or clone/rebind the lambda body per scope.
   4. Add a defensive range check in `VColumnRef::execute_column()` / 
`execute_type()` or switch to `safe_get_by_position()` so an invalid expression 
binding returns a query error instead of terminating BE.
   5. Add regression coverage for both cases: the `array_agg` crash query 
should return `[1]`, and the literal nested lambda query above should return 
`[0]`.
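
   Steps 2 and 4 can be sketched together in a toy model. This is a Python 
illustration under stated assumptions, not a proposed patch: the `ColumnRef`, 
`Lambda`, and `Call` classes, `bind_lambda()`, `checked_column()`, and 
`InvalidBindingError` are hypothetical names, and binding by parameter name with 
shadowing is only one possible scoping rule.

```python
from dataclasses import dataclass, field


@dataclass
class ColumnRef:
    name: str        # lambda parameter this reference names
    column_id: int
    gap: int = 0


@dataclass
class Lambda:
    params: list     # parameter names introduced by this lambda
    body: object


@dataclass
class Call:
    name: str
    args: list = field(default_factory=list)


class InvalidBindingError(RuntimeError):
    """Raised instead of reading outside the block."""


def bind_params(expr, params, gap):
    """Set gap on references to params, honouring lexical shadowing."""
    if isinstance(expr, ColumnRef):
        if expr.name in params:
            expr.gap = gap  # rebinding is allowed on every execution
    elif isinstance(expr, Lambda):
        # A nested lambda that redeclares a name shadows the outer binding.
        visible = [p for p in params if p not in expr.params]
        if visible:
            bind_params(expr.body, visible, gap)
    elif isinstance(expr, Call):
        for arg in expr.args:
            bind_params(arg, params, gap)


def bind_lambda(lam, gap):
    """Bind only the references that belong to this lambda's parameters."""
    bind_params(lam.body, lam.params, gap)


def checked_column(block, position):
    """Bounds-checked lookup: report a bad binding as an error."""
    if not 0 <= position < len(block):
        raise InvalidBindingError(
            f"column position {position} out of range for {len(block)} columns"
        )
    return block[position]


def build_example():
    """Build x -> array_count(y -> y = x) with literal arrays omitted."""
    x = ColumnRef("x", 0)
    y = ColumnRef("y", 0)
    inner = Lambda(["y"], Call("eq", [y, x]))
    outer = Lambda(["x"], Call("array_count", [inner]))
    return outer, inner, x, y
```

   With this rule, binding the outer lambda touches only `x`, binding the inner 
lambda touches only `y`, and an out-of-range position surfaces as an error 
rather than an unchecked read.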
   
   Missing information that would help close the loop, but is not required to 
classify this as a real bug: full BE log/core stack around the crash, `EXPLAIN 
VERBOSE` for the minimal query, and a confirmation on x86_64 or current 
master/branch-4.1.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

