crm26 opened a new pull request, #22542:
URL: https://github.com/apache/datafusion/pull/22542

   ## Which issue does this PR close?
   
   Partial of #21536 — `array_sum` (first of the array aggregates in the 
series).
   
   ## Rationale for this change
   
   Continues the per-function split sequence requested by @alamb on #21536. 
Four sibling PRs already merged: `cosine_distance` (#21542), `inner_product` 
(#21861), `array_normalize` (#22013), `array_scale` (#22466). `array_add` is in 
flight as #22459 by @SubhamSinghal.
   
   `array_sum` is the first of the three array-aggregate functions (sum, 
product, avg). Its semantics set the pattern for the other two aggregates.
   
   ## What changes are included in this PR?
   
   - New scalar UDF `array_sum(array)` in 
`datafusion/functions-nested/src/array_sum.rs`
   - Module wire-up + registration in `datafusion/functions-nested/src/lib.rs`
   - SLT tests at `datafusion/sqllogictest/test_files/array_sum.slt`
   - Auto-generated docs entry in 
`docs/source/user-guide/sql/scalar_functions.md`
   
   **Signature:** `List/LargeList/FixedSizeList<numeric>` in, `Float64` out 
(one scalar per row). Numeric inner types are coerced to `Float64`.
   
   **NULL semantics follow the SQL aggregate convention (a deliberate 
divergence from the binary-op siblings):**
   - NULL row → NULL row out
   - NULL elements are **skipped**, matching PostgreSQL `array_sum`, DuckDB 
`list_sum`, Spark `aggregate`. The binary-op siblings (`inner_product`, 
`array_normalize`) return a NULL row when any element is NULL because their 
per-element operation is undefined on NULL; aggregates conventionally skip 
NULLs in SQL.
   - All-NULL row → NULL out (matches `SUM(...)` over an all-NULL column)
   
   **Empty array → `0.0`** (additive identity; matches DuckDB `list_sum([]) 
= 0`).
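
The per-row rules above can be sketched in plain Rust. This is only an illustration of the described semantics, not the code in `array_sum.rs`: the function name `array_sum_row` and the `Option<&[Option<f64>]>` row representation are hypothetical stand-ins for an Arrow list row, and the elements are assumed to be already coerced to `Float64`.

```rust
/// Hypothetical model of one list row: `None` is a NULL row,
/// `Some(slice)` is a list whose elements may themselves be NULL.
fn array_sum_row(row: Option<&[Option<f64>]>) -> Option<f64> {
    // NULL row -> NULL out.
    let elems = row?;

    // Empty array -> 0.0 (additive identity).
    if elems.is_empty() {
        return Some(0.0);
    }

    // Skip NULL elements; remember whether any non-NULL value was seen.
    let mut sum = 0.0;
    let mut seen_value = false;
    for v in elems.iter().flatten() {
        sum += v;
        seen_value = true;
    }

    // All-NULL (non-empty) row -> NULL out, like SUM over an all-NULL column.
    if seen_value {
        Some(sum)
    } else {
        None
    }
}
```

Under this model a row of `[1.0, NULL, 2.5]` sums to `3.5`, an empty row gives `0.0`, and both a NULL row and a row of `[NULL, NULL]` give NULL. The only case that needs separate handling is telling an empty list apart from a non-empty list with no non-NULL values.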
   
   **Alias:** `list_sum` (matches the precedent of 
`array_normalize`→`list_normalize`, `array_scale`→`list_scale`).
   
   ## Are these changes tested?
   
   Yes. SLT covers happy paths, empty arrays, NULL row, NULL elements (mix + 
all-NULL), all list variants (List/LargeList/FixedSizeList), numeric coercion 
(Float32/Int64/integer literals), multi-row composition, error paths, return 
type, and the `list_sum` alias.
   
   ## Are there any user-facing changes?
   
   Yes: a new SQL scalar function `array_sum(array)` and its alias 
`list_sum`.



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to