Jefffrey commented on PR #17085:
URL: https://github.com/apache/datafusion/pull/17085#issuecomment-3379521826

   > Not at all 😄
   > 
   >     * `acc_args.schema.field(i)` — returns the raw Arrow `Field` from the 
(physical) input schema at position `i` (name, type, nullability, metadata 
exactly as in that schema).
   > 
   >     * `acc_args.exprs[i].return_field(&acc_args.schema)?` — asks the 
expression for the effective `FieldRef` for argument `i` given the full schema. 
It incorporates expression semantics (casts, literals, computed types, 
extension metadata, nullability changes, etc.) and returns an owned/Arc 
`FieldRef` (and can fail), not just a borrowed `&Field`.
   
   Thank you for the explanations, I'm still trying to wrap my head around all 
the parts involved here 😅 
   
   So I think my main confusion lies around the difference between the physical 
input schema, and the effective `FieldRef` argument; is there a reason we 
provide the ability to access both? This fix only synthesizes a schema if the 
physical schema is missing as you mentioned, but would it be incorrect 
behaviour to instead _always_ synthesize the schema from the physical schema 
(whether present or not)?
   
   If we look at scalar & window functions, I don't see them having equivalent 
logic around providing direct access to the physical schema, instead they 
provide methods to directly access `Field`s:
   
   - 
https://docs.rs/datafusion/50.0.0/datafusion/logical_expr/function/struct.ExpressionArgs.html
   - 
https://docs.rs/datafusion/50.0.0/datafusion/logical_expr/function/struct.PartitionEvaluatorArgs.html
   - 
https://docs.rs/datafusion/50.0.0/datafusion/logical_expr/struct.ScalarFunctionArgs.html
   
   I'm trying to understand why `AccumulatorArgs` seems to be the odd one out 
here; I'm sure there's some historical reason but the limited existing 
documentation on `AccumulatorArgs` makes it hard for me to reason that this fix 
is the correct approach 🤔 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to