kosiew commented on PR #17085:
URL: https://github.com/apache/datafusion/pull/17085#issuecomment-3356523033

   > Is this a band-aid fix? Is there a root cause we should be looking for instead?
   > There's a heavy emphasis on the word "synthesize" throughout this PR but I don't know what it means to "synthesize" a schema from literal expressions 🤔
   
   AggregateExprBuilder already captures a FieldRef for every argument 
(including literals) by calling each physical expression’s return_field during 
construction, so we retain the full Arrow metadata for those inputs in 
input_fields. 
   
   The new args_schema helper detects when the physical input schema is 
empty. This legitimately happens when an aggregate is invoked with only 
literal arguments and the child plan has no columns. In that case the helper 
rebuilds a Schema from the stored input_fields so the accumulator can still 
see that metadata. 
   
   We then hand that schema to every AccumulatorArgs we build, so UDAFs observe 
the same field information whether their inputs were columns or literals. In 
other words, “synthesize” means “wrap the already-computed argument fields in a 
temporary Schema when the physical schema is empty”; there isn’t another layer 
hiding the real root cause.
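
A minimal sketch of the fallback described above. It uses simplified stand-in types instead of Arrow's real `Field` and `Schema`, and the signature and `Cow` return type are assumptions for illustration, not the PR's actual code:

```rust
use std::borrow::Cow;
use std::collections::HashMap;
use std::sync::Arc;

/// Hypothetical, simplified stand-in for an Arrow field (name, type, metadata).
#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    data_type: String,
    metadata: HashMap<String, String>,
}

type FieldRef = Arc<Field>;

/// Hypothetical, simplified stand-in for an Arrow schema.
#[derive(Debug, Clone, PartialEq, Default)]
struct Schema {
    fields: Vec<FieldRef>,
}

/// Returns the physical schema unchanged when it has columns.
/// When it is empty (literal-only aggregate), wraps the already-computed
/// argument fields in a temporary schema so their metadata stays visible.
fn args_schema<'a>(physical: &'a Schema, input_fields: &[FieldRef]) -> Cow<'a, Schema> {
    if physical.fields.is_empty() {
        Cow::Owned(Schema {
            fields: input_fields.to_vec(),
        })
    } else {
        Cow::Borrowed(physical)
    }
}
```

With an empty physical schema, the returned schema carries the literal's field and its metadata; with a non-empty one, the physical schema is borrowed as-is and no copy is made.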



