alamb commented on issue #15162: URL: https://github.com/apache/datafusion/issues/15162#issuecomment-2722640911
> IMO if Spark has specific schema requirements, I'm not sure I see a way to avoid coercing at the boundary; it will be an indefinite game of whack-a-mole otherwise (not just for lists).

So the proposal as I understand it is to implement something like the following function, which is called on all batches prior to returning them to Spark:

```rust
/// Converts the schema of `batch` to one suitable for Spark's conventions
///
/// Note: only the schema is converted, no data is copied
///
/// Transformations applied:
/// * The names of the fields in `DataType::List` are changed to "element"
/// ....
fn coerce_schema_for_spark(batch: RecordBatch) -> Result<RecordBatch> { ... }
```
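To illustrate the kind of recursion such a function would need, here is a minimal sketch of the field-renaming step. It uses simplified stand-in types (`Field`, `DataType`) rather than the real arrow-rs types, purely to show the shape of the logic; an actual implementation would operate on Arrow's `Schema`/`Field` and rebuild the `RecordBatch` with the new schema, which is a metadata-only change.

```rust
// Simplified stand-ins for Arrow's schema types, used only to sketch the
// recursive renaming; these are NOT the real arrow-rs definitions.
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int64,
    Utf8,
    List(Box<Field>),
}

#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    data_type: DataType,
}

/// Recursively renames the child field of every `List` to "element",
/// matching Spark's convention. Only the schema description changes;
/// in Arrow this would not copy any array data.
fn coerce_field_for_spark(field: &Field) -> Field {
    match &field.data_type {
        DataType::List(child) => {
            // Recurse first so nested lists are handled, then rename.
            let mut new_child = coerce_field_for_spark(child);
            new_child.name = "element".to_string();
            Field {
                name: field.name.clone(),
                data_type: DataType::List(Box::new(new_child)),
            }
        }
        other => Field {
            name: field.name.clone(),
            data_type: other.clone(),
        },
    }
}

fn main() {
    // A list-of-list column whose inner fields use the default "item" name.
    let col = Field {
        name: "values".to_string(),
        data_type: DataType::List(Box::new(Field {
            name: "item".to_string(),
            data_type: DataType::List(Box::new(Field {
                name: "item".to_string(),
                data_type: DataType::Int64,
            })),
        })),
    };
    let coerced = coerce_field_for_spark(&col);
    println!("{:?}", coerced);
}
```

The same pattern would extend to other nested types (structs, maps) by adding arms to the match, which is why doing the coercion once at the boundary is attractive compared to patching each operator individually.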