alamb commented on issue #15162: URL: https://github.com/apache/datafusion/issues/15162#issuecomment-2722640911
> IMO if Spark has specific schema requirements, I'm not sure I see a way to avoid coercing at the boundary; it will be an indefinite game of whack-a-mole otherwise (not just for lists).

So the proposal as I understand it is to implement something like the following function, which is called on all batches prior to returning them to Spark:

```rust
/// Converts the schema of `batch` to one suitable for Spark's conventions
///
/// Note: only the schema is converted, no data is copied
///
/// Transformations applied:
/// * The names of the fields in `DataType::List` are changed to "element"
/// ....
fn coerce_schema_for_spark(batch: RecordBatch) -> Result<RecordBatch> { ... }
```
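To illustrate the kind of recursion such a function would need, here is a minimal sketch of the field-renaming step. It uses simplified stand-in types (`Field`, `DataType`) rather than the real arrow-rs types, purely to show the shape of the logic; an actual implementation would operate on Arrow's `Schema`/`Field` and rebuild the `RecordBatch` with the new schema, which is a metadata-only change.

```rust
// Simplified stand-ins for Arrow's schema types, used only to sketch the
// recursive renaming; these are NOT the real arrow-rs definitions.
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int64,
    Utf8,
    List(Box<Field>),
}

#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    data_type: DataType,
}

/// Recursively renames the child field of every `List` to "element",
/// matching Spark's convention. Only the schema description changes;
/// in Arrow this would not copy any array data.
fn coerce_field_for_spark(field: &Field) -> Field {
    match &field.data_type {
        DataType::List(child) => {
            // Recurse first so nested lists are handled, then rename.
            let mut new_child = coerce_field_for_spark(child);
            new_child.name = "element".to_string();
            Field {
                name: field.name.clone(),
                data_type: DataType::List(Box::new(new_child)),
            }
        }
        other => Field {
            name: field.name.clone(),
            data_type: other.clone(),
        },
    }
}

fn main() {
    // A list-of-list column whose inner fields use the default "item" name.
    let col = Field {
        name: "values".to_string(),
        data_type: DataType::List(Box::new(Field {
            name: "item".to_string(),
            data_type: DataType::List(Box::new(Field {
                name: "item".to_string(),
                data_type: DataType::Int64,
            })),
        })),
    };
    let coerced = coerce_field_for_spark(&col);
    println!("{:?}", coerced);
}
```

The same pattern would extend to other nested types (structs, maps) by adding arms to the match, which is why doing the coercion once at the boundary is attractive compared to patching each operator individually.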