comphead opened a new issue, #15162: URL: https://github.com/apache/datafusion/issues/15162
### Is your feature request related to a problem or challenge?

While implementing support for ARRAY types from Apache Spark in Apache DataFusion Comet, we found that the hardcoded name of the inner list field differs between arrow-rs and Apache Spark.

The inner ListType field name is hardcoded to `item` in https://github.com/apache/arrow-rs/blob/f4fde769ab6e1a9b75f890b7f8b47bc22800830b/arrow-schema/src/field.rs#L130

However, it is `element` in Apache Spark:

```
scala> spark.sql("select array(1, 2, 3)").printSchema
root
 |-- array(1, 2, 3): array (nullable = false)
 |    |-- element: integer (containsNull = false)
```

Because of this discrepancy, schema validation fails when the record batch is created:

```
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 309.0 failed 1 times, most recent failure: Lost task 0.0 in stage 309.0 (TID 797) (Mac-1741305812954.local executor driver): org.apache.comet.CometNativeException: Invalid argument error: column types must match schema types, expected List(Field { name: "element", data_type: Int8, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }) but found List(Field { name: "item", data_type: Int8, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }) at column index 0
```

In DataFusion, the list creation method `Field::new_list_field`, which uses the hardcoded field name, is heavily used. The idea of this ticket is to find a way to parametrize the name.

- Replace `Field::new_list_field` with `Field::new`, which allows providing a custom name. However, these methods are often called from contexts where no `SessionContext` exists, so there is no way to access a config variable through which the new name could be set.
- Make the name parametrized in arrow-rs. Unfortunately, there is no external config in arrow-rs.
  It is possible to leverage environment variables, but this is usually not a good approach.
- Change `RecordBatch::try_new` so that, for list types, it does not check the inner field name and only checks the inner data type.

Related: https://github.com/apache/datafusion-comet/pull/1456

### Describe the solution you'd like

_No response_

### Describe alternatives you've considered

_No response_

### Additional context

_No response_

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
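As an illustration of the last option in the list above (ignoring the inner field name for list types and comparing only the inner data type), here is a minimal, self-contained sketch. The `DataType` and `Field` types below are hypothetical, simplified stand-ins and not the arrow-rs types; the helper name `types_match_ignoring_list_field_names` is likewise made up for this example. Keeping the nullability check in the relaxed comparison is an assumption; the issue text only mentions the inner data type.

```rust
// Hypothetical, simplified stand-ins for the arrow-rs `DataType` and `Field`
// types. They exist only to make this sketch self-contained.
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int8,
    Int32,
    List(Box<Field>),
}

#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    data_type: DataType,
    nullable: bool,
}

impl Field {
    fn new(name: &str, data_type: DataType, nullable: bool) -> Self {
        Field { name: name.to_string(), data_type, nullable }
    }
}

/// Relaxed comparison: for list types, the name of the inner field is ignored
/// and only its data type (recursively) and nullability are compared.
/// Checking nullability is an assumption made for this sketch.
fn types_match_ignoring_list_field_names(a: &DataType, b: &DataType) -> bool {
    match (a, b) {
        (DataType::List(fa), DataType::List(fb)) => {
            fa.nullable == fb.nullable
                && types_match_ignoring_list_field_names(&fa.data_type, &fb.data_type)
        }
        // Non-list types fall back to strict equality.
        _ => a == b,
    }
}

fn main() {
    // Spark names the inner field "element"; arrow-rs defaults to "item".
    let spark = DataType::List(Box::new(Field::new("element", DataType::Int8, true)));
    let arrow = DataType::List(Box::new(Field::new("item", DataType::Int8, true)));

    // Strict equality fails because the inner field names differ.
    assert_ne!(spark, arrow);
    // The relaxed comparison accepts the pair.
    assert!(types_match_ignoring_list_field_names(&spark, &arrow));

    // A genuinely different element type is still rejected.
    let other = DataType::List(Box::new(Field::new("item", DataType::Int32, true)));
    assert!(!types_match_ignoring_list_field_names(&spark, &other));
}
```

The strict comparison mirrors the failure in the error message above, where the two types differ only in the inner field name. Whether such a relaxation belongs in `RecordBatch::try_new` itself or in a separate validation path is left open here.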