liamphmurphy opened a new issue, #15338: URL: https://github.com/apache/datafusion/issues/15338
### Describe the bug

I first hit this bug during schema evolution on Delta tables using the `delta-rs` library. Whenever a schema evolution occurs on a table that contains a field with a list of structs, DataFusion returns this error:

```
This feature is not implemented: Unsupported CAST from Struct([Field { name: "properties", data_type: Struct([Field { name: "someNewField", data_type: Utf8, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: "fields", data_type: List(Field { name: "item", data_type: Struct([Field { name: "messageId", data_type: Utf8, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }]), nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }), nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }]), nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }]) to Struct([Field { name: "properties", data_type: Struct([Field { name: "fields", data_type: List(Field { name: "element", data_type: Struct([Field { name: "messageId", data_type: Utf8, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }]), nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }), nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: "someNewField", data_type: Utf8, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }]), nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }])
```

### To Reproduce

Below is Python code using delta-rs (which is currently on DataFusion 46) that reproduces this error:

```python
import pyarrow as pa
from deltalake import write_deltalake

# Define the path for the Delta table
delta_table_path = "./datafusion-repro-test-table"

# Define the data for the first write
data_first_write = [
    {
        "uid": "ws_2",
        "event": {
            "properties": {
                "fields": [
                    {"messageId": "veniam sed et elit adipisicing"}
                ],
            },
        },
    }
]

schema = pa.schema([
    pa.field("uid", pa.string()),
    pa.field("event", pa.struct([
        pa.field("properties", pa.struct([
            pa.field("fields", pa.list_(pa.struct([
                pa.field("messageId", pa.string()),
            ]))),
        ])),
    ])),
])

print(schema)

first_write = pa.Table.from_pylist(data_first_write, schema=schema)

# Write data to Delta table for the first write
write_deltalake(delta_table_path, first_write, mode="append", engine="rust", schema_mode="merge")

#### NOW FOR THE SECOND WRITE THAT BREAKS ####
data_second_write = [
    {
        "uid": "ws_2",
        "event": {
            "properties": {
                "someNewField": "test-value",  # New field
                "fields": [
                    {"messageId": "veniam sed et elit adipisicing"}
                ],
            },
        },
    }
]

second_schema = pa.schema([
    pa.field("uid", pa.string()),
    pa.field("event", pa.struct([
        pa.field("properties", pa.struct([
            pa.field("someNewField", pa.string()),  # New field
            pa.field("fields", pa.list_(pa.struct([
                pa.field("messageId", pa.string()),
            ]))),
        ])),
    ])),
])

second_write = pa.Table.from_pylist(data_second_write, schema=second_schema)

# Write data to Delta table for the second write
write_deltalake(delta_table_path, second_write, mode="append", engine="rust", schema_mode="merge")
```

### Expected behavior

DataFusion should support casting between schemas when the schema contains a list of structs.

### Additional context

Originating bug report in delta-rs: https://github.com/delta-io/delta-rs/issues/3339
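
The error message is hard to read in its one-line form, so here is a minimal sketch that makes the mismatch explicit. It uses a hypothetical, simplified model of the two types (plain Python lists and tuples, not Arrow or DataFusion objects), and `diff_types` is an illustrative helper, not part of any library. Walking the two types from the error side by side shows only two differences: the field order inside `properties`, and the list item name (`item` vs `element`).

```python
# Hypothetical simplified model of the types printed in the error message:
#   struct -> list of (field_name, type) pairs
#   list   -> ("list", item_name, item_type)
#   leaf   -> a string such as "Utf8"

def diff_types(source, target, path="event"):
    """Return human-readable differences between two simplified types."""
    diffs = []
    if isinstance(source, list) and isinstance(target, list):  # both structs
        src_names = [name for name, _ in source]
        tgt_names = [name for name, _ in target]
        if src_names != tgt_names:
            diffs.append(f"{path}: field order {src_names} != {tgt_names}")
        tgt_fields = dict(target)
        # Compare children by name, so reordering alone is reported only once.
        for name, child in source:
            if name in tgt_fields:
                diffs.extend(diff_types(child, tgt_fields[name], f"{path}.{name}"))
    elif isinstance(source, tuple) and isinstance(target, tuple):  # both lists
        _, src_item, src_child = source
        _, tgt_item, tgt_child = target
        if src_item != tgt_item:
            diffs.append(f"{path}: list item name {src_item!r} != {tgt_item!r}")
        diffs.extend(diff_types(src_child, tgt_child, f"{path}[]"))
    elif source != target:
        diffs.append(f"{path}: {source} != {target}")
    return diffs


message_struct = [("messageId", "Utf8")]

# "CAST from" side of the error (the second write).
source = [("properties", [
    ("someNewField", "Utf8"),
    ("fields", ("list", "item", message_struct)),
])]

# "to" side of the error (the merged table schema).
target = [("properties", [
    ("fields", ("list", "element", message_struct)),
    ("someNewField", "Utf8"),
])]

diffs = diff_types(source, target)
for line in diffs:
    print(line)
```

Under this model the leaf types and the inner `messageId` struct are identical on both sides, which suggests the cast only needs to match struct fields by name and ignore the list item name; whether that is the right fix in DataFusion itself is an assumption, not something this sketch demonstrates.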