mbutrovich commented on code in PR #1226:
URL: https://github.com/apache/datafusion-comet/pull/1226#discussion_r1905609592
##########
native/spark-expr/src/cast.rs:
##########
@@ -817,17 +818,28 @@ fn cast_struct_to_struct(
     cast_options: &SparkCastOptions,
 ) -> DataFusionResult<ArrayRef> {
     match (from_type, to_type) {
-        (DataType::Struct(_), DataType::Struct(to_fields)) => {
-            let mut cast_fields: Vec<(Arc<Field>, ArrayRef)> = Vec::with_capacity(to_fields.len());
+        (DataType::Struct(from_fields), DataType::Struct(to_fields)) => {
+            // TODO some of this logic may be specific to converting Parquet to Spark
+            let mut field_name_to_index_map = HashMap::new();

Review Comment:
   I'm wondering if we'll end up adding a cache in the SchemaAdapter logic for these sorts of maps, since it looks like this would be generated for each batch?

##########
native/spark-expr/src/schema_adapter.rs:
##########
@@ -321,10 +322,26 @@ fn cast_supported(from_type: &DataType, to_type: &DataType, options: &SparkCastO
         (Timestamp(_, Some(_)), _) => can_cast_from_timestamp(to_type, options),
         (Utf8 | LargeUtf8, _) => can_cast_from_string(to_type, options),
         (_, Utf8 | LargeUtf8) => can_cast_to_string(from_type, options),
-        (Struct(from_fields), Struct(to_fields)) => from_fields
-            .iter()
-            .zip(to_fields.iter())
-            .all(|(a, b)| cast_supported(a.data_type(), b.data_type(), options)),
+        (Struct(from_fields), Struct(to_fields)) => {
+            // TODO some of this logic may be specific to converting Parquet to Spark
+            let mut field_types = HashMap::new();
+            for field in from_fields {
+                field_types.insert(field.name(), field.data_type());

Review Comment:
   Same caching comment as above, mostly as a reminder to our future selves.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org
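
Illustration of the caching idea raised in the review comments above. This is only a rough sketch, not code from the PR or from Comet's SchemaAdapter. It assumes a hypothetical `FieldIndexCache` type and uses plain field-name lists as a stand-in for Arrow `Fields`, so it compiles with the standard library alone. The point is that the name-to-index map is built once per distinct source field list and then shared across batches. The `builds` counter exists purely to make that visible.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Hypothetical stand-in for a struct's field list: the field names, in order.
type FieldNames = Vec<String>;

/// Hypothetical cache of name -> index lookups, keyed by the source field list.
/// The lookup is built once per distinct field list instead of once per batch.
#[derive(Default)]
pub struct FieldIndexCache {
    maps: HashMap<FieldNames, Arc<HashMap<String, usize>>>,
    /// How many times a lookup was actually built (illustration only).
    pub builds: usize,
}

impl FieldIndexCache {
    /// Returns the cached lookup for `from_fields`, building it on first use.
    pub fn get_or_build(&mut self, from_fields: &[String]) -> Arc<HashMap<String, usize>> {
        // Fast path: this field list has been seen before, so share the map.
        if let Some(map) = self.maps.get(from_fields) {
            return Arc::clone(map);
        }

        // Slow path: build the lookup once. With duplicate names, the last
        // occurrence wins, which is ordinary HashMap insert behaviour.
        let map: HashMap<String, usize> = from_fields
            .iter()
            .enumerate()
            .map(|(index, name)| (name.clone(), index))
            .collect();
        let map = Arc::new(map);

        self.maps.insert(from_fields.to_vec(), Arc::clone(&map));
        self.builds += 1;
        map
    }
}
```

Calling `get_or_build` twice with the same field list returns the same shared map and increments `builds` only once. Whether a cache like this belongs in the schema adapter, and what it should be keyed on (for example the full source and target struct types rather than names alone), is left open here, as it is in the review.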