Kimahriman commented on code in PR #731:
URL: https://github.com/apache/datafusion-comet/pull/731#discussion_r1695100893


##########
native/spark-expr/src/structs.rs:
##########
@@ -125,3 +125,103 @@ impl PartialEq<dyn Any> for CreateNamedStruct {
             .unwrap_or(false)
     }
 }
+
+#[derive(Debug, Hash)]
+pub struct GetStructField {
+    child: Arc<dyn PhysicalExpr>,
+    ordinal: usize,
+}
+
+impl GetStructField {
+    pub fn new(child: Arc<dyn PhysicalExpr>, ordinal: usize) -> Self {
+        Self { child, ordinal }
+    }
+
+    fn child_field(&self, input_schema: &Schema) -> DataFusionResult<Arc<Field>> {
+        match self.child.data_type(input_schema)? {
+            DataType::Struct(fields) => Ok(fields[self.ordinal].clone()),
+            data_type => Err(DataFusionError::Plan(format!(
+                "Expect struct field, got {:?}",
+                data_type
+            ))),
+        }
+    }
+}
+
+impl PhysicalExpr for GetStructField {

Review Comment:
   Don't know enough about DataFusion to really know what the difference is. On the Spark side, UDFs are usually slightly less performant, so if you don't have to use a UDF you're usually better off. It looks like DataFusion already has a [get_field](https://github.com/apache/datafusion/blob/main/datafusion/functions/src/core/getfield.rs) ScalarUDF, but that one looks up fields by name rather than by index, and it involves a lot more ceremony around checking all the input types, whereas the PhysicalExpr is tailored to what we already know about the input data.
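   To make the "tailored to the input" point concrete, here is a minimal, hypothetical sketch of what the `evaluate` half of the `GetStructField` impl (truncated in the diff above) might look like. This is not the PR's actual code; the `StructArray` downcast and the `ColumnarValue` handling are assumptions based on the usual arrow/DataFusion APIs:

   ```rust
   // Hypothetical sketch, not the actual PR code.
   fn evaluate(&self, batch: &RecordBatch) -> DataFusionResult<ColumnarValue> {
       match self.child.evaluate(batch)? {
           ColumnarValue::Array(array) => {
               // The child is known to produce a struct, so a downcast
               // plus a positional lookup is all that is needed.
               let struct_array = array
                   .as_any()
                   .downcast_ref::<StructArray>()
                   .ok_or_else(|| {
                       DataFusionError::Execution("expected StructArray".to_string())
                   })?;
               // Index-based access: no name resolution or input-type
               // dispatch, unlike the name-based get_field ScalarUDF.
               Ok(ColumnarValue::Array(Arc::clone(
                   struct_array.column(self.ordinal),
               )))
           }
           other => Err(DataFusionError::Execution(format!(
               "GetStructField expected an array input, got {:?}",
               other
           ))),
       }
   }
   ```

   The design point is that the ordinal is resolved once at planning time (on the Spark side), so the per-batch path is just an `O(1)` column fetch.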



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
