Kimahriman commented on code in PR #731:
URL: https://github.com/apache/datafusion-comet/pull/731#discussion_r1693649707
##########
spark/src/main/scala/org/apache/spark/sql/comet/CometRowToColumnarExec.scala:
##########
@@ -60,8 +62,17 @@ case class CometRowToColumnarExec(child: SparkPlan)
val timeZoneId = conf.sessionLocalTimeZone
val schema = child.schema
- child
- .execute()
+ val rdd: RDD[InternalRow] = if (child.supportsColumnar) {
+ child
+ .executeColumnar()
+ .mapPartitionsInternal { iter =>
+ iter.flatMap(_.rowIterator().asScala)
+ }
+ } else {
+ child.execute()
+ }
+
+ rdd
Review Comment:
There might be a more efficient way than using a row iterator to write to
the row-based Arrow writer, but since this is mostly for testing/fallback
purposes, I didn't try to figure out a faster Spark-vector-to-Arrow-vector
approach. Hopefully someone is able to add complex type support to the Comet
Parquet reader. If not, it could be worth considering the Spark reader
as a real fallback path instead of just for testing purposes.
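
The pattern in the diff (flattening an iterator of columnar batches into a single row iterator via `flatMap`) can be sketched without any Spark dependency. This is a hypothetical stand-in, not Comet code: `Batch` here plays the role of Spark's `ColumnarBatch`, and `rowIterator` mimics `ColumnarBatch.rowIterator().asScala`.

```scala
// Minimal sketch of the batch-flattening pattern used in the diff.
// `Batch` is a hypothetical stand-in for Spark's ColumnarBatch; in the
// real code, executeColumnar() yields ColumnarBatch and rowIterator()
// returns a Java iterator that is adapted with .asScala.
object BatchFlattenSketch {
  // A batch exposing its rows one at a time, like ColumnarBatch.rowIterator()
  final case class Batch(rows: Vector[Int]) {
    def rowIterator: Iterator[Int] = rows.iterator
  }

  // Equivalent in shape to: iter.flatMap(_.rowIterator().asScala)
  def flatten(batches: Iterator[Batch]): Iterator[Int] =
    batches.flatMap(_.rowIterator)

  def main(args: Array[String]): Unit = {
    val batches = Iterator(Batch(Vector(1, 2)), Batch(Vector(3)))
    println(flatten(batches).toList) // List(1, 2, 3)
  }
}
```

Because `flatMap` over iterators is lazy, rows are only materialized as the downstream Arrow writer consumes them, which is why the diff's per-partition conversion stays streaming rather than buffering whole batches as rows.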
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]