Re: [PR] feat: Use unified allocator for execution iterators [datafusion-comet]

via GitHub Wed, 03 Jul 2024 10:04:31 -0700


viirya commented on code in PR #613:
URL: https://github.com/apache/datafusion-comet/pull/613#discussion_r1664503124



##########
spark/src/test/scala/org/apache/spark/sql/CometTPCDSQuerySuite.scala:
##########
@@ -158,6 +158,11 @@ class CometTPCDSQuerySuite
     conf.set(CometConf.COMET_EXEC_ALL_OPERATOR_ENABLED.key, "true")
     conf.set(CometConf.COMET_EXEC_SHUFFLE_ENABLED.key, "true")
     conf.set(CometConf.COMET_MEMORY_OVERHEAD.key, "20g")
+    conf.set(CometConf.COMET_SHUFFLE_ENFORCE_MODE_ENABLED.key, "true")
+    conf.set("spark.sql.adaptive.coalescePartitions.enabled", "true")
+    // Disable `CometTakeOrderedAndProjectExec` because it doesn't produce same output order
+    // as Spark.
+    conf.set("spark.comet.exec.takeOrderedAndProjectExec.disabled", "true")

Review Comment:
   I think these tests should be deterministic (that's why we can compare the 
results with golden files). I'm not sure why `CometTakeOrderedAndProjectExec` 
returns results in a different order.
   
   The results are the same, but the row order differs from Spark's. I suspect 
it is related to the sorting part of `CometTakeOrderedAndProjectExec`. Since 
the sorting is delegated to DataFusion's sort/top-k operators, I need to 
investigate the failing queries specifically (e.g., q6).
   
   It is not related to the change here, though, so I will investigate it 
separately in follow-up PRs.
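   
   To illustrate what "same results, different order" can look like, here is a 
minimal sketch of one possible mechanism: two top-k implementations that both 
satisfy `ORDER BY key LIMIT k` but break ties between equal keys differently. 
This is only an assumption for illustration, not a diagnosis of q6; `Row`, 
`stableTopK`, and `reversedTieTopK` are hypothetical names and are not Comet, 
Spark, or DataFusion APIs.

```scala
// Hypothetical illustration only: none of these names exist in Comet,
// Spark, or DataFusion. It shows how ties in the sort key can yield
// different, equally valid row orders.
object TopKTieSketch {
  final case class Row(key: Int, payload: String)

  // Top-k that keeps the input order among rows with equal keys
  // (Scala's sortBy is a stable sort).
  def stableTopK(rows: Seq[Row], k: Int): Seq[Row] =
    rows.sortBy(_.key).take(k)

  // Also a valid answer for ORDER BY key LIMIT k, but ties come out in the
  // opposite order, as a different sort implementation might produce.
  def reversedTieTopK(rows: Seq[Row], k: Int): Seq[Row] =
    rows.reverse.sortBy(_.key).take(k)

  def main(args: Array[String]): Unit = {
    val rows = Seq(Row(1, "a"), Row(1, "b"), Row(2, "c"))
    val left = stableTopK(rows, 3)
    val right = reversedTieTopK(rows, 3)

    // Order-sensitive comparison (what a golden-file diff effectively does).
    println(left == right)
    // Order-insensitive comparison: the same rows are present.
    println(left.toSet == right.toSet)
  }
}
```

   If ties do turn out to be involved, the sort keys of the failing query would 
be the first thing to check; otherwise the cause is elsewhere in the sort path.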



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

