andygrove commented on code in PR #1163:
URL: https://github.com/apache/datafusion-comet/pull/1163#discussion_r1900457315
##########
spark/src/test/scala/org/apache/comet/CometExpressionSuite.scala:
##########
@@ -2390,4 +2390,14 @@ class CometExpressionSuite extends CometTestBase with AdaptiveSparkPlanHelper {
       checkSparkAnswer(df.select("arrUnsupportedArgs"))
     }
   }
+
+  test("array_contains") {
+    withTempDir { dir =>
+      val path = new Path(dir.toURI.toString, "test.parquet")
+      makeParquetFileAllTypes(path, dictionaryEnabled = false, n = 10000)
+      spark.read.parquet(path.toString).createOrReplaceTempView("t1");
+      checkSparkAnswerAndOperator(
+        spark.sql("SELECT array_contains(array(_2, _3, _4), _2) FROM t1"))

Review Comment:
   Spark analysis will fail if there are null literals in the SQL, but the test I suggested adding does not do that. It generates null values in the input to `array_append`. Here is the full test:

   ```scala
   test("array_contains") {
     withTempDir { dir =>
       val path = new Path(dir.toURI.toString, "test.parquet")
       makeParquetFileAllTypes(path, dictionaryEnabled = false, n = 10000)
       spark.read.parquet(path.toString).createOrReplaceTempView("t1")
       checkSparkAnswerAndOperator(
         spark.sql("SELECT array_contains(array(_2, _3, _4), _2) FROM t1"))
       checkSparkAnswerAndOperator(
         spark.sql("SELECT array_append((CASE WHEN _2 = _3 THEN array(_4) END), _4) FROM t1"))
     }
   }
   ```

   This test passes, so the behavior is already correct. We just need the test to prove that and to guard against regressions in the future.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
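[Editor's illustrative sketch] The null behavior the suggested test exercises can be modeled outside Spark: `CASE WHEN _2 = _3 THEN array(_4) END` has no `ELSE` branch, so it yields NULL whenever the condition is false or NULL, and Spark's `array_append` propagates a NULL array argument to a NULL result. The toy Python functions below (hypothetical names, not Spark APIs) model that semantics under those assumptions:

```python
from typing import Optional

def array_append(arr: Optional[list], elem) -> Optional[list]:
    """Model of Spark SQL array_append: a NULL array propagates to a NULL result."""
    if arr is None:
        return None
    return arr + [elem]

def case_when(cond: Optional[bool], then_value):
    """Model of CASE WHEN cond THEN then_value END (no ELSE branch -> NULL)."""
    return then_value if cond else None

# Row where _2 = _3 holds: the CASE yields a non-null array and _4 is appended.
print(array_append(case_when(True, [7]), 7))   # -> [7, 7]
# Row where _2 = _3 is false (or NULL): the whole expression evaluates to NULL.
print(array_append(case_when(False, [7]), 7))  # -> None
```

Because `makeParquetFileAllTypes` produces rows on both sides of the `_2 = _3` comparison, the suggested test covers both the non-null and NULL array paths without ever putting a null literal in the SQL text, which is what would trip Spark's analyzer.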