andygrove commented on code in PR #1163:
URL: https://github.com/apache/datafusion-comet/pull/1163#discussion_r1900457315
##########
spark/src/test/scala/org/apache/comet/CometExpressionSuite.scala:
##########
@@ -2390,4 +2390,14 @@ class CometExpressionSuite extends CometTestBase with AdaptiveSparkPlanHelper {
       checkSparkAnswer(df.select("arrUnsupportedArgs"))
     }
   }
+
+  test("array_contains") {
+    withTempDir { dir =>
+      val path = new Path(dir.toURI.toString, "test.parquet")
+      makeParquetFileAllTypes(path, dictionaryEnabled = false, n = 10000)
+      spark.read.parquet(path.toString).createOrReplaceTempView("t1");
+      checkSparkAnswerAndOperator(
+        spark.sql("SELECT array_contains(array(_2, _3, _4), _2) FROM t1"))

Review Comment:
   Spark analysis will fail if there are null literals in the SQL, but the test I suggested adding does not do that. It generates null values in the input to `array_append`. Here is the full test:

   ```scala
   test("array_contains") {
     withTempDir { dir =>
       val path = new Path(dir.toURI.toString, "test.parquet")
       makeParquetFileAllTypes(path, dictionaryEnabled = false, n = 10000)
       spark.read.parquet(path.toString).createOrReplaceTempView("t1")
       checkSparkAnswerAndOperator(
         spark.sql("SELECT array_contains(array(_2, _3, _4), _2) FROM t1"))
       checkSparkAnswerAndOperator(
         spark.sql("SELECT array_append((CASE WHEN _2 = _3 THEN array(_4) END), _4) FROM t1"))
     }
   }
   ```

   This test passes, so the behavior is already correct. We just need the test to prove that and to guard against regressions in the future.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
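[Editor's illustrative sketch] The null behavior the suggested test exercises can be modeled outside Spark: `CASE WHEN _2 = _3 THEN array(_4) END` has no `ELSE` branch, so it yields NULL whenever the condition is false or NULL, and Spark's `array_append` propagates a NULL array argument to a NULL result. The toy Python functions below (hypothetical names, not Spark APIs) model that semantics under those assumptions:

```python
from typing import Optional

def array_append(arr: Optional[list], elem) -> Optional[list]:
    """Model of Spark SQL array_append: a NULL array propagates to a NULL result."""
    if arr is None:
        return None
    return arr + [elem]

def case_when(cond: Optional[bool], then_value):
    """Model of CASE WHEN cond THEN then_value END (no ELSE branch -> NULL)."""
    return then_value if cond else None

# Row where _2 = _3 holds: the CASE yields a non-null array and _4 is appended.
print(array_append(case_when(True, [7]), 7))   # -> [7, 7]
# Row where _2 = _3 is false (or NULL): the whole expression evaluates to NULL.
print(array_append(case_when(False, [7]), 7))  # -> None
```

Because `makeParquetFileAllTypes` produces rows on both sides of the `_2 = _3` comparison, the suggested test covers both the non-null and NULL array paths without ever putting a null literal in the SQL text, which is what would trip Spark's analyzer.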