andygrove opened a new pull request, #4436: URL: https://github.com/apache/datafusion-comet/pull/4436
## Which issue does this PR close?

N/A. Autonomous audit pass.

## Rationale for this change

Audit of the `any` expression against Spark 3.4.3, 3.5.8, and 4.0.1.

`any` is registered in Spark's `FunctionRegistry` as a SQL alias of `BoolOr`, which extends `RuntimeReplaceableAggregate` with `replacement = Max(child)`. The Catalyst analyzer rewrites `any(x)` (and the siblings `some(x)` and `bool_or(x)`) to `max(x)` before Comet sees the plan, so the expression is served by `CometMax` on a `BooleanType` column.

`boolAggregates.scala` is byte-identical across the three audited Spark versions, so no shim is required.

The only finding was a test-coverage gap: the existing `test bool_and/bool_or` in `CometAggregateSuite` covers a single dataset of four rows with no NULLs, no empty group, no group-by, and no `any`/`some` alias.

## What changes are included in this PR?

- Audit sub-bullets under `any` in `docs/source/contributor-guide/spark_expressions_support.md`
- New SQL-file test `spark/src/test/resources/sql-tests/expressions/aggregate/any.sql` covering column, literal, NULL, empty, all-NULL, group-by, and HAVING inputs for `any`, `some`, and `bool_or`

## How are these changes tested?

```
./mvnw test -Dsuites="org.apache.comet.CometSqlFileTestSuite any" -Dtest=none -Dscalastyle.skip=true
```

Passes locally (1 test, 0 failures).

Scaffolded by the `audit-comet-expression-autonomous` skill.
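To illustrate why rewriting `any(x)` to `max(x)` preserves the result on a boolean column, here is a minimal Python sketch. It is a toy model, not Spark or Comet code: the names `sql_max` and `bool_or` are hypothetical, and it assumes the usual SQL aggregate semantics in which NULL inputs are skipped, an empty or all-NULL input yields NULL, and `false` orders before `true`.

```python
from typing import Iterable, Optional


def sql_max(values: Iterable[Optional[bool]]) -> Optional[bool]:
    """Toy model of SQL max over a nullable boolean column.

    NULLs (None) are skipped; if no non-NULL value remains the result is NULL.
    Because False < True, the max is True exactly when any input is True.
    """
    non_null = [v for v in values if v is not None]
    return max(non_null) if non_null else None


def bool_or(values: Iterable[Optional[bool]]) -> Optional[bool]:
    # Hypothetical stand-in for any/some/bool_or after the rewrite to max.
    return sql_max(values)


# Input shapes similar to those the PR description lists for the new test file.
cases = {
    "mixed": [False, True, False],
    "with NULL": [False, None, True],
    "all false": [False, False],
    "all NULL": [None, None],
    "empty": [],
}

for name, rows in cases.items():
    print(f"{name}: {bool_or(rows)}")
```

The `all NULL` and `empty` cases are the ones a plain logical OR would get wrong: they return NULL rather than `false`, which is the behavior that follows from `max`.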
