andygrove opened a new pull request, #4439:
URL: https://github.com/apache/datafusion-comet/pull/4439

   ## Which issue does this PR close?
   
   N/A. Autonomous audit pass.
   
   ## Rationale for this change
   
   Audit of the `Average` (`avg`) aggregate expression against Spark 3.4.3, 
3.5.8, and 4.0.1. The aggregate logic is identical across all three versions 
(4.0.1 only changes a `QueryContext` import path). The Comet serde and the Rust 
`Avg` / `AvgDecimal` accumulators correctly handle numeric and decimal inputs, 
including ANSI-mode decimal overflow. The audit found one inaccurate 
user-facing string in the serde and several uncovered edge cases that are now 
exercised.
   
   ## What changes are included in this PR?
   
   - Added audit sub-bullets to `spark_expressions_support.md` recording the 
audit dates and the per-version findings for 3.4.3, 3.5.8, and 4.0.1.
   - Corrected `CometAverage.getIncompatibleReasons` text. The previous text 
claimed "Falls back to Spark in ANSI mode. Supports all numeric inputs except 
decimal types", neither of which is accurate: ANSI mode is wired through to the 
native `AvgDecimal` accumulator, and decimal inputs are supported via 
`avgDataTypeSupported`. The new text describes the real caveat: Comet falls 
back to Spark for `YearMonthIntervalType` and `DayTimeIntervalType` inputs 
(which Spark has supported since 3.4).
   - Expanded `expressions/aggregate/avg.sql` with new SQL test cases: 
single-row group; tinyint and smallint inputs; all-NULL groups; empty input; 
double NaN / +Infinity / -Infinity mixes; Long boundary values; negative-only 
inputs; decimal at precision 20; cross-check against `count`.
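
For readers unfamiliar with why these edge cases matter, the semantics they probe can be sketched in a few lines of Rust. This is a minimal illustration only, not the Comet `Avg` accumulator: `AvgSketch` and `avg` are hypothetical names, and the sketch assumes a non-decimal input that is summed as a double, so decimal and ANSI overflow behavior is out of scope.

```rust
/// Hypothetical stand-in for a floating-point average accumulator.
/// It is NOT the Comet implementation; it only models sum/count semantics.
#[derive(Default)]
struct AvgSketch {
    sum: f64,
    count: u64,
}

impl AvgSketch {
    /// NULL inputs (`None`) are skipped, as SQL aggregates ignore NULLs.
    fn update(&mut self, value: Option<f64>) {
        if let Some(v) = value {
            self.sum += v;
            self.count += 1;
        }
    }

    /// An empty or all-NULL group yields NULL (`None`), not 0 or NaN.
    fn evaluate(&self) -> Option<f64> {
        if self.count == 0 {
            None
        } else {
            Some(self.sum / self.count as f64)
        }
    }
}

/// Convenience wrapper: average a slice of nullable doubles.
fn avg(values: &[Option<f64>]) -> Option<f64> {
    let mut acc = AvgSketch::default();
    for v in values {
        acc.update(*v);
    }
    acc.evaluate()
}
```

In this sketch an empty slice and an all-`None` slice both evaluate to `None`, a single-row group returns its own value, any NaN input makes the result NaN, and mixing `+Infinity` with `-Infinity` also produces NaN because the running sum is `inf + (-inf)`. Integer inputs would be cast to `f64` before calling `update`, which is why Long boundary values are worth testing: they are not exactly representable as doubles.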
   
   ## How are these changes tested?
   
   - `./mvnw test -DwildcardSuites=CometSqlFileTestSuite -Dsuites="org.apache.comet.CometSqlFileTestSuite avg" -Dtest=none` 
(passes locally; all new queries match Spark)
   
   Scaffolded by the `audit-comet-expression-autonomous` skill.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

