[I] Bug triage results: 2026-05-04 [datafusion-comet]

via GitHub Mon, 04 May 2026 06:22:41 -0700


andygrove opened a new issue, #4203:
URL: https://github.com/apache/datafusion-comet/issues/4203


   ## Triage summary for 2026-05-04
   
   Triaged **29** open issues that carried `requires-triage` (31 total in the 
queue; 2 skipped, see below). Labels have already been applied; please 
spot-check below and close this issue when satisfied. Any individual relabel 
can be done directly on the affected issue.
   
   Triage criteria come from 
[`docs/source/contributor-guide/bug_triage.md`](../blob/main/docs/source/contributor-guide/bug_triage.md).
 Most issues in this batch trace back to the Spark 4.1.1 enablement work 
(#4098), so a number of correctness regressions are flagged as 
`priority:critical` even though the affected version is not yet a supported 
target.
   
   ### Counts by priority applied
   
   - `priority:critical`: 5
   - `priority:high`: 0
   - `priority:medium`: 14
   - `priority:low`: 10
   
   ### Triaged
   
   #### priority:critical
   
   - Spark 4.1: bloom filter result mismatch (might_contain returns wrong 
answers) ([#4193](https://github.com/apache/datafusion-comet/issues/4193))
     - Area labels: `area:expressions`, `spark 4`
     - Rationale: silent wrong `might_contain` results for the same input; 
the guide ranks correctness bugs above crashes.
   - Spark 4.1: native parquet reader returns wrong rows for user-defined 
struct schema ([#4192](https://github.com/apache/datafusion-comet/issues/4192))
     - Area labels: `area:scan`, `spark 4`, `native_datafusion`, 
`native_iceberg_compat`
     - Rationale: native reader returns different rows than Spark for 
`c0: struct<y:int,x:string>`; silent wrong results.
   - Native scan path doesn't honour Parquet field-ID matching when 
`spark.sql.parquet.fieldId.read.enabled=true` 
([#4189](https://github.com/apache/datafusion-comet/issues/4189))
     - Area labels: `area:scan`, `native_datafusion`
     - Rationale: by-name resolution silently returns wrong data when names and 
IDs disagree (Delta column-mapping `id` mode).
   - SPARK-53968 SQLViewSuite: decimal arithmetic returns ~10x smaller values 
through view CTE on Spark 4.1.1 
([#4124](https://github.com/apache/datafusion-comet/issues/4124))
     - Area labels: `area:expressions`, `spark 4`
     - Rationale: `unit_price + COALESCE(shipping_price, 0)` returns 
decimals roughly 10x too small; silent wrong results in arithmetic.
   - EXCEPT ALL / INTERSECT ALL with GROUP BY return incorrect results on Spark 
4.1.1 ([#4122](https://github.com/apache/datafusion-comet/issues/4122))
     - Area labels: `area:aggregation`, `spark 4`
     - Rationale: extra/missing rows for both EXCEPT ALL and INTERSECT ALL 
queries; silent wrong results.
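
For reviewers spot-checking #4122, the multiset semantics that standard SQL defines for EXCEPT ALL and INTERSECT ALL can be sketched as follows. This is a minimal illustration in plain Python, not Comet or Spark code; the helper names `except_all` and `intersect_all` and the sample rows are assumptions made up for this example.

```python
from collections import Counter

def except_all(left, right):
    """Multiset difference: each right-hand row cancels one matching left-hand row."""
    return sorted((Counter(left) - Counter(right)).elements())

def intersect_all(left, right):
    """Multiset intersection: each row appears min(left count, right count) times."""
    return sorted((Counter(left) & Counter(right)).elements())

# Illustrative rows only; duplicates matter for the ALL variants.
left = [1, 1, 1, 2, 3]
right = [1, 2, 2, 4]

print(except_all(left, right))     # [1, 1, 3]
print(intersect_all(left, right))  # [1, 2]
```

An extra or missing row relative to these counts is the kind of divergence the issue describes.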
   
   #### priority:medium
   
   - Spark 4.1 NullType parquet: parquet-rs rejects BOOLEAN + Unknown logical 
type ([#4199](https://github.com/apache/datafusion-comet/issues/4199))
     - Area labels: `area:scan`, `spark 4`
     - Rationale: hard read error on `NullType` parquet columns; functional gap 
with workaround (test currently gated).
   - Audit Spark SQL configs that affect query semantics and ensure Comet 
honors them (or falls back) 
([#4180](https://github.com/apache/datafusion-comet/issues/4180))
     - Area labels: `area:expressions`, `area:scan`
     - Rationale: broad correctness audit (rebase modes, time parser policy, 
etc.); high-leverage but no specific incident.
   - Add support for scalar UDFs that operate on Arrow data 
([#4177](https://github.com/apache/datafusion-comet/issues/4177))
     - Area labels: `area:expressions`
     - Rationale: feature gap; users have row-based UDFs as a workaround.
   - Make JVM-scalar-UDF dispatch responsive to task cancellation 
([#4175](https://github.com/apache/datafusion-comet/issues/4175))
     - Area labels: `area:expressions`
     - Rationale: cancellation responsiveness gap; functional bug with a 
behavioural workaround (tasks still complete).
   - Register CometArrowAllocator as a Spark MemoryConsumer for JVM-UDF 
dispatch ([#4174](https://github.com/apache/datafusion-comet/issues/4174))
     - Area labels: `area:expressions`, `area:ffi`
     - Rationale: off-heap memory invisible to Spark accounting; 
production-safety concern, no immediate breakage.
   - Tighten CometUDF API with input/return type validation at registration 
([#4173](https://github.com/apache/datafusion-comet/issues/4173))
     - Area labels: `area:expressions`
     - Rationale: hard JVM crash on type mismatch; medium because surface is 
currently dev-only and crash is loud.
   - JNI local references accumulate across executor JVM lifetime in native 
call sites ([#4172](https://github.com/apache/datafusion-comet/issues/4172))
     - Area labels: `area:ffi`
     - Rationale: slow leak that may eventually OOM under heavy load; 
functional bug with no immediate user impact.
   - Track parse_url Spark compatibility work 
([#4156](https://github.com/apache/datafusion-comet/issues/4156))
     - Area labels: `area:expressions`
     - Rationale: incompatibility tracker; expression marked `Incompatible` and 
falls back by default, so workaround exists.
   - url_decode: try_url_decode (Spark 4.0) errors instead of returning NULL on 
malformed input 
([#4155](https://github.com/apache/datafusion-comet/issues/4155))
     - Area labels: `area:expressions`, `spark 4`
     - Rationale: visible error divergence (not silent wrong results); 
functional bug with a known fix path.
   - Support expressions already implemented in datafusion-spark crate 
([#4150](https://github.com/apache/datafusion-comet/issues/4150))
     - Area labels: `area:expressions`
     - Rationale: 29 expressions are already native but not wired up; 
functional gap with Spark fallback as workaround.
   - Track Spark 4.2 test failures 
([#4142](https://github.com/apache/datafusion-comet/issues/4142))
     - Area labels: `area:expressions`, `spark 4`
     - Rationale: tracking issue for several 4.2 SQL/test failures (ANSI 
overflow text, INT96 path, Jetty class verifier); functional gaps for the 
upcoming version.
   - Comet native scan returns wrong schema for missing struct fields in 
Parquet ([#4136](https://github.com/apache/datafusion-comet/issues/4136))
     - Area labels: `area:scan`, `spark 4`, `native_datafusion`
     - Rationale: schema divergence (full struct + nulls vs `struct<>`) on new 
SPARK-53535 / SPARK-54220 4.1 tests; functional gap with workaround.
   - Comet native sort lacks row-format support for Struct(Map(...)) sort keys 
([#4123](https://github.com/apache/datafusion-comet/issues/4123))
     - Area labels: `area:expressions`, `spark 4`
     - Rationale: native sort errors on Struct(Map) sort keys; should fall 
back. Workaround: file disabled in 4.1.1 diff.
   - Comet native scan rejects invalid UTF-8 byte sequences in STRING column 
(hll.sql on Spark 4.1) 
([#4121](https://github.com/apache/datafusion-comet/issues/4121))
     - Area labels: `area:scan`, `spark 4`
     - Rationale: visible error on invalid-UTF-8 STRING columns; functional gap 
with workaround (file disabled).
   
   #### priority:low
   
   - macOS aarch64 flake: SIGBUS in _pthread_tsd_cleanup after 
ParquetReadFromFakeHadoopFsSuite 
([#4200](https://github.com/apache/datafusion-comet/issues/4200))
     - Area labels: `area:ci`, `area:scan`
     - Rationale: macOS-only CI flake after a single test, suspected stale TSD 
destructor in hdfs-opendal; matches the guide's "CI flakes" tier.
   - Spark 4.1: native_datafusion bytesRead task metric off by 6-14x vs Spark 
([#4194](https://github.com/apache/datafusion-comet/issues/4194))
     - Area labels: `area:scan`, `spark 4`
     - Rationale: metric reporting divergence in three CometTaskMetricsSuite 
tests; no functional or correctness impact.
   - Set up process for auditing all Spark commits to assess impact on Comet 
([#4188](https://github.com/apache/datafusion-comet/issues/4188))
     - Area labels: none
     - Rationale: process / tooling enhancement; fits the guide's "tooling" 
tier.
   - Add metrics and logging to JVM-scalar-UDF dispatch path 
([#4176](https://github.com/apache/datafusion-comet/issues/4176))
     - Area labels: `area:expressions`
     - Rationale: observability enhancement on a prototype branch; no 
user-visible bug.
   - Pending PRs badge showing failed PRs 
([#4160](https://github.com/apache/datafusion-comet/issues/4160))
     - Area labels: `area:ci`
     - Rationale: dashboard / cosmetic; no user-visible runtime impact.
   - Improve serde framework handling of `StaticInvoke` 
([#4151](https://github.com/apache/datafusion-comet/issues/4151))
     - Area labels: `area:expressions`
     - Rationale: internal refactor; no user-visible bug.
   - AQE DPP SAB wrapping skipped when V2 scan is wrapped in 
`CometSparkToColumnarExec` 
([#4145](https://github.com/apache/datafusion-comet/issues/4145))
     - Area labels: `area:scan`
     - Rationale: real bug in the rule, but no current user path hits it (V2 
Parquet AQE DPP not yet supported in Spark).
   - Upgrade to Spark 4.2.0-preview5 
([#4143](https://github.com/apache/datafusion-comet/issues/4143))
     - Area labels: `spark 4`
     - Rationale: routine version bump; tracking/process work.
   - CachedBatchSerializerNoUnwrapSuite: Comet replaces WholeStageCodegenExec 
([#4137](https://github.com/apache/datafusion-comet/issues/4137))
     - Area labels: `spark sql tests`
     - Rationale: 4.1.1 SQL test failure caused by Comet plan replacement; 
test-only.
   - Add support for Spark 4.2.0-preview4 
([#4113](https://github.com/apache/datafusion-comet/issues/4113))
     - Area labels: `spark 4`
     - Rationale: version bump; likely subsumed by the preview5 issue (#4143).
   
   ### Escalations to consider
   
   - Spark 4.1 critical cluster (#4193 bloom filter, #4192 parquet reader, 
#4124 decimal arithmetic, #4122 EXCEPT/INTERSECT ALL)
     - All four produce silent wrong results, which the guide treats as 
`priority:critical`. They surface only on Spark 4.1.1, which is not yet a 
supported target (#4098 is the umbrella). Once 4.1 enablement lands, these need 
to be fixed before the matrix flips green.
   - Field-ID matching (#4189)
     - Manifests today on the `delta-kernel-phase-1` branch (PR #3932) where 
`nativeDeltaScan` is being added. The existing `nativeDataFusionScan` gate 
already declines field-ID requests, but Delta strips field IDs from 
`requiredSchema`, so the gate misses and produces silent wrong results when the 
user opts into the experimental Delta path. Worth flagging to the delta-kernel 
work.
   - JNI local-ref leak (#4172)
     - Currently `priority:medium`. If a long-running production executor is 
observed hitting JNI OOM tied to local-ref accumulation, escalate to 
`priority:high`.
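
The field-ID concern in #4189 can be illustrated with a small sketch of how by-name and by-field-ID column resolution diverge when names and IDs disagree. Everything here is hypothetical: the dictionaries, column names, IDs, and helper functions are invented for illustration and do not reflect Comet's, Delta's, or Parquet's actual APIs.

```python
# Hypothetical file schema: physical column name -> (field ID, stored values).
file_columns = {
    "col_a": (2, ["x", "y"]),
    "col_b": (1, [10, 20]),
}

# Hypothetical requested schema: logical column name -> field ID.
# The names and IDs deliberately disagree with the file schema above.
requested = {"col_a": 1, "col_b": 2}

def resolve_by_name(requested, file_columns):
    """Match requested columns to file columns by name, ignoring field IDs."""
    return {name: file_columns[name][1] for name in requested if name in file_columns}

def resolve_by_field_id(requested, file_columns):
    """Match requested columns to file columns by field ID, ignoring names."""
    by_id = {fid: values for fid, values in file_columns.values()}
    return {name: by_id.get(fid) for name, fid in requested.items()}

print(resolve_by_name(requested, file_columns))
print(resolve_by_field_id(requested, file_columns))
```

Both resolutions succeed without an error but return different data for the same logical column, which is why the by-name fallback is a silent wrong-results case and not a visible failure.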
   
   ### Skipped — needs more info
   
   - Iceberg reflection failure 
([#4125](https://github.com/apache/datafusion-comet/issues/4125))
     - The report has only the `IcebergReflection` ERROR line and a note that 
"any query" hits it on Comet 0.15 + Iceberg + Spark 3.5.6. Steps to reproduce, 
expected behavior, AWS Glue / Iceberg versions, and whether the query actually 
fails or only logs are all missing. `requires-triage` left in place.
   - Bug triage results: 2026-04-27 
([#4110](https://github.com/apache/datafusion-comet/issues/4110))
     - This is the previous triage summary issue that was auto-tagged with 
`requires-triage` when it was opened. It is not a bug — the human reviewer 
should close it when satisfied with that batch (or remove `requires-triage` 
directly). Not classifying it.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

