andygrove opened a new issue, #3319:
URL: https://github.com/apache/datafusion-comet/issues/3319

   ## Summary
   
   ~7 tests in `DisableUnnecessaryBucketedScanSuite` and `BucketedReadSuite` 
fail because `CometNativeScan` doesn't expose bucketing information.
   
   ## Failing Tests
   
   - `DisableUnnecessaryBucketedScanSuite`: "SPARK-32859: disable unnecessary bucketed table scan" (basic, multiple joins, multiple bucketed columns, other operators)
   - `DisableUnnecessaryBucketedScanSuite`: "SPARK-33075: not disable bucketed table scan for cached query"
   - `DisableUnnecessaryBucketedScanSuite`: "Aggregates with no groupby over tables having 1 BUCKET, return multiple rows"
   - `BucketedReadSuite`: "disable bucketing when the output doesn't contain all bucketing columns"
   - `BucketedReadSuite`: "bucket coalescing is applied when join expressions match with partitioning expressions"
   
   ## Error Pattern
   
   ```
   ArrayBuffer() had length 0 instead of expected length 1 (DisableUnnecessaryBucketedScanSuite.scala:79)
   ```
   
   The tests search the plan for `FileSourceScanExec` nodes to inspect bucketing state. `CometNativeScan` does not match that search, so no scan nodes are found and the length assertion fails.
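   
   The mismatch can be illustrated with a minimal, self-contained sketch. The classes below are toy stand-ins that only borrow the names `FileSourceScanExec` and `CometNativeScan`; they are not the real Spark or Comet classes, and `PlanNode`, `Project`, and `collectNodes` are hypothetical helpers invented for this illustration. The sketch assumes the tests gather scan nodes with a type-based pattern match, which is what the error message suggests.
   
   ```scala
   object BucketedScanLookup {
     // Toy plan model (assumption: NOT the real Spark/Comet operator classes).
     sealed trait PlanNode { def children: Seq[PlanNode] }
     case class FileSourceScanExec(bucketedScan: Boolean) extends PlanNode {
       val children: Seq[PlanNode] = Nil
     }
     case class CometNativeScan(table: String) extends PlanNode {
       val children: Seq[PlanNode] = Nil
     }
     case class Project(child: PlanNode) extends PlanNode {
       val children: Seq[PlanNode] = Seq(child)
     }
   
     // Pre-order traversal that keeps only the nodes the partial function matches.
     def collectNodes[A](plan: PlanNode)(pf: PartialFunction[PlanNode, A]): Seq[A] =
       pf.lift(plan).toList ++ plan.children.flatMap(c => collectNodes(c)(pf))
   
     // Plan as vanilla Spark would produce it: the scan is a FileSourceScanExec.
     val sparkPlan: PlanNode = Project(FileSourceScanExec(bucketedScan = true))
     // Plan after the scan has been replaced by the Comet operator.
     val cometPlan: PlanNode = Project(CometNativeScan("t1"))
   
     // The same type-based lookup is applied to both plans.
     val sparkScans: Seq[FileSourceScanExec] =
       collectNodes(sparkPlan) { case s: FileSourceScanExec => s }
     val cometScans: Seq[FileSourceScanExec] =
       collectNodes(cometPlan) { case s: FileSourceScanExec => s }
   
     def main(args: Array[String]): Unit = {
       println(sparkScans.length) // 1
       println(cometScans.length) // 0: the lookup never sees CometNativeScan
     }
   }
   ```
   
   In this toy model the second lookup returns an empty collection, which mirrors the "had length 0 instead of expected length 1" failure: the scan is still in the plan, but under a type the lookup does not match.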
   
   ## Root Cause
   
   `CometNativeScan` replaces `FileSourceScanExec` in the plan but doesn't 
expose bucketing metadata (bucket count, bucket columns, etc.). Tests that 
inspect or modify bucketing behavior can't find the scan node.
   
   ## Related
   
   Discovered in CI for #3307 (enable native_datafusion in auto scan mode).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

