andygrove opened a new issue, #3319: URL: https://github.com/apache/datafusion-comet/issues/3319
## Summary

~7 tests in `DisableUnnecessaryBucketedScanSuite` and `BucketedReadSuite` fail because `CometNativeScan` doesn't expose bucketing information.

## Failing Tests

- `DisableUnnecessaryBucketedScanSuite`: "SPARK-32859: disable unnecessary bucketed table scan" (basic, multiple joins, multiple bucketed columns, other operators)
- `DisableUnnecessaryBucketedScanSuite`: "SPARK-33075: not disable bucketed table scan for cached query"
- `DisableUnnecessaryBucketedScanSuite`: "Aggregates with no groupby over tables having 1 BUCKET, return multiple rows"
- `BucketedReadSuite`: "disable bucketing when the output doesn't contain all bucketing columns"
- `BucketedReadSuite`: "bucket coalescing is applied when join expressions match with partitioning expressions"

## Error Pattern

```
ArrayBuffer() had length 0 instead of expected length 1 (DisableUnnecessaryBucketedScanSuite.scala:79)
```

Tests look for `FileSourceScanExec` nodes to inspect bucketing state. `CometNativeScan` isn't matched, so no scan nodes are found.

## Root Cause

`CometNativeScan` replaces `FileSourceScanExec` in the plan but doesn't expose bucketing metadata (bucket count, bucket columns, etc.). Tests that inspect or modify bucketing behavior can't find the scan node.

## Related

Discovered in CI for #3307 (enable native_datafusion in auto scan mode).
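## Illustration

The failure mode described under "Error Pattern" and "Root Cause" can be reproduced in miniature. The sketch below is only a model: `PlanNode`, `FileSourceScan`, `NativeScan`, `Project`, and `BucketedScanLookup` are hypothetical stand-ins, not the real Spark or Comet classes, and the lookup is assumed to be a type-based pattern match similar in spirit to what the suites do. It shows why a collection keyed on one scan type comes back empty once that node has been replaced by a different type.

```scala
// Simplified stand-ins for physical plan nodes.
// These are NOT Spark's FileSourceScanExec or Comet's CometNativeScan.
sealed trait PlanNode { def children: Seq[PlanNode] }

case class FileSourceScan(bucketedScan: Boolean) extends PlanNode {
  val children: Seq[PlanNode] = Nil
}

// Hypothetical replacement scan that carries no bucketing metadata.
case class NativeScan() extends PlanNode {
  val children: Seq[PlanNode] = Nil
}

case class Project(child: PlanNode) extends PlanNode {
  val children: Seq[PlanNode] = Seq(child)
}

object BucketedScanLookup {
  // Pre-order traversal collecting every node the partial function matches.
  def collect[T](plan: PlanNode)(pf: PartialFunction[PlanNode, T]): Seq[T] =
    pf.lift(plan).toSeq ++ plan.children.flatMap(c => collect(c)(pf))

  // Assumed shape of the test-side lookup: match only on the original scan type.
  def bucketedScans(plan: PlanNode): Seq[FileSourceScan] =
    collect(plan) { case s: FileSourceScan if s.bucketedScan => s }

  def main(args: Array[String]): Unit = {
    val originalPlan = Project(FileSourceScan(bucketedScan = true))
    val replacedPlan = Project(NativeScan())

    // The original scan type is found.
    println(bucketedScans(originalPlan).length)
    // The replacement type never matches, so the result is empty.
    println(bucketedScans(replacedPlan).length)
  }
}
```

In this model the second lookup returns an empty sequence, which corresponds to the "had length 0 instead of expected length 1" assertion failure. Either the replacement node would need to expose equivalent bucketing state and be matched by the lookup, or the tests would need to recognize the replacement type; which approach fits Comet is not decided here.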
