sesteves opened a new issue, #22413:
URL: https://github.com/apache/datafusion/issues/22413
### Describe the bug
`Statistics::with_fetch` unconditionally returns `Precision::Exact(0)` when
`nr <= skip`, even when the input `num_rows` is `Precision::Inexact`. This
precision promotion causes `AggregateStatistics` to incorrectly fold `COUNT(*)`
to the literal 0 on multi-way joins.
**DataFusion version:** 53.1.0
**The problem**
In `stats.rs:454-464`, both `Exact(nr)` and `Inexact(nr)` fall into the same
match arm:
```rust
Statistics {
    num_rows: Precision::Exact(nr),
    ..
}
| Statistics {
    num_rows: Precision::Inexact(nr),
    ..
} => {
    if nr <= skip {
        Precision::Exact(0) // ← promotes Inexact(0) to Exact(0)
    }
```
When `nr = 0` and `skip = 0`, the condition `0 <= 0` is true and the
function returns `Exact(0)` regardless of whether the input was `Exact` or
`Inexact`.
**How this causes incorrect query results**
On multi-way joins (4+ tables), `estimate_disjoint_inputs` can produce a
false disjoint detection when per-partition column min/max ranges appear
non-overlapping (this depends on the physical partition layout and varies
between runs). The chain:
1. `estimate_disjoint_inputs` detects false disjoint ranges on join key
columns.
2. `estimate_join_statistics` wraps the cardinality as `Inexact(0)` (line
452, always `Inexact`).
3. `HashJoinExec::partition_statistics` calls `stats.with_fetch(self.fetch,
0, 1)`. With `nr=0, skip=0`, `with_fetch` promotes `Inexact(0)` to `Exact(0)`.
4. `Count::value_from_stats` (count.rs:371) requires `Precision::Exact` on
`num_rows` to fold `COUNT(*)`. The false `Exact(0)` matches, and
`AggregateStatistics` replaces the aggregate with `PlaceholderRowExec`,
returning 0.
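Steps 3 and 4 of the chain above can be modelled without DataFusion. The following is a minimal sketch, not the library's code: `Precision` is a simplified, non-generic stand-in for `datafusion_common::stats::Precision`, and `skip_branch_current` and `fold_count_star` are hypothetical helpers that mirror only the behaviour described in this issue.

```rust
/// Simplified stand-in for DataFusion's `Precision<usize>` (assumption: only
/// the three variants matter for this illustration).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Precision {
    Exact(usize),
    Inexact(usize),
    Absent,
}

/// Hypothetical model of step 3: the `nr <= skip` branch as quoted in this
/// issue. All other cases are passed through unchanged here, which the real
/// `with_fetch` does not do.
fn skip_branch_current(num_rows: Precision, skip: usize) -> Precision {
    match num_rows {
        // Both variants share one arm, so an estimate becomes an exact zero.
        Precision::Exact(nr) | Precision::Inexact(nr) if nr <= skip => Precision::Exact(0),
        other => other,
    }
}

/// Hypothetical model of step 4: `COUNT(*)` is folded to a literal only when
/// the row count is exact.
fn fold_count_star(num_rows: Precision) -> Option<usize> {
    match num_rows {
        Precision::Exact(n) => Some(n),
        Precision::Inexact(_) | Precision::Absent => None,
    }
}
```

In this model, `fold_count_star(Precision::Inexact(0))` declines to fold, while `fold_count_star(skip_branch_current(Precision::Inexact(0), 0))` folds to `Some(0)`, which corresponds to the wrong answer described above.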
The bug is **flaky** because it depends on the partition layout producing
non-overlapping min/max ranges for join keys in the merged partition
statistics. Different data distributions, partition counts, or memory
constraints change the layout and may or may not trigger the false disjoint
detection.
**Example: TPC-H 4-table join**
The following query over TPC-H data can return `cnt = 0` instead of the
correct count (~2.4M at SF150) when partition statistics happen to
produce disjoint min/max ranges on the join keys:
```sql
SELECT COUNT(*) AS cnt
FROM part, partsupp, supplier, nation
WHERE p_partkey = ps_partkey
AND s_suppkey = ps_suppkey
AND s_nationkey = n_nationkey
AND p_size = 15;
```
The conditions that trigger the bug:
- 4+ tables joined via inner joins (deeper join trees propagate column stats
through more layers).
- Leaf tables with `Exact` column min/max stats (common with Parquet file
metadata).
- A selective filter (`p_size = 15`) that narrows the apparent min/max range
of join keys in intermediate join outputs.
- Multiple partitions whose merged statistics can appear disjoint.
The query above is the join skeleton of TPC-H Q2. Because the bug is flaky,
it may not reproduce on every run; reducing the memory pool size (which
forces different partition layouts) or increasing the number of partitions
raises the probability of hitting it.
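The "appear disjoint" condition listed above can be pictured as an interval test on min/max values. This is a simplified illustration under stated assumptions, not the body of `estimate_disjoint_inputs`: `KeyRange`, `range_of`, and `ranges_look_disjoint` are hypothetical names, and the scenario assumes the min/max that reaches the check describes only part of one input.

```rust
/// Hypothetical summary of a join-key column: the smallest and largest value
/// that the statistics claim for one join input.
#[derive(Debug, Clone, Copy)]
struct KeyRange {
    min: i64,
    max: i64,
}

/// Computes the min/max range of a set of keys, or `None` if it is empty.
fn range_of(keys: &[i64]) -> Option<KeyRange> {
    let min = *keys.iter().min()?;
    let max = *keys.iter().max()?;
    Some(KeyRange { min, max })
}

/// Returns true when the two claimed ranges cannot share a value.
fn ranges_look_disjoint(left: KeyRange, right: KeyRange) -> bool {
    left.max < right.min || right.max < left.min
}
```

For example, if the left input holds keys `[1, 2, 3]` in one partition and `[50, 60]` in another, and the right input holds `[50, 70]`, a range built from the first left partition alone (`1..=3`) looks disjoint from the right range (`50..=70`), even though the full left input does share the key `50` with the right side.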
### To Reproduce
The simplest deterministic reproduction is a unit test against
`Statistics::with_fetch`:
```rust
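// Imports assume the test lives outside `datafusion_common`;
// inside `stats.rs` itself, `use super::*;` would be used instead.
use datafusion_common::stats::Precision;
use datafusion_common::Statistics;

#[test]
fn test_with_fetch_preserves_inexact_precision() {
    let stats = Statistics {
        num_rows: Precision::Inexact(0),
        total_byte_size: Precision::Absent,
        column_statistics: vec![],
    };
    // fetch=None, skip=0, n_partitions=1
    let result = stats.with_fetch(None, 0, 1).unwrap();
    // Should preserve Inexact, not promote to Exact
    assert_eq!(result.num_rows, Precision::Inexact(0));
}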
```
This test currently fails:
```
assertion `left == right` failed
left: Exact(0)
right: Inexact(0)
```
### Expected behavior
When `nr <= skip` and the input `num_rows` is `Inexact`, `with_fetch` should
return `Inexact(0)` (preserving the precision level), not `Exact(0)`.
### Suggested fix
Preserve the precision when `nr <= skip`:
```rust
if nr <= skip {
    if self.num_rows.is_exact().unwrap_or(false) {
        Precision::Exact(0)
    } else {
        Precision::Inexact(0)
    }
}
```
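The same idea can also be written by matching on the variant directly, which avoids the second lookup on `self.num_rows`. This is only a standalone sketch for checking the intended behaviour: `Precision` is a simplified, non-generic stand-in for the DataFusion type, and `skip_branch_preserving` is a hypothetical helper that covers just the `nr <= skip` branch.

```rust
/// Simplified stand-in for DataFusion's `Precision<usize>` (assumption: only
/// the three variants matter for this illustration).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Precision {
    Exact(usize),
    Inexact(usize),
    Absent,
}

/// Hypothetical helper covering only the `nr <= skip` branch. Every other
/// case is returned unchanged here, unlike the real `with_fetch`.
fn skip_branch_preserving(num_rows: Precision, skip: usize) -> Precision {
    match num_rows {
        // An exact input that is fully skipped yields an exact zero.
        Precision::Exact(nr) if nr <= skip => Precision::Exact(0),
        // An estimated input stays an estimate, even when it is zero.
        Precision::Inexact(nr) if nr <= skip => Precision::Inexact(0),
        other => other,
    }
}
```

With separate arms, the output variant follows from the input variant by construction, so a later change to the branch cannot silently reintroduce the promotion.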
### Additional context
- Related: #20388 (`FilterExec` converts Absent stats to `Exact(NULL)`,
another false-precision-promotion bug in the same area).
- I hit this while debugging a flaky `COUNT(*) = 0` regression on a 4-table
inner join over TPC-H SF150 data with DataFusion 53.1.0.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]