andygrove opened a new pull request, #20452:
URL: https://github.com/apache/datafusion/pull/20452
## Summary
- Replace per-row runtime `DataType` matching in `is_join_arrays_equal()`
and `compare_join_arrays()` with a `JoinComparator` struct that resolves typed
comparison function pointers once during `SortMergeJoinStream` construction
- Eliminates the overhead of matching on 20+ `DataType` variants for every
row comparison in the merge loop
- The `JoinComparator` provides two methods:
  - `compare()`: for merge-loop ordering decisions (streamed vs buffered advance)
  - `is_equal()`: for buffered batch key-group expansion
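
For readers unfamiliar with the pattern, here is a minimal, dependency-free sketch of resolving the comparison functions once instead of matching on the type for every row. It is not the code in this PR: `KeyType` is a hypothetical stand-in for Arrow's `DataType`, columns are plain `Vec`s behind `dyn Any` instead of Arrow arrays, and the field and helper names (`column_fns`, `compare_i64`, `compare_utf8`) are made up for illustration. Only the `JoinComparator` name and its `compare()` / `is_equal()` methods come from the summary above, and their signatures here are assumed.

```rust
use std::any::Any;
use std::cmp::Ordering;

/// Hypothetical stand-in for Arrow's `DataType` (the real enum has many more variants).
#[derive(Clone, Copy)]
enum KeyType {
    Int64,
    Utf8,
}

/// Compares row `l` of the left column with row `r` of the right column.
type CompareFn = fn(&dyn Any, usize, &dyn Any, usize) -> Ordering;

// Typed comparison for Int64 keys: a cheap downcast, no dispatch over all types.
fn compare_i64(left: &dyn Any, l: usize, right: &dyn Any, r: usize) -> Ordering {
    let a = left.downcast_ref::<Vec<i64>>().expect("Int64 column");
    let b = right.downcast_ref::<Vec<i64>>().expect("Int64 column");
    a[l].cmp(&b[r])
}

// Typed comparison for Utf8 keys.
fn compare_utf8(left: &dyn Any, l: usize, right: &dyn Any, r: usize) -> Ordering {
    let a = left.downcast_ref::<Vec<String>>().expect("Utf8 column");
    let b = right.downcast_ref::<Vec<String>>().expect("Utf8 column");
    a[l].cmp(&b[r])
}

/// Holds one pre-resolved comparison function per join key column.
struct JoinComparator {
    column_fns: Vec<CompareFn>,
}

impl JoinComparator {
    /// The match on the key type happens here, once per column, at construction.
    fn new(key_types: &[KeyType]) -> Self {
        let column_fns = key_types
            .iter()
            .map(|t| match t {
                KeyType::Int64 => compare_i64 as CompareFn,
                KeyType::Utf8 => compare_utf8 as CompareFn,
            })
            .collect();
        Self { column_fns }
    }

    /// Lexicographic comparison across key columns, used for ordering decisions.
    fn compare(
        &self,
        left: &[Box<dyn Any>],
        l: usize,
        right: &[Box<dyn Any>],
        r: usize,
    ) -> Ordering {
        for (i, f) in self.column_fns.iter().enumerate() {
            let ord = f(&*left[i], l, &*right[i], r);
            if ord != Ordering::Equal {
                return ord;
            }
        }
        Ordering::Equal
    }

    /// Equality check across all key columns.
    fn is_equal(
        &self,
        left: &[Box<dyn Any>],
        l: usize,
        right: &[Box<dyn Any>],
        r: usize,
    ) -> bool {
        self.compare(left, l, right, r) == Ordering::Equal
    }
}
```

In this sketch the per-row path is an indirect call plus a downcast per key column; the match over type variants runs only in `new()`. Null handling and sort options, which a real sort-merge join must respect, are deliberately left out.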
## Benchmark Results
Best of 3 iterations across 20 SMJ benchmark queries (`cargo run --release
-p datafusion-benchmarks --bin dfbench -- smj`):
| Query | Description | Baseline (ms) | Optimized (ms) | Change |
|-------|-------------|---------------|----------------|--------|
| Q1 | INNER 100K×100K, 1:1 | 3.16 | 2.22 | **-29.9%** |
| Q2 | INNER 100K×1M, 1:10 | 10.41 | 10.02 | -3.8% |
| Q3 | INNER 1M×1M, 1:100 | 53.06 | 55.55 | +4.7% |
| Q4 | INNER 100K×1M, 1:10, 1% filter | 3.33 | 3.20 | -4.2% |
| Q5 | INNER 1M×1M, 1:100, 10% filter | 11.41 | 11.85 | +3.9% |
| Q6 | LEFT 100K×1M, 1:10 | 10.10 | 9.97 | -1.3% |
| Q7 | LEFT 100K×1M, 1:10, 50% filter | 11.75 | 11.91 | +1.4% |
| Q8 | FULL 100K×100K, 1:10 | 2.53 | 2.52 | -0.4% |
| Q9 | FULL 100K×1M, 1:10, 10% filter | 11.32 | 11.02 | -2.7% |
| Q10 | LEFT SEMI 100K×1M, 1:10 | 4.42 | 4.35 | -1.6% |
| Q11 | LEFT SEMI 100K×1M, 1:10, 1% filter | 3.97 | 3.99 | +0.5% |
| Q12 | LEFT SEMI 100K×1M, 1:10, 50% filter | 59.28 | 59.15 | -0.2% |
| Q13 | LEFT SEMI 100K×1M, 1:10, 90% filter | 4.67 | 4.50 | -3.6% |
| Q14 | LEFT ANTI 100K×1M, 1:10 | 4.40 | 4.34 | -1.6% |
| Q15 | LEFT ANTI 100K×1M, 1:10, partial | 4.42 | 4.36 | -1.4% |
| Q16 | LEFT ANTI 100K×100K, 1:1, stress | 2.14 | 2.21 | +3.3% |
| Q17 | INNER 100K×5M, 1:50, 5% filter | 8.86 | 7.75 | **-12.5%** |
| Q18 | LEFT SEMI 100K×5M, 1:50, 2% filter | 8.07 | 8.03 | -0.5% |
| Q19 | LEFT ANTI 100K×5M, 1:50, partial | 19.52 | 18.88 | -3.3% |
| Q20 | INNER 1M×10M, 1:100 + GROUP BY | 533.16 | 559.54 | +4.9% |
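The biggest wins are on comparison-dominated workloads (Q1, a 1:1 join, at -29.9%; Q17, a
filtered 1:50 join, at -12.5%). High-cardinality joins (Q3, Q5, Q20), where output
construction dominates, show no significant change: they move by +3.9% to +4.9%,
which is treated here as run-to-run noise.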
## Test plan
- [x] All 48 `sort_merge_join` unit tests pass
- [x] `cargo fmt` clean
- [x] `cargo clippy` clean (zero warnings)
- [x] Benchmark comparison shows no regressions beyond noise
🤖 Generated with [Claude Code](https://claude.com/claude-code)