[PR] Remove O(n log n) sorting/allocation in HashJoin dynamic filter accumulator by pre-indexing bounds per partition [datafusion]

via GitHub Fri, 22 Aug 2025 06:12:25 -0700


kosiew opened a new pull request, #17286:
URL: https://github.com/apache/datafusion/pull/17286


   
   ## Which issue does this PR close?
   
   * Closes #17280.
   
   ## Rationale for this change
   
   The accumulator previously collected build-side partition bounds and then 
**sorted** them with `sorted_by_key`, which:
   
   * Introduced **extra allocations** and
   * Added **O(n log n)** overhead in the number of completed partitions.
   
   Since partitions already have stable IDs, we can **pre-index** bounds by 
partition ID and avoid sorting entirely. This makes dynamic filter construction 
**O(n)** with fewer allocations, improves predictability, and eliminates a 
source of nondeterminism tied to completion order.
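
To illustrate the difference, here is a minimal sketch that is not taken from the PR. `Bounds`, `order_by_sorting`, and `order_by_indexing` are hypothetical names, and the bounds are simplified to a single `(min, max)` pair of integers. The sketch assumes every partition ID is below the partition count and that each partition reports once.

```rust
/// Hypothetical stand-in for one partition's bounds: (min, max) of a single join key.
type Bounds = (i64, i64);

/// Sort-based approach: copy the (partition_id, bounds) reports, which arrive in
/// completion order, then sort them by partition ID. O(n log n) plus an extra copy.
fn order_by_sorting(reports: &[(usize, Bounds)]) -> Vec<Bounds> {
    let mut pairs = reports.to_vec();
    pairs.sort_by_key(|(partition_id, _)| *partition_id);
    pairs.into_iter().map(|(_, bounds)| bounds).collect()
}

/// Pre-indexed approach: write each report directly into the slot for its
/// partition ID. O(n) overall, with one up-front allocation and no sort.
/// Assumption: every partition_id is < num_partitions (indexing panics otherwise).
fn order_by_indexing(reports: &[(usize, Bounds)], num_partitions: usize) -> Vec<Option<Bounds>> {
    let mut slots = vec![None; num_partitions];
    for &(partition_id, bounds) in reports {
        slots[partition_id] = Some(bounds);
    }
    slots
}
```

Both functions yield the bounds in partition-ID order regardless of the order in which reports arrive; only the second avoids the sort.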
   
   ## What changes are included in this PR?
   
   * Replaced `PartitionBounds` + `sorted_by_key` with a **preallocated 
`Vec<Option<Vec<ColumnBounds>>>`** indexed by partition ID.
   * Eliminated sorting and the dependency on `itertools`, reducing allocations 
and algorithmic overhead.
   * Updated accumulator logic to:
   
     * **Insert bounds in O(1)** at the correct index (by partition ID).
     * Validate out-of-range partition IDs and return a clear internal error 
instead of panicking.
     * Build the dynamic filter once **all partitions have reported**, skipping 
any partition whose slot is still empty.
   * Adjusted `create_filter_from_partition_bounds` to iterate the fixed-index 
vector and construct predicates without any intermediate sorting/allocation.
   * Made determinism an explicit by-product: completion order no longer 
affects the resulting predicate.
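
The accumulator behaviour described above can be sketched roughly as follows. This is a simplified illustration, not the PR's code: `BoundsAccumulator`, `report`, and `predicate` are hypothetical names, `ColumnBounds` is reduced to integer `min`/`max` fields, the error is a plain `String` rather than a DataFusion internal error, and the predicate shape (an `OR` across partitions of per-column range checks) is an assumption made for the example.

```rust
/// Hypothetical, simplified column bounds for this sketch.
#[derive(Debug, Clone, PartialEq)]
struct ColumnBounds {
    min: i64,
    max: i64,
}

/// Hypothetical accumulator: one preallocated slot per build-side partition.
struct BoundsAccumulator {
    slots: Vec<Option<Vec<ColumnBounds>>>,
    reported: usize,
}

impl BoundsAccumulator {
    fn new(num_partitions: usize) -> Self {
        Self { slots: vec![None; num_partitions], reported: 0 }
    }

    /// Stores `bounds` in O(1) at the slot for `partition_id`.
    /// Returns Ok(true) once every partition has reported, and an error
    /// (instead of panicking) when the ID is out of range.
    fn report(&mut self, partition_id: usize, bounds: Vec<ColumnBounds>) -> Result<bool, String> {
        let total = self.slots.len();
        let slot = self
            .slots
            .get_mut(partition_id)
            .ok_or_else(|| format!("partition ID {} out of range (expected < {})", partition_id, total))?;
        if slot.is_none() {
            self.reported += 1;
        }
        *slot = Some(bounds);
        Ok(self.reported == total)
    }

    /// Renders a predicate string by walking the slots in index order,
    /// skipping empty ones. Columns are named c0, c1, ... for illustration.
    fn predicate(&self) -> String {
        self.slots
            .iter()
            .flatten()
            .map(|cols| {
                let ranges: Vec<String> = cols
                    .iter()
                    .enumerate()
                    .map(|(i, b)| format!("c{} >= {} AND c{} <= {}", i, b.min, i, b.max))
                    .collect();
                format!("({})", ranges.join(" AND "))
            })
            .collect::<Vec<_>>()
            .join(" OR ")
    }
}
```

Because the slots are always walked in index order, reporting partition 1 before partition 0 produces the same predicate string as reporting them in ascending order.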
   
   ## Are these changes tested?
   
   Yes.
   
   * Added an async test `test_hashjoin_dynamic_filter_pushdown_out_of_order` 
that intentionally reverses the completion order of build-side partitions across 
runs and asserts that the resulting dynamic filter predicate string is identical, 
demonstrating order independence while validating the logic.
   * Existing join and dynamic filter tests continue to pass.
   
   ## Are there any user-facing changes?
   
   No API-breaking changes.
   
   * Internals of dynamic filter construction were optimized for efficiency and 
determinism.
   * Query semantics remain unchanged, but performance improves due to reduced 
allocations and removal of sorting overhead.
   
   ---
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

