kosiew opened a new pull request, #20541:
URL: https://github.com/apache/datafusion/pull/20541

   ## Which issue does this PR close?
   
   * This implements the `ceil` part of #20197.
   
   ---
   
   ## Rationale for this change
   
   DataFusion’s preimage framework can turn predicates on deterministic 
functions into equivalent predicates on the underlying column(s). For 
`ceil(x)`, the mathematical preimage for a target integer value `N` is the 
interval **(N − 1, N]**.
   
   Without a preimage implementation, filters such as `WHERE ceil(col) = 6` 
must evaluate `ceil` for every row, which can inhibit predicate pushdown and 
other optimizer wins.
   
   This PR implements `ceil`’s preimage to enable rewriting comparisons into 
simple range predicates (with careful handling of floating-point representation 
boundaries and decimals), improving the optimizer’s ability to push filters 
down to scans and reduce work during execution.
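
To make the float case concrete, here is a minimal standalone sketch of the range computation described above. It is not the code from this PR: `next_up_f64`, `ceil_preimage_f64`, and `matches_via_range` are hypothetical names, the `Option<(f64, f64)>` return type stands in for DataFusion's preimage result, and `next_up_f64` is a hand-rolled stand-in for the standard library's `f64::next_up`.

```rust
/// Smallest f64 strictly greater than `x` (hand-rolled stand-in for
/// `f64::next_up`; NaN and +Inf are returned unchanged).
fn next_up_f64(x: f64) -> f64 {
    if x.is_nan() || x == f64::INFINITY {
        return x;
    }
    if x == 0.0 {
        // Covers both +0.0 and -0.0: step to the smallest positive subnormal.
        return f64::from_bits(1);
    }
    let bits = x.to_bits();
    if x > 0.0 {
        f64::from_bits(bits + 1)
    } else {
        // For negative values, a smaller bit pattern is closer to zero.
        f64::from_bits(bits - 1)
    }
}

/// Hypothetical helper: returns `Some((lo, hi))` such that
/// `ceil(x) == n` is equivalent to `lo <= x && x < hi`,
/// or `None` when no safe rewrite exists.
fn ceil_preimage_f64(n: f64) -> Option<(f64, f64)> {
    // NaN, +/-Inf, and non-integer targets are rejected.
    if !n.is_finite() || n.fract() != 0.0 {
        return None;
    }
    let below = n - 1.0;
    // At large magnitudes `n - 1` can round back to `n`; skip the rewrite.
    if below == n {
        return None;
    }
    // (n-1, n] expressed as the half-open range [next_up(n-1), next_up(n)).
    Some((next_up_f64(below), next_up_f64(n)))
}

/// Evaluates `ceil(x) == n` through the range instead of calling `ceil`.
fn matches_via_range(x: f64, n: f64) -> Option<bool> {
    ceil_preimage_f64(n).map(|(lo, hi)| lo <= x && x < hi)
}
```

Under these assumptions, `WHERE ceil(col) = 6` corresponds to a range test of the form `col >= next_up(5.0) AND col < next_up(6.0)`, which a scan can evaluate without computing `ceil` per row.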
   
   ---
   
   ## What changes are included in this PR?
   
   * Implemented `ScalarUDFImpl::preimage` for the `ceil` scalar function.
   
     * Computes the preimage range for `ceil(x) = N` as a half-open interval 
`[lower, upper)` suitable for the `Interval` framework.
     * Uses `next_up` for floating-point bounds: the strict lower bound of 
`(N-1, N]` is represented safely as `x >= next_up(N-1)`, and the inclusive upper 
bound `x <= N` becomes `x < next_up(N)`.
     * Rejects non-integer literals (no solutions) and non-finite float 
literals (NaN/±Inf).
     * Avoids unsafe rewrites when `N - 1` collapses to `N` due to float 
spacing (e.g., above `2^53` for `f64`, above `2^24` for `f32`).
   * Added decimal preimage support for `Decimal32/64/128/256`.
   
     * Validates that the literal has no fractional part at the declared scale.
     * Computes bounds using the decimal unit at the target scale (step = 
`10^-scale`) to represent `(N-1, N]` as `[N-1+step, N+step)`.
     * Handles scale 0 (integer decimals) as `[N, N+1)`.
   * Added unit tests covering:
   
     * Valid ranges for floats (positive/negative/zero), integers, and decimals.
     * Non-integer literals returning `PreimageResult::None`.
     * Overflow and float boundary conditions.
     * NULL literals.
   * Added a new SQLLogicTest file `ceil_preimage.slt`.
   
     * Verifies correctness of results for representative types (Float64, Int32 
via coercion, Decimal).
     * Verifies optimizer rewrites via `EXPLAIN` for `=`, `IN`, `IS [NOT] 
DISTINCT FROM`, and boundary cases.
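
The decimal bounds listed above can be illustrated with a small sketch that works on the unscaled integer representation of a decimal (value = `unscaled / 10^scale`). This is an assumption-laden illustration rather than the PR's implementation: `ceil_preimage_decimal` is a hypothetical name, only an `i128` backing value is modeled, and the declared precision is not checked.

```rust
/// Hypothetical helper: for a decimal literal `N` stored as `unscaled` at
/// `scale`, returns the unscaled bounds `(lo, hi)` of the half-open range
/// `[N - 1 + step, N + step)`, where `step = 10^-scale` is one unscaled unit.
/// Returns `None` if `N` has a fractional part or the arithmetic overflows.
fn ceil_preimage_decimal(unscaled: i128, scale: u32) -> Option<(i128, i128)> {
    // The value 1 expressed at this scale, e.g. 100 for scale 2.
    let one = 10i128.checked_pow(scale)?;
    // ceil(x) is always integral, so a fractional target has no solutions.
    if unscaled % one != 0 {
        return None;
    }
    // Lower bound: N - 1 + step (the strict bound N - 1 moved up one unit).
    let lo = unscaled.checked_sub(one)?.checked_add(1)?;
    // Upper bound: N + step (the inclusive bound N made exclusive).
    let hi = unscaled.checked_add(1)?;
    Some((lo, hi))
}
```

For example, the literal `6.00` at scale 2 has unscaled value `600` and yields the unscaled bounds `(501, 601)`, that is `[5.01, 6.01)`. At scale 0 the same formula reduces to `[N, N+1)`, consistent with the integer-decimal case above.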
   
   ---
   
   ## Are these changes tested?
   
   Yes.
   
   * **Rust unit tests** added in `datafusion/functions/src/math/ceil.rs` 
validate:
   
     * Correct range generation for supported scalar types.
     * Correct rejection of non-integer / non-finite float literals.
     * Overflow and precision boundary handling.
     * Decimal scale/precision behavior and NULL handling.
   * **SQLLogicTest** added in 
`datafusion/sqllogictest/test_files/ceil_preimage.slt` validates:
   
     * Query result correctness for rewritten predicates.
     * Logical plan rewrites using `EXPLAIN` (including float `next_up` bounds 
and decimal bounds).
   
   ---
   
   ## Are there any user-facing changes?
   
   No user-visible behavior changes are intended. The semantics of `ceil` are 
unchanged.
   
   This is an optimizer improvement that may:
   
   * Produce different (but equivalent) logical plans when predicates involve 
`ceil`.
   * Improve performance for queries that filter on `ceil(col)` by enabling 
range filtering and better predicate pushdown.
   
   No documentation updates are required.
   
   ---
   
   ## LLM-generated code disclosure
   
   This PR includes LLM-generated code and comments. All LLM-generated content 
has been manually reviewed and tested.
   

