[ 
https://issues.apache.org/jira/browse/CALCITE-7204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18023527#comment-18023527
 ] 

Alessandro Solimando commented on CALCITE-7204:
-----------------------------------------------

I was almost forgetting that _RexSimplify_ is used by _RelBuilder_ too, we 
might want to have an option (like 
[this|https://github.com/apache/calcite/blob/83882017b188222a06916145b0f4867b6a7e2b1f/core/src/main/java/org/apache/calcite/tools/RelBuilder.java#L5268]
 one for LIMIT 0) to avoid simplifying lossless casts, to be passed to the 
extended constructor of _RexSimplify_ proposed above{_}?{_}

> Add support for lossless cast detection for DATETIME types
> ----------------------------------------------------------
>
>                 Key: CALCITE-7204
>                 URL: https://issues.apache.org/jira/browse/CALCITE-7204
>             Project: Calcite
>          Issue Type: Improvement
>          Components: core
>    Affects Versions: 1.40.0
>            Reporter: Alessandro Solimando
>            Priority: Major
>
> _RexUtil.isLosslessCast_ doesn't currently support date/time types at all and 
> defaults to considering casts always lossy, leading to missed opportunities 
> and potential suboptimal planning.
> The current ticket aims at adding support for the DATETIME family types.
> A proposal of what to handle, which should be re-verified precisely by the 
> implementer:
>  * *TIME(p) -> TIME(p')*
> lossless iff p' >= p (widening fractional-second precision)
> (Reverse is not guaranteed: narrowing can round away sub-second units)
>  * *TIMESTAMP(p) → TIMESTAMP(p')* (without time zone): lossless iff p' >= p
>  * *DATE → TIMESTAMP(p)* (without time zone)
> lossless (round-trip {{DATE -> TIMESTAMP -> DATE}} always recovers the 
> original date, the first cast adds padding like {{00:00:00[.000…]}} which can 
> be truncated in the second cast)
>  * *TIME(p) → TIMESTAMP(p')* (without time zone)
> lossless iff p' >= p (round-trip {{TIME -> TIMESTAMP -> TIME}} preserves the 
> time component, the synthetic date part is discarded on the second cast)
>  * *TIMESTAMP WITH LOCAL TIME ZONE (TSLTZ)*
>  ** {*}TSLTZ(p) -> TSLTZ(p'){*}: lossless iff p' >= p
>  ** {*}TSLTZ <=> TIMESTAMP (without TZ){*}: *conservatively not lossless* 
> (semantics differ: instant vs local wall-time, the DST/offset transitions can 
> change wall-time on round-trip)
>  * Optional / out-of-scope (separate ticket/tickets):
>  ** *DATE/TIME/TIMESTAMP <=> CHARACTER*
>  ** *INTERVAL* types:
>  *** YEAR-MONTH intervals: widening fields/precision is lossless
>  *** DAY-SECOND intervals: widening fractional-second precision and/or field 
> range is lossless
> Type precision: for integers types we had surprising effects (see 
> [here|https://github.com/apache/calcite/pull/4557#discussion_r2379703479]), 
> take special care in its handling and verify precisely assumptions, in doubt 
> be conservative as it's critical for the method to not return false positives 
> as it immediately affects correctness.
> Tests and impact:
>  * The newly supported cases must (at least) be covered appropriately in 
> _RexLosslessCastTest_ (positive and negative tests, see CALCITE-7174 for an 
> example)
>  * When existing plans change due to further simplifications/rule firing, 
> group changes by "patterns" and a justification for non-trivial cases
> Note: if implementing all this at once is too much, we can break it into 
> multiple tickets (for instance, TZ-aware cases can become a separate ticket, 
> in case it's fine to detect them and return false for now)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to