[ 
https://issues.apache.org/jira/browse/CALCITE-7204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18023342#comment-18023342
 ] 

Mihai Budiu commented on CALCITE-7204:
--------------------------------------

To make it clearer: in our implementation, TIME = TIME(9) and TIMESTAMP = 
TIMESTAMP(3).

So going from TIME to TIMESTAMP and back is actually lossy (no matter which 
scale is used for these types).
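
A minimal sketch of that round trip, using java.time values as stand-ins 
for the SQL types. The class and method names are hypothetical, and two 
things are assumed rather than taken from any implementation: the cast to 
TIMESTAMP(3) truncates (it could equally round), and the synthetic date is 
1970-01-01.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.time.temporal.ChronoUnit;

final class TimeRoundTrip {
  /** Models CAST(t AS TIMESTAMP(3)): attach a synthetic date, keep milliseconds only. */
  static LocalDateTime toTimestamp3(LocalTime t) {
    // Assumption: truncation, and 1970-01-01 as the synthetic date part.
    return LocalDateTime.of(LocalDate.of(1970, 1, 1), t.truncatedTo(ChronoUnit.MILLIS));
  }

  /** Models CAST(ts AS TIME(9)): drop the date part, keep whatever time is stored. */
  static LocalTime toTime9(LocalDateTime ts) {
    return ts.toLocalTime();
  }

  public static void main(String[] args) {
    // A TIME(9) value that uses all nine fractional digits.
    LocalTime original = LocalTime.of(12, 34, 56, 123_456_789);
    LocalTime roundTripped = toTime9(toTimestamp3(original));
    System.out.println(original + " -> " + roundTripped);
    System.out.println("lossless: " + original.equals(roundTripped));
  }
}
```

Any value with non-zero digits beyond the third fractional place fails to 
survive the trip, which is why the default TIME -> TIMESTAMP cast cannot be 
reported as lossless here.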

> Add support for lossless cast detection for DATETIME types
> ----------------------------------------------------------
>
>                 Key: CALCITE-7204
>                 URL: https://issues.apache.org/jira/browse/CALCITE-7204
>             Project: Calcite
>          Issue Type: Improvement
>          Components: core
>    Affects Versions: 1.40.0
>            Reporter: Alessandro Solimando
>            Priority: Major
>
> _RexUtil.isLosslessCast_ currently doesn't support date/time types at all and 
> defaults to treating such casts as always lossy, leading to missed 
> opportunities and potentially suboptimal planning.
> This ticket aims to add support for the DATETIME family of types.
> A proposal of what to handle, which should be re-verified precisely by the 
> implementer:
>  * *TIME(p) → TIME(p')*
> lossless iff p' >= p (widening fractional-second precision)
> (The reverse is not guaranteed: narrowing can round away sub-second digits)
>  * *TIMESTAMP(p) → TIMESTAMP(p')* (without time zone): lossless iff p' >= p
>  * *DATE → TIMESTAMP(p)* (without time zone)
> lossless (round-trip {{DATE -> TIMESTAMP -> DATE}} always recovers the 
> original date; the first cast adds padding like {{00:00:00[.000…]}}, which 
> the second cast truncates)
>  * *TIME(p) → TIMESTAMP(p')* (without time zone)
> lossless iff p' >= p (round-trip {{TIME -> TIMESTAMP -> TIME}} preserves the 
> time component; the synthetic date part is discarded by the second cast)
>  * *TIMESTAMP WITH LOCAL TIME ZONE (TSLTZ)*
>  ** {*}TSLTZ(p) -> TSLTZ(p'){*}: lossless iff p' >= p
>  ** {*}TSLTZ <=> TIMESTAMP (without TZ){*}: *conservatively not lossless* 
> (the semantics differ: instant vs. local wall-time; DST/offset transitions 
> can change the wall-time on a round-trip)
>  * Optional / out-of-scope (separate ticket/tickets):
>  ** *DATE/TIME/TIMESTAMP <=> CHARACTER*
>  ** *INTERVAL* types:
>  *** YEAR-MONTH intervals: widening fields/precision is lossless
>  *** DAY-SECOND intervals: widening fractional-second precision and/or field 
> range is lossless
> Type precision: for integer types we saw surprising effects (see 
> [here|https://github.com/apache/calcite/pull/4557#discussion_r2379703479]). 
> Take special care in handling precision and verify every assumption 
> precisely. When in doubt, be conservative: it is critical that the method 
> never returns false positives, as they immediately affect correctness.
> Tests and impact:
>  * The newly supported cases must (at least) be covered appropriately in 
> _RexLosslessCastTest_ (positive and negative tests, see CALCITE-7174 for an 
> example)
>  * When existing plans change due to further simplifications/rule firing, 
> group the changes by "patterns" and provide a justification for non-trivial 
> cases
> Note: if implementing all this at once is too much, we can break it into 
> multiple tickets (for instance, the TZ-aware cases can become a separate 
> ticket, provided it's fine to detect them and return false for now)
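
The in-scope rules from the proposal quoted above can be sketched as a 
standalone predicate. This is only an illustration under stated assumptions: 
the class, enum, and method names are hypothetical and are not Calcite's 
RexUtil API, precision is passed as a plain int (0 for DATE), and every 
pair the proposal does not list as lossless is conservatively rejected.

```java
final class DatetimeLosslessSketch {
  /** Hypothetical stand-in for the DATETIME family type names. */
  enum Kind { DATE, TIME, TIMESTAMP, TIMESTAMP_LTZ }

  /**
   * Returns true only for the casts the proposal lists as lossless.
   * Precision means fractional-second digits; it is ignored for DATE.
   */
  static boolean isLossless(Kind from, int fromPrecision, Kind to, int toPrecision) {
    if (from == to) {
      // Same kind: DATE is trivially lossless, otherwise precision must not shrink.
      return from == Kind.DATE || toPrecision >= fromPrecision;
    }
    if (from == Kind.DATE && to == Kind.TIMESTAMP) {
      // DATE -> TIMESTAMP(p) only pads a zero time-of-day.
      return true;
    }
    if (from == Kind.TIME && to == Kind.TIMESTAMP) {
      // TIME(p) -> TIMESTAMP(p') keeps the time component iff p' >= p.
      return toPrecision >= fromPrecision;
    }
    // Everything else, including TSLTZ <=> TIMESTAMP, is treated as lossy.
    return false;
  }

  public static void main(String[] args) {
    System.out.println(isLossless(Kind.TIME, 3, Kind.TIME, 6));
    System.out.println(isLossless(Kind.TIME, 9, Kind.TIMESTAMP, 3));
    System.out.println(isLossless(Kind.DATE, 0, Kind.TIMESTAMP, 0));
    System.out.println(isLossless(Kind.TIMESTAMP_LTZ, 3, Kind.TIMESTAMP, 3));
  }
}
```

Defaulting to false for unlisted pairs follows the requirement that the 
method must never return a false positive; a missed optimization is the 
cheaper error.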



--
This message was sent by Atlassian Jira
(v8.20.10#820010)