Max Gekk created SPARK-57303:
--------------------------------

             Summary: Store-assignment and up-cast rules for 
nanosecond-precision timestamp types
                 Key: SPARK-57303
                 URL: https://issues.apache.org/jira/browse/SPARK-57303
             Project: Spark
          Issue Type: Sub-task
          Components: SQL
    Affects Versions: 4.3.0
            Reporter: Max Gekk


h2. Background

Nanosecond-precision timestamp types (TIMESTAMP_NTZ(p) / TIMESTAMP_LTZ(p), p in 
[7, 9], backed by TimestampNanosVal) now support casting to/from strings 
(SPARK-57211, SPARK-57256) and to/from their microsecond counterparts 
(SPARK-57293). However, there are no store-assignment or up-cast rules tailored 
to these types:

* They fall through the generic {{(_: DatetimeType, _: DatetimeType)}} arm in 
{{Cast.canANSIStoreAssign}}, so ANSI store assignment would silently truncate 
sub-microsecond digits. SPARK-57293 blocks this only for the same-family 
micros <-> nanos pair.
* They are absent from {{UpCastRule.canUpCast}}, so STRICT store assignment and 
up-cast resolution reject even lossless widening.

h2. Goal

Define a complete, precision-safe store-assignment / up-cast contract for the 
whole LTZ/NTZ timestamp family across micro and nanosecond precisions:

* STRICT policy ({{Cast.canUpCast}}): allow lossless widening, reject lossy 
narrowing.
* ANSI policy ({{Cast.canANSIStoreAssign}}): allow widening, block lossy 
narrowing so it can never silently truncate.
* LEGACY policy and explicit CAST are unchanged (they still truncate on 
narrowing).

h2. Rule

Introduce a single notion of effective fractional-second precision for the 
LTZ/NTZ timestamp family:

* {{TimestampType}} (LTZ micros) and {{TimestampNTZType}} (NTZ micros) -> 6
* {{TimestampLTZNanosType(p)}} / {{TimestampNTZNanosType(p)}} -> p (7, 8, or 9)

For any ordered pair of timestamp-family types (including across the LTZ/NTZ 
boundary, which Spark already treats as a mutual up-cast for the micros types):

* target precision >= source precision: lossless (widening or equal 
precision) -> up-cast; allowed under STRICT and ANSI
* target precision <  source precision: lossy narrowing -> not an up-cast; 
rejected under STRICT and blocked under ANSI
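
A minimal sketch of the rule, for illustration only. The types below are 
simplified stand-ins for Spark's classes, not the real ones, and the signature 
of {{tsFractionalPrecision}} is an assumption (the ticket only names the 
helper). {{isLossless}} is a hypothetical name.

```scala
object TimestampPrecisionRule {
  // Simplified stand-ins for Spark's timestamp types (NOT the real classes).
  sealed trait TsType
  case object TimestampLTZMicros extends TsType // models TimestampType
  case object TimestampNTZMicros extends TsType // models TimestampNTZType
  final case class TimestampLTZNanos(p: Int) extends TsType {
    require(p >= 7 && p <= 9, s"nanos precision must be in [7, 9], got $p")
  }
  final case class TimestampNTZNanos(p: Int) extends TsType {
    require(p >= 7 && p <= 9, s"nanos precision must be in [7, 9], got $p")
  }

  // Effective fractional-second precision: 6 for micros, p for nanos.
  def tsFractionalPrecision(t: TsType): Int = t match {
    case TimestampLTZMicros | TimestampNTZMicros => 6
    case TimestampLTZNanos(p) => p
    case TimestampNTZNanos(p) => p
  }

  // Hypothetical predicate: true iff no fractional digits can be lost.
  // The LTZ/NTZ family is deliberately ignored; only precision matters.
  def isLossless(from: TsType, to: TsType): Boolean =
    tsFractionalPrecision(to) >= tsFractionalPrecision(from)
}
```

Under this model, TIMESTAMP -> TIMESTAMP_NTZ(9) is lossless, while 
TIMESTAMP_NTZ(9) -> TIMESTAMP_NTZ is a narrowing.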

This deliberately diverges from the existing TimeType(p) model, which adds no 
widening to canUpCast and allows silent narrowing under ANSI. The divergence is 
intentional: it is the chosen precision-safe behavior.

h2. Scope

All 8 LTZ/NTZ timestamp types: TIMESTAMP, TIMESTAMP_NTZ, and TIMESTAMP_LTZ(p) / 
TIMESTAMP_NTZ(p) for p in [7, 9], including cross-family (LTZ <-> NTZ) pairs.

h2. Approach

* {{UpCastRule.canUpCast}}: add a {{tsFractionalPrecision}} helper and a single 
lossless-widening arm (subsuming the existing TimestampType <-> 
TimestampNTZType cases).
* {{Cast.canANSIStoreAssign}}: generalize the SPARK-57293 narrowing block to 
reject all timestamp-family narrowing via the same precision helper, before the 
generic DatetimeType arm. DATE/TIME and equal-precision LTZ<->NTZ conversions 
are unaffected.
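
The arm ordering described above could look roughly like the following. This is 
a toy model, not Spark's {{Cast}} code: {{DtType}} and its members are 
hypothetical stand-ins, and only the ordering (precision check first, generic 
datetime arm second) is meant to carry over.

```scala
object AnsiStoreAssignSketch {
  // Toy datetime type model (hypothetical; not Spark's DataType hierarchy).
  sealed trait DtType
  case object DateT extends DtType
  case object TsLtzMicros extends DtType
  case object TsNtzMicros extends DtType
  final case class TsLtzNanos(p: Int) extends DtType
  final case class TsNtzNanos(p: Int) extends DtType

  // Precision for timestamp-family types, None for everything else (DATE).
  def tsFractionalPrecision(t: DtType): Option[Int] = t match {
    case TsLtzMicros | TsNtzMicros => Some(6)
    case TsLtzNanos(p) => Some(p)
    case TsNtzNanos(p) => Some(p)
    case DateT => None
  }

  def canAnsiStoreAssign(from: DtType, to: DtType): Boolean =
    (tsFractionalPrecision(from), tsFractionalPrecision(to)) match {
      // Timestamp-family pair: decided by precision alone, so narrowing
      // is rejected before the generic arm can accept it.
      case (Some(src), Some(dst)) => dst >= src
      // Stand-in for the generic datetime arm; it only sees pairs
      // where at least one side is outside the timestamp family.
      case _ => true
    }
}
```

In this model TIMESTAMP -> DATE still reaches the generic arm, and 
equal-precision LTZ <-> NTZ pairs pass the precision check unchanged.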

h2. Dependencies

The rule layer is intentionally written ahead of the casts. Store assignment 
only succeeds if the inserted Cast resolves, so each allowed pair needs its 
canCast/canAnsiCast arm:

* Already exist: string <-> nanos; micros <-> nanos same-family (SPARK-57293).
* Required by separate subtasks before the corresponding rule actually permits 
a write: nanos(p1) <-> nanos(p2) precision change; cross-family LTZ(p) <-> 
NTZ(p) and cross-family micros <-> nanos.

Until those casts merge, the new entries are dormant for the unimplemented 
pairs (a write fails at cast creation, not at the policy check).

h2. Out of scope

* The precision-to-precision and cross-family casts themselves (separate 
subtasks).
* Implicit type coercion: {{findWiderDateTimeType}} has no arms for the nanos 
types and currently throws a MatchError for nanos datetime pairs (e.g. UNION of 
TIMESTAMP_NTZ and TIMESTAMP_NTZ(9)); tracked as a companion fix.
* DATE <-> nanos conversions.

h2. Testing

* Update the SPARK-57293 store-assignment/up-cast contract test (widening 
canUpCast assertions flip from false to true).
* Add a full-matrix predicate test over all 8 timestamp types: canUpCast and 
canANSIStoreAssign are true iff target precision >= source precision, false 
otherwise; plus anchors that TIMESTAMP -> DATE stays allowed under ANSI and 
DATE -> TIMESTAMP stays an up-cast.
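
One possible shape for the full-matrix check, sketched without any Spark 
dependency. {{Ts}}, {{expected}}, and {{mismatches}} are hypothetical names; a 
real test would enumerate the actual Spark types and pass in {{canUpCast}} or 
{{canANSIStoreAssign}} as the predicate.

```scala
object FullMatrixSketch {
  // (family, precision) stand-in for each of the 8 timestamp types.
  final case class Ts(family: String, precision: Int)

  // 2 families x precisions 6..9 = 8 types (6 models the micros types).
  val allTypes: Seq[Ts] =
    for {
      family <- Seq("LTZ", "NTZ")
      precision <- 6 to 9
    } yield Ts(family, precision)

  // Expected outcome per the rule: allowed iff target >= source precision.
  def expected(from: Ts, to: Ts): Boolean = to.precision >= from.precision

  // Returns every ordered pair where the predicate disagrees with the rule.
  def mismatches(predicate: (Ts, Ts) => Boolean): Seq[(Ts, Ts)] =
    for {
      from <- allTypes
      to <- allTypes
      if predicate(from, to) != expected(from, to)
    } yield (from, to)
}
```

An empty result from {{mismatches}} means the predicate agrees with the rule on 
all 64 ordered pairs. A predicate that forgot the cross-family pairs, for 
example, would surface as a non-empty list of LTZ <-> NTZ widening pairs.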



