Max Gekk created SPARK-57317:
--------------------------------

             Summary: Fix Literal.create for external nanosecond timestamp values
                 Key: SPARK-57317
                 URL: https://issues.apache.org/jira/browse/SPARK-57317
             Project: Spark
          Issue Type: Sub-task
          Components: SQL
    Affects Versions: 4.3.0
            Reporter: Max Gekk
            Assignee: Max Gekk


Literal.create(value, dataType) produces an invalid literal when the value is an
external (high-level) nanosecond timestamp value and the declared type is a
nanosecond timestamp type (TimestampLTZNanosType / TimestampNTZNanosType), or a
complex type (array/map/struct) containing one.

For these types the method routed the value through the schema-less
CatalystTypeConverters.convertToCatalyst, which by design (SPARK-57033) keeps
bare java.time.Instant and java.time.LocalDateTime on the microsecond converters.
As a result the produced Catalyst value is a Long (epoch micros) instead of the
internal TimestampNanosVal representation expected by the declared type, and
Literal validation fails, e.g.:

    java.lang.IllegalArgumentException: requirement failed: Literal must have a
    corresponding value to timestamp_ltz(7), but class Long found.

The same problem affects collections of such values, e.g.:

    Literal must have a corresponding value to array<timestamp_ntz(9)>, but class
    GenericArrayData found.
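
The mismatch can be illustrated with a small standalone sketch. It is not Spark
code: NanosVal, schemaLessConvert and isValidForNanosType are hypothetical
stand-ins for TimestampNanosVal, the schema-less conversion and the Literal
validation, and the real internal layout may differ. It only shows that a
converter with no target type maps an Instant to a Long of epoch micros, which a
nanosecond-typed literal cannot accept.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;

// Hypothetical model, NOT Spark code: names and layout are assumptions.
final class NanosLiteralMismatch {

    // Stand-in for an internal nanosecond timestamp value.
    static final class NanosVal {
        final long epochSeconds;
        final int nanoOfSecond;

        NanosVal(long epochSeconds, int nanoOfSecond) {
            this.epochSeconds = epochSeconds;
            this.nanoOfSecond = nanoOfSecond;
        }
    }

    // Schema-less path: with no declared type available, an Instant is
    // converted to epoch microseconds, i.e. a boxed Long.
    static Object schemaLessConvert(Object external) {
        if (external instanceof Instant) {
            return ChronoUnit.MICROS.between(Instant.EPOCH, (Instant) external);
        }
        return external;
    }

    // Schema-driven path for a nanosecond type: keeps full nanosecond precision.
    static Object nanosConvert(Instant external) {
        return new NanosVal(external.getEpochSecond(), external.getNano());
    }

    // Validation in the spirit of the reported error: a nanosecond-typed
    // literal requires the nanosecond representation, not a Long.
    static boolean isValidForNanosType(Object catalystValue) {
        return catalystValue instanceof NanosVal;
    }

    public static void main(String[] args) {
        Instant ts = Instant.parse("2024-01-01T00:00:00.123456789Z");
        Object viaSchemaLess = schemaLessConvert(ts);
        Object viaSchemaDriven = nanosConvert(ts);
        System.out.println(viaSchemaLess.getClass().getSimpleName()
            + " valid=" + isValidForNanosType(viaSchemaLess));
        System.out.println(viaSchemaDriven.getClass().getSimpleName()
            + " valid=" + isValidForNanosType(viaSchemaDriven));
    }
}
```

In this model the schema-less result fails validation for the same reason as in
the messages above: the value has the wrong class for the declared type.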

Fix: Literal.create now routes the value through the schema-driven converter
(CatalystTypeConverters.createToCatalystConverter) when the declared type
contains a nanosecond timestamp type anywhere, but only for external values.
Values already in Catalyst internal form (TimestampNanosVal, ArrayData, MapData,
InternalRow) and nulls keep using the lenient schema-less path, preserving the
behavior of callers such as Literal.default that pass internal values.
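
The routing rule described in the fix can be sketched as a standalone decision
function. This is an assumption-laden model, not the actual patch: Path,
InternalForm, isInternalOrNull and choosePath are hypothetical names, and the
check for "contains a nanosecond timestamp type anywhere" is reduced to a
boolean supplied by the caller.

```java
import java.time.Instant;

// Hypothetical model of the routing decision, NOT the actual Spark change.
final class LiteralRoutingSketch {

    enum Path { SCHEMA_DRIVEN, SCHEMA_LESS }

    // Stand-in for values already in Catalyst internal form
    // (TimestampNanosVal, ArrayData, MapData, InternalRow in the issue text).
    static final class InternalForm { }

    static boolean isInternalOrNull(Object value) {
        return value == null || value instanceof InternalForm;
    }

    // Schema-driven conversion is chosen only when both conditions hold:
    // the declared type contains a nanosecond timestamp type, and the
    // value is an external (non-null, non-internal) value.
    static Path choosePath(Object value, boolean typeContainsNanosTimestamp) {
        if (typeContainsNanosTimestamp && !isInternalOrNull(value)) {
            return Path.SCHEMA_DRIVEN;
        }
        return Path.SCHEMA_LESS;
    }

    public static void main(String[] args) {
        System.out.println(choosePath(Instant.EPOCH, true));
        System.out.println(choosePath(new InternalForm(), true));
        System.out.println(choosePath(null, true));
        System.out.println(choosePath(Instant.EPOCH, false));
    }
}
```

Keeping nulls and internal values on the schema-less branch is what leaves
callers that already pass internal values unaffected in this model.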

This gap was surfaced while adding the nanosecond timestamp types to
DataTypeTestUtils (SPARK-57259), which drives PredicateSuite's generic
"IN with different types" coverage over these types.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
