[
https://issues.apache.org/jira/browse/SPARK-57808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Max Gekk updated SPARK-57808:
-----------------------------
Shepherd: Max Gekk
> Document the microsecond-only limitation of typed encoders and UDFs for
> nanosecond timestamps
> ---------------------------------------------------------------------------------------------
>
> Key: SPARK-57808
> URL: https://issues.apache.org/jira/browse/SPARK-57808
> Project: Spark
> Issue Type: Sub-task
> Components: Documentation
> Affects Versions: 4.3.0
> Reporter: Max Gekk
> Priority: Major
>
> This sub-task is part of the umbrella SPARK-56822 (timestamps with nanosecond
> precision).
> h2. Decision
> Document the limitation only; do not implement encoder/UDF changes (per review).
> h2. Problem
> Typed Dataset encoders bind {{java.time}} classes to microsecond-precision SQL
> types: {{ScalaReflection}} / {{JavaTypeInference}} and {{Encoders.INSTANT}} /
> {{Encoders.LOCALDATETIME}} map {{Instant}} / {{LocalDateTime}} to
> {{TimestampType}} / {{TimestampNTZType}} (ScalaReflection.scala ~L331-332,
> JavaTypeInference.scala ~L99-101). As a result:
> * {{ds.as[CaseClassWithInstant]}} and schema-less
> {{createDataFrame(Seq(instant))}} do not preserve nanoseconds;
> * typed Scala/Java UDFs ({{udf((i: Instant) => ...)}}) coerce nanosecond
> inputs to microsecond precision at the UDF boundary;
> * Kryo encoders bypass the SQL type system entirely.
> Nanosecond precision is available only via the schema-driven {{Row}} /
> {{Encoders.row(schema)}} path (SPARK-57033).
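As a rough illustration of what "do not preserve nanoseconds" means in practice, the sketch below round-trips a {{java.time.Instant}} through a microsecond count using only the JDK. It does not call Spark; the class name and the {{toEpochMicros}} / {{fromEpochMicros}} helpers are hypothetical and only assume that a microsecond-precision type stores whole microseconds since the epoch.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;

public class MicrosTruncationDemo {
    // Hypothetical helper (not a Spark API): whole microseconds since the
    // epoch. Digits below one microsecond are dropped by integer division.
    public static long toEpochMicros(Instant instant) {
        long micros = Math.multiplyExact(instant.getEpochSecond(), 1_000_000L);
        return Math.addExact(micros, instant.getNano() / 1_000L);
    }

    // Hypothetical helper (not a Spark API): rebuild an Instant from a
    // microsecond count. The lost nanosecond digits cannot be recovered.
    public static Instant fromEpochMicros(long micros) {
        long seconds = Math.floorDiv(micros, 1_000_000L);
        long nanos = Math.floorMod(micros, 1_000_000L) * 1_000L;
        return Instant.ofEpochSecond(seconds, nanos);
    }

    public static void main(String[] args) {
        Instant original = Instant.parse("2024-01-01T00:00:00.123456789Z");
        Instant roundTripped = fromEpochMicros(toEpochMicros(original));

        System.out.println(original);      // 2024-01-01T00:00:00.123456789Z
        System.out.println(roundTripped);  // 2024-01-01T00:00:00.123456Z
        // The round trip matches a plain truncation to microseconds.
        System.out.println(
            roundTripped.equals(original.truncatedTo(ChronoUnit.MICROS))); // true
    }
}
```

The trailing {{789}} nanoseconds are gone after the round trip, which is the kind of loss the typed-encoder and typed-UDF paths listed above are expected to show.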
> h2. Goal
> Document these as known limitations and describe the schema-driven
> workaround. No encoder/UDF code changes.
> h2. Scope
> Docs only (nanosecond-timestamp type reference / known limitations),
> cross-linked with the migration guide and the SparkR non-support note
> (SPARK-57807).
> h2. Acceptance criteria
> * Docs state the typed-encoder / {{ds.as[T]}} / typed-UDF microsecond-only
> behavior and how to retain nanoseconds via schema-driven APIs.
> h2. Testing
> Docs only.
> h2. Dependencies
> None (documentation only). Cross-linked with SPARK-57807 (SparkR non-support)
> and the nanosecond-timestamp docs/migration sub-task.