Max Gekk created SPARK-57338:
--------------------------------
Summary: Row.json throws ClassCastException on TIME (TimeType) columns
Key: SPARK-57338
URL: https://issues.apache.org/jira/browse/SPARK-57338
Project: Spark
Issue Type: Sub-task
Components: SQL
Affects Versions: 4.3.0
Reporter: Max Gekk
Assignee: Max Gekk
h2. Description
{{Row.json()}} and {{Row.prettyJson()}} throw a {{ClassCastException}} when the
row contains a {{TIME}} ({{TimeType}}) column and the Types Framework is enabled
({{spark.sql.types.framework.enabled=true}}, the default under tests).
h2. Root cause
A public {{Row}} holds *external* values - for {{TimeType}} that is a
{{java.time.LocalTime}}. {{Row.jsonValue}} rendered the value through the Types
Framework's {{format}}, which expects the *internal* representation (a {{Long}}
holding the nanoseconds of the day) and performs {{value.asInstanceOf[Long]}}.
Handing it an external {{LocalTime}} fails the cast.
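The failing cast can be reproduced without Spark. The following is a minimal sketch, not the framework's code: {{formatInternal}} is a hypothetical stand-in for a formatter that assumes the internal {{Long}} representation.

```scala
import java.time.LocalTime
import scala.util.Try

object InternalVsExternal {
  // Hypothetical stand-in for a formatter that assumes the internal
  // representation of TIME: nanoseconds of the day stored as a Long.
  def formatInternal(value: Any): String = {
    val nanos = value.asInstanceOf[Long] // throws if value is not a boxed Long
    LocalTime.ofNanoOfDay(nanos).toString
  }

  def main(args: Array[String]): Unit = {
    val external: Any = LocalTime.of(12, 13, 14)             // what a public Row holds
    val internal: Any = LocalTime.of(12, 13, 14).toNanoOfDay // what the formatter expects

    println(formatInternal(internal))      // the internal Long formats fine
    println(Try(formatInternal(external))) // a Failure wrapping a ClassCastException
  }
}
```

The two values describe the same time of day; only the representation differs, which is why the cast, and not the formatting itself, is what fails.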
h2. Reproduction
{code:scala}
import java.time.LocalTime
import org.apache.spark.sql.catalyst.expressions.GenericRowWithSchema
import org.apache.spark.sql.types._
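// A public Row carrying an external LocalTime for a TIME column.
val row = new GenericRowWithSchema(
  Array(LocalTime.of(12, 13, 14)),
  new StructType().add("a", TimeType()))
// Throws ClassCastException when the Types Framework is enabled.
row.json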
{code}
Result:
{noformat}
java.lang.ClassCastException: class java.time.LocalTime cannot be cast to class java.lang.Long
{noformat}
h2. Fix
Route the external value in {{Row.jsonValue}} through the framework's
{{formatExternal}} (the external-value entry point) instead of {{format}}. A
framework type without an external formatter falls back to {{format}}, preserving
existing behavior - e.g. the nanosecond timestamp types keep raising the
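
The dispatch described in this section can be sketched in isolation. This is an illustration under stated assumptions, not the actual patch: the {{TypeFormatter}} trait, the {{Option}}-returning signature of {{formatExternal}}, the two formatter objects, and the {{render}} helper are all hypothetical; only the names {{format}} and {{formatExternal}} come from this issue.

```scala
import java.time.LocalTime

// Hypothetical interface; the real framework's signatures are not shown in this issue.
trait TypeFormatter {
  // Formats the internal representation (for TIME: Long nanoseconds of the day).
  def format(internal: Any): String
  // Formats the external representation, if the type provides such a formatter.
  def formatExternal(external: Any): Option[String] = None
}

// A type that supports both entry points.
object TimeFormatter extends TypeFormatter {
  def format(internal: Any): String =
    LocalTime.ofNanoOfDay(internal.asInstanceOf[Long]).toString

  override def formatExternal(external: Any): Option[String] = external match {
    case t: LocalTime => Some(t.toString)
    case _            => None
  }
}

// A type without an external formatter: it keeps going through format.
object LegacyFormatter extends TypeFormatter {
  def format(internal: Any): String = s"legacy:$internal"
}

object RowJsonSketch {
  // Prefer the external entry point; otherwise fall back to format.
  def render(formatter: TypeFormatter, value: Any): String =
    formatter.formatExternal(value).getOrElse(formatter.format(value))

  def main(args: Array[String]): Unit = {
    println(render(TimeFormatter, LocalTime.of(12, 13, 14))) // external path
    println(render(LegacyFormatter, 42))                     // fallback path
  }
}
```

Under these assumptions, the fallback leaves any type that lacks an external formatter on its existing code path, so whatever {{format}} did before (including raising an error) is unchanged.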