Max Gekk created SPARK-57338:
--------------------------------

             Summary: Row.json throws ClassCastException on TIME (TimeType) columns
                 Key: SPARK-57338
                 URL: https://issues.apache.org/jira/browse/SPARK-57338
             Project: Spark
          Issue Type: Sub-task
          Components: SQL
    Affects Versions: 4.3.0
            Reporter: Max Gekk
            Assignee: Max Gekk


h2. Description

{{Row.json()}} and {{Row.prettyJson()}} throw a {{ClassCastException}} when the
row contains a {{TIME}} ({{TimeType}}) column and the Types Framework is enabled
({{spark.sql.types.framework.enabled=true}}, the default under tests).

h2. Root cause

A public {{Row}} holds *external* values - for {{TimeType}} that is a
{{java.time.LocalTime}}. {{Row.jsonValue}} rendered the value through the Types
Framework's {{format}}, which expects the *internal* representation ({{Long}}
nanoseconds of day) and performs {{value.asInstanceOf[Long]}}. Handing it an
external {{LocalTime}} fails the cast.
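
The mismatch can be shown without Spark. The following is a minimal sketch, not Spark's code: {{formatInternal}} is a hypothetical stand-in for a formatter that expects the internal {{Long}} representation, and it fails in the same way when handed the external {{LocalTime}}.

{code:scala}
import java.time.LocalTime
import scala.util.Try

// Hypothetical stand-in for a formatter that expects the internal
// representation (Long nanoseconds of day). Not Spark's implementation.
def formatInternal(value: Any): String = {
  val nanos = value.asInstanceOf[Long] // throws if value is not a Long
  LocalTime.ofNanoOfDay(nanos).toString
}

// External value: what a public Row holds for a TIME column.
val external: Any = LocalTime.of(12, 13, 14)
// Internal value: nanoseconds since midnight for the same time.
val internal: Any = LocalTime.of(12, 13, 14).toNanoOfDay

val ok = formatInternal(internal)          // formats the internal value
val failed = Try(formatInternal(external)) // Failure(ClassCastException)
{code}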

h2. Reproduction

{code:scala}
import java.time.LocalTime
import org.apache.spark.sql.catalyst.expressions.GenericRowWithSchema
import org.apache.spark.sql.types._

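// A public Row holding the external value (java.time.LocalTime) for a TIME column.
val row = new GenericRowWithSchema(
  Array(LocalTime.of(12, 13, 14)),
  new StructType().add("a", TimeType()))

// Throws ClassCastException when the Types Framework is enabled.
row.json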
{code}

Result:

{noformat}
java.lang.ClassCastException: class java.time.LocalTime cannot be cast to class java.lang.Long
{noformat}

h2. Fix

Route the external value in {{Row.jsonValue}} through the framework's
{{formatExternal}} (the external-value entry point) instead of {{format}}. A
framework type without an external formatter falls back to {{format}}, preserving existing
behavior - e.g. the nanosecond timestamp types keep raising the

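The dispatch described in this section can be sketched as follows. This is an illustration under assumptions, not the actual patch: only the names {{format}} and {{formatExternal}} come from the description above, while {{FrameworkTypeSketch}}, {{TimeSketch}}, {{jsonValueSketch}}, and the {{Option}}-based signature are hypothetical.

{code:scala}
import java.time.LocalTime

// Hypothetical model of a framework type. The real Spark signatures may differ.
trait FrameworkTypeSketch {
  // Formats the internal representation.
  def format(internal: Any): String
  // External-value entry point; None means the type does not provide one.
  def formatExternal: Option[Any => String] = None
}

// Hypothetical TIME-like type that provides both entry points.
object TimeSketch extends FrameworkTypeSketch {
  def format(internal: Any): String =
    LocalTime.ofNanoOfDay(internal.asInstanceOf[Long]).toString
  override def formatExternal: Option[Any => String] =
    Some((v: Any) => v.asInstanceOf[LocalTime].toString)
}

// Prefer the external formatter; fall back to format when it is absent,
// which keeps the previous behavior for types without one.
def jsonValueSketch(t: FrameworkTypeSketch, external: Any): String =
  t.formatExternal.map(f => f(external)).getOrElse(t.format(external))

val rendered = jsonValueSketch(TimeSketch, LocalTime.of(12, 13, 14))
{code}

With this shape, a type that defines only {{format}} still receives the value unchanged, so whatever it did before the change is what it does after.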


