[ https://issues.apache.org/jira/browse/SPARK-50840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17913501#comment-17913501 ]
Asif edited comment on SPARK-50840 at 1/16/25 12:50 AM:
--------------------------------------------------------

The test below fails without the fix.

test("SPARK-50840: Hive table created with timestamp LTZ, should retain the same on reload") {
  withTable("t1", "t2") {
    // Create the table while the `timestamp` alias resolves to TIMESTAMP_LTZ.
    withSQLConf(SQLConf.TIMESTAMP_TYPE.key -> TimestampTypes.TIMESTAMP_LTZ.toString) {
      val tblDef =
        s"""
           |CREATE TABLE t1 (
           |  ts timestamp,
           |  nstd Struct<name: String, ts1 timestamp>
           |)
           |using parquet""".stripMargin
      sql(tblDef)
    }
    // Switch the alias to TIMESTAMP_NTZ, then rename the table so that its
    // metadata is reloaded from Hive.
    withSQLConf(SQLConf.TIMESTAMP_TYPE.key -> TimestampTypes.TIMESTAMP_NTZ.toString) {
      // Recursively checks top-level and nested struct fields.
      def assertNoTimestampNTZ(structType: StructType): Unit = {
        structType.foreach {
          _.dataType match {
            case TimestampNTZType => fail("TimestampNTZType not expected")
            case st: StructType => assertNoTimestampNTZ(st)
            case _ =>
          }
        }
      }
      sql("alter table t1 rename to t2")
      assertNoTimestampNTZ(spark.table("t2").schema)
    }
  }
}

was (Author: ashahid7): the same test, without the introductory sentence above.

> TimestampType wrongly gets mapped to TimestampNTZ when reloading metadata
> from Hive, if timestamp alias is set to NTZ
> ---------------------------------------------------------------------------------------------------------------------
>
>              Key: SPARK-50840
>              URL: https://issues.apache.org/jira/browse/SPARK-50840
>          Project: Spark
>       Issue Type: Bug
>       Components: SQL
> Affects Versions: 4.0.0, 3.5.4
>         Reporter: Asif
>         Priority: Major
>           Labels: pull-request-available, spark-sql
>
> 1) Hive does not support Spark's TimestampNTZType.
> 2) The mapping of Spark's default LTZ timestamp is incorrect, which is a
> separate and much bigger issue.
> If we create a Hive table with a timestamp field while the timestamp alias
> (defined by the property spark.sql.timestampType) points to
> TimestampTypes.TIMESTAMP_LTZ, the metadata stored in Hive will be "timestamp".
> If we later change the timestamp alias to TimestampTypes.TIMESTAMP_NTZ and
> perform any operation that reloads the table metadata from Hive, the
> timestamp type generated in Spark resolves to
> TimestampTypes.TIMESTAMP_NTZ instead of TimestampType (i.e. LTZ).
> This eventually results in a parsing exception in the Hive layer, as it does
> not recognize timestamp NTZ.
> Will be opening a PR with a bug test.
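
To make the round trip described in the issue easier to follow, here is a minimal, Spark-free sketch of it. This is a toy model only: `SqlType`, `toHiveTypeString`, and `parseWithAlias` are hypothetical names invented for illustration and are not Spark's actual classes or parser. The sketch assumes only what the description states: an LTZ column is stored in Hive as the string "timestamp", and that string is interpreted through the current alias when the metadata is read back.

```scala
// Toy model only: hypothetical stand-ins, not Spark's real types or parser.
sealed trait SqlType
case object TimestampLTZ extends SqlType // stands in for Spark's TimestampType
case object TimestampNTZ extends SqlType // stands in for Spark's TimestampNTZType
case object StringT extends SqlType

object TimestampAliasDemo {
  // What is persisted in the metastore: LTZ is written as plain "timestamp".
  def toHiveTypeString(t: SqlType): String = t match {
    case TimestampLTZ => "timestamp"
    case TimestampNTZ => "timestamp_ntz" // per the issue, Hive does not recognize NTZ
    case StringT      => "string"
  }

  // Alias-sensitive parsing: "timestamp" means whatever the session alias says.
  def parseWithAlias(s: String, alias: SqlType): SqlType = s match {
    case "timestamp"     => alias
    case "timestamp_ntz" => TimestampNTZ
    case "string"        => StringT
    case other           => throw new IllegalArgumentException(s"unknown type: $other")
  }

  def main(args: Array[String]): Unit = {
    // Table created while the alias points to LTZ.
    val stored = toHiveTypeString(parseWithAlias("timestamp", alias = TimestampLTZ))
    // Metadata reloaded after the alias has been switched to NTZ.
    val reloaded = parseWithAlias(stored, alias = TimestampNTZ)
    println(s"stored in metastore: $stored")
    println(s"reloaded as: $reloaded")
  }
}
```

In this model the column is created as LTZ but comes back as NTZ, because the stored string carries no record of which alias was active at creation time. How the actual fix avoids this is not stated in this message, so the sketch stops at reproducing the mismatch.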