[ https://issues.apache.org/jira/browse/SPARK-50840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17913899#comment-17913899 ]
Wing Yew Poon edited comment on SPARK-50840 at 1/17/25 12:52 AM:
-----------------------------------------------------------------

Point 2) in the description, about mapping, refers to the fact that Hive 3 has two timestamp types, `timestamp` and `timestamp with local time zone`. However, since SPARK-44114 is still open, Spark still depends on Hive 2.3, which has only the single `timestamp` type. If Spark could use Hive 3+, then a rational mapping would be `timestamp` <=> TimestampNTZType and `timestamp with local time zone` <=> TimestampLTZType.

was (Author: wypoon):
Point 2) in the description about mapping refers to the fact that in Hive 3, there are two timestamp types, `timestamp` and `timestamp with local time zone`. However, since SPARK-44114 is open, Spark still depends on Hive 2.3, which only has the single `timestamp` type. If Spark can use Hive 3+, then a rational mapping would be `timestamp` <-> TimestampNTZType and `timestamp with local time zone` <-> TimestampLTZType.

> TimestampType wrongly gets mapped to TimestampNTZ when reloading metadata from Hive, if timestamp alias is set to NTZ
> ---------------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-50840
>                 URL: https://issues.apache.org/jira/browse/SPARK-50840
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 4.0.0, 3.5.4
>            Reporter: Asif
>            Priority: Major
>              Labels: pull-request-available, spark-sql
>
> 1) Hive does not support Spark's TimestampNTZType.
> 2) The mapping of Spark's default LTZ timestamp is incorrect, which is a separate and much bigger issue.
> If we create a Hive table with a timestamp field while the timestamp alias (defined by the property spark.sql.timestampType) points to TimestampTypes.TIMESTAMP_LTZ, the metadata in Hive will be "timestamp".
> Later, if we change the timestamp alias to TimestampTypes.TIMESTAMP_NTZ and perform any operation that causes the table metadata to be reloaded from Hive, the timestamp type generated in Spark will be TimestampTypes.TIMESTAMP_NTZ instead of TimestampType (i.e. LTZ).
> This eventually results in a parsing exception in the Hive layer, as it does not recognize timestamp NTZ.
> Will be opening a PR with a bug test.
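
The round trip described in the issue can be illustrated with a small toy model. This is only a sketch of the reported behavior, not Spark's or Hive's code: the function names `to_hive_type` and `from_hive_type`, the type-name strings, and the `HIVE3_MAPPING` table are hypothetical and exist purely for illustration. The only names taken from the issue are the property `spark.sql.timestampType`, its values `TIMESTAMP_LTZ` and `TIMESTAMP_NTZ`, and the Hive type strings.

```python
# Toy model of the round trip described in SPARK-50840; NOT Spark's implementation.
# to_hive_type, from_hive_type and HIVE3_MAPPING are hypothetical names.

LTZ = "TimestampType"      # Spark's default timestamp (local time zone)
NTZ = "TimestampNTZType"   # Spark's timestamp without time zone


def to_hive_type(spark_type: str) -> str:
    """Write side: Hive 2.3 has a single `timestamp` type, so the Spark
    LTZ type is stored as the string "timestamp"."""
    if spark_type == LTZ:
        return "timestamp"
    if spark_type == NTZ:
        # Point 1) of the description: Hive does not support NTZ.
        raise ValueError("Hive does not recognize timestamp NTZ")
    return spark_type


def from_hive_type(hive_type: str, timestamp_alias: str) -> str:
    """Read side as reported in the issue: the stored string "timestamp"
    is resolved through the current timestamp alias
    (spark.sql.timestampType), so the result depends on the session setting."""
    if hive_type == "timestamp":
        return NTZ if timestamp_alias == "TIMESTAMP_NTZ" else LTZ
    return hive_type


# Mapping suggested in the comment, applicable only if Spark could use Hive 3+.
# The comment calls the LTZ side "TimestampLTZType"; it is modeled here as LTZ.
HIVE3_MAPPING = {
    "timestamp": NTZ,
    "timestamp with local time zone": LTZ,
}

# Table created while the alias points to TIMESTAMP_LTZ.
stored = to_hive_type(LTZ)

# Metadata reloaded after the alias was switched to TIMESTAMP_NTZ.
reloaded = from_hive_type(stored, timestamp_alias="TIMESTAMP_NTZ")
print(stored, "->", reloaded)  # timestamp -> TimestampNTZType
```

In this model, passing `reloaded` back through `to_hive_type` raises an error, which stands in for the parsing exception in the Hive layer that the description mentions.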