[ https://issues.apache.org/jira/browse/HIVE-16418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15971840#comment-15971840 ]
Ashutosh Chauhan commented on HIVE-16418:
-----------------------------------------

We need to think about the storage type for Timestamp at different stages of query processing:

* On-disk format: whether to store the TZ or not. The primary concern is fidelity of the original data; the secondary concern is storage efficiency.
* In-memory format: the representation on which computations are performed. As I see it, our current Timestamp choice here is inappropriate. The issue is that java.sql.Timestamp (which implicitly assumes the local time zone) doesn't correspond to either SQL Timestamp (which is essentially zoneless) or Timestamp with Time Zone (which has a zone, but java.sql.Timestamp doesn't allow you to set one). As I suggested, the in-memory representation (i.e. the one on which all computations are performed) should either directly use LocalTimeZone and ZonedTimeZone or model its behavior on them.
* Serialization format: used to transfer timestamps between different vertices. Here the primary concern is performance, which is gained if the TZ is stored separately.

In light of the above, I am OK with your proposal of using choice #2, but I think you still need to think about the in-memory format, because apart from to_utc_timestamp and related UDFs, implementing the new type Timestamp with Time Zone with java.sql.Timestamp will be error-prone.

> Allow HiveKey to skip some bytes for comparison
> -----------------------------------------------
>
>                 Key: HIVE-16418
>                 URL: https://issues.apache.org/jira/browse/HIVE-16418
>             Project: Hive
>          Issue Type: New Feature
>            Reporter: Rui Li
>            Assignee: Rui Li
>         Attachments: HIVE-16418.1.patch
>
>
> The feature is required when we have to serialize some fields and prevent
> them from being used in comparison, e.g. HIVE-14412.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
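
The in-memory concern raised in the comment above can be illustrated with a small, self-contained Java sketch. It is not taken from the Hive code base or from the attached patch. It assumes that the names LocalTimeZone and ZonedTimeZone in the comment refer to java.time.LocalDateTime and java.time.ZonedDateTime; the class TimestampSemantics and its two helper methods are hypothetical names chosen only for this illustration.

```java
import java.sql.Timestamp;
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.util.TimeZone;

// Hypothetical demo class, not part of Hive.
public class TimestampSemantics {

    // Renders one fixed instant through java.sql.Timestamp while the JVM
    // default time zone is temporarily set to zoneId. The stored millis are
    // identical on every call; only the rendering changes.
    public static String renderWithDefaultZone(long epochMillis, String zoneId) {
        TimeZone saved = TimeZone.getDefault();
        try {
            TimeZone.setDefault(TimeZone.getTimeZone(zoneId));
            return new Timestamp(epochMillis).toString();
        } finally {
            TimeZone.setDefault(saved); // always restore the original default
        }
    }

    // A zoneless wall-clock value only becomes an instant once a zone is
    // supplied explicitly.
    public static Instant resolve(LocalDateTime wallClock, String zoneId) {
        return wallClock.atZone(ZoneId.of(zoneId)).toInstant();
    }

    public static void main(String[] args) {
        long t = Instant.parse("2017-04-17T12:00:00Z").toEpochMilli();

        // Same java.sql.Timestamp value, different text per default zone.
        System.out.println(renderWithDefaultZone(t, "UTC"));          // 2017-04-17 12:00:00.0
        System.out.println(renderWithDefaultZone(t, "Asia/Kolkata")); // 2017-04-17 17:30:00.0

        // LocalDateTime carries no zone at all.
        LocalDateTime wallClock = LocalDateTime.of(2017, 4, 17, 12, 0);
        System.out.println(wallClock); // 2017-04-17T12:00

        // ZonedDateTime carries its zone explicitly.
        ZonedDateTime zoned = wallClock.atZone(ZoneId.of("America/Los_Angeles"));
        System.out.println(zoned.getZone());                             // America/Los_Angeles
        System.out.println(resolve(wallClock, "America/Los_Angeles"));   // 2017-04-17T19:00:00Z
        System.out.println(resolve(wallClock, "UTC"));                   // 2017-04-17T12:00:00Z
    }
}
```

In this sketch the java.sql.Timestamp rendering depends on a process-wide setting that the value itself cannot carry, whereas the two java.time types make the presence or absence of a zone part of the type.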