[ https://issues.apache.org/jira/browse/HIVE-9482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15671191#comment-15671191 ]
Vitalii Diravka commented on HIVE-9482:
---------------------------------------

Why is the hive.parquet.timestamp.skip.conversion option enabled by default? According to the [parquet spec|https://github.com/Parquet/parquet-format/blob/master/LogicalTypes.md#timestamp_millis], Parquet files do not store a local timezone, and we cannot tell from a file what value that option had when the file was generated.

> Hive parquet timestamp compatibility
> ------------------------------------
>
>                 Key: HIVE-9482
>                 URL: https://issues.apache.org/jira/browse/HIVE-9482
>             Project: Hive
>          Issue Type: Bug
>          Components: File Formats
>    Affects Versions: 0.15.0
>            Reporter: Szehon Ho
>            Assignee: Szehon Ho
>             Fix For: 1.2.0
>
>         Attachments: HIVE-9482.2.patch, HIVE-9482.patch, HIVE-9482.patch,
> parquet_external_time.parq
>
>
> In the current Hive implementation, timestamps are stored in UTC (converted
> from the current timezone), based on the original parquet timestamp spec.
> However, we find this is not compatible with other tools, and after some
> investigation it is not the way of the other file formats, or even some
> databases (Hive Timestamp is more equivalent to the 'timestamp without
> timezone' datatype).
> This is the first part of the fix, which will restore compatibility with
> parquet-timestamp files generated by external tools by skipping conversion on
> reading.
> A later fix will change the write path to not convert, and stop the
> read-conversion even for files written by Hive itself.
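
To make the concern concrete, here is a minimal Java sketch of how one stored value reads back differently depending on whether the read-side conversion is skipped. It is a simplified model, not Hive's actual reader code: the class TimestampConversionSketch, the method readTimestamp, its skipConversion parameter, and the example zone are all assumptions made for illustration.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;

public class TimestampConversionSketch {

    // Hypothetical helper, not a Hive API.
    // skipConversion = true:  return the stored value unchanged (its UTC wall clock).
    // skipConversion = false: shift the stored UTC value into the reader's zone.
    public static LocalDateTime readTimestamp(Instant stored, ZoneId readerZone,
                                              boolean skipConversion) {
        ZoneId zone = skipConversion ? ZoneOffset.UTC : readerZone;
        return LocalDateTime.ofInstant(stored, zone);
    }

    public static void main(String[] args) {
        // Example value and zone chosen only for illustration.
        Instant stored = Instant.parse("2015-01-01T12:00:00Z");
        ZoneId readerZone = ZoneId.of("America/Los_Angeles");

        // The same stored bytes yield two different wall-clock timestamps.
        System.out.println("skip conversion: " + readTimestamp(stored, readerZone, true));
        System.out.println("with conversion: " + readTimestamp(stored, readerZone, false));
    }
}
```

In this model the two readings differ by the reader's UTC offset. Since the file itself does not record which convention the writer used, the reader has to guess, which is the ambiguity the comment asks about.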