----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/56334/#review164773 -----------------------------------------------------------
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetOutputFormat.java (lines 141 - 146) <https://reviews.apache.org/r/56334/#comment236535> Correct me here, but if timeZoneID is empty or null, then the else block will be executed and it will check if timeZoneID (null or empty) is not contained on the available IDs? ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java (line 147) <https://reviews.apache.org/r/56334/#comment236536> This parameter is not used anymore. Can we change it to finalPath? ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java (line 178) <https://reviews.apache.org/r/56334/#comment236538> Should we log a warning and keep the timeZoneID to the default instead? I don't know if it is good to avoid reading a table that has an invalid timezone property. ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetTableUtils.java (line 21) <https://reviews.apache.org/r/56334/#comment236540> I've seen other places where the GMT hardocded string is used. Should we be consisten and use this variable instead of the GMT hardcoded value? ql/src/test/queries/clientpositive/parquet_int96_timestamp.q (line 12) <https://reviews.apache.org/r/56334/#comment236531> Could you add another test to validate the table property of a new table overrides the global option? - Sergio Pena On Feb. 8, 2017, 3:45 p.m., Barna Zsombor Klara wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/56334/ > ----------------------------------------------------------- > > (Updated Feb. 8, 2017, 3:45 p.m.) > > > Review request for hive, Ryan Blue and Sergio Pena. > > > Bugs: HIVE-12767 > https://issues.apache.org/jira/browse/HIVE-12767 > > > Repository: hive-git > > > Description > ------- > > This is a followup on this review request: https://reviews.apache.org/r/41821 > The following exit criteria is addressed in this patch: > > - Hive will read Parquet MR int96 timestamp data and adjust values using a > time zone from a table property, if set, or using the local time zone if it > is absent. No adjustment will be applied to data written by Impala. > - Hive will write Parquet int96 timestamps using a time zone adjustment from > the same table property, if set, or using the local time zone if it is > absent. This keeps the data in the table consistent. > - New tables created by Hive will set the table property to UTC if the global > option to set the property for new tables is enabled. > - Tables created using CREATE TABLE and CREATE TABLE LIKE FILE will not set > the property unless the global setting to do so is enabled. > - Tables created using CREATE TABLE LIKE <OTHER TABLE> will copy the property > of the table that is copied. > > To set the timezone table property, use this: > create table tbl1 (ts timestamp) stored as parquet tblproperties > ('parquet.mr.int96.write.zone'='PST'); > > To set UTC as default timezone table property on new tables created, use > this: > set parquet.mr.int96.enable.utc.write.zone=true; > create table tbl2 (ts timestamp) stored as parquet; > > > Diffs > ----- > > common/src/java/org/apache/hadoop/hive/conf/HiveConf.java > cb27cd6236dc1d54efd2cafda9f1cc975e9d9817 > data/files/impala_int96_timestamp.parq PRE-CREATION > > itests/hive-jmh/src/main/java/org/apache/hive/benchmark/storage/ColumnarStorageBench.java > a14b7900afb00a7d304b0dc4f6482a2b87716919 > ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java > adabe70fa8f0fe1b990c6ac578a14ff5af06fc93 > > ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetOutputFormat.java > 379a9135d9c631b2f473976b00f3dc87f9fec0c4 > > ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java > 167f9b6516ac093fa30091daf6965de25e3eccb3 > > ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/ETypeConverter.java > 76d93b8e02a98c95da8a534f2820cd3e77b4bb43 > > ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/DataWritableReadSupport.java > 604cbbcc2a9daa8594397e315cc4fd8064cc5005 > > ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetRecordReaderWrapper.java > ac430a67682d3dcbddee89ce132fc0c1b421e368 > > ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetTableUtils.java > PRE-CREATION > > ql/src/java/org/apache/hadoop/hive/ql/io/parquet/timestamp/NanoTimeUtils.java > 3fd75d24f3fda36967e4957e650aec19050b22f8 > > ql/src/java/org/apache/hadoop/hive/ql/io/parquet/vector/VectorizedParquetRecordReader.java > 699de59b58ecbfa0705f042d22e697f1573ea496 > > ql/src/java/org/apache/hadoop/hive/ql/io/parquet/vector/VectorizedPrimitiveColumnReader.java > 3d5c6e6a092dd6a0303fadc6a244dad2e31cd853 > > ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriteSupport.java > f4621e5dbb81e8d58c4572c901ec9d1a7ca8c012 > > ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriter.java > 6b7b50a25e553629f0f492e964cc4913417cb500 > > ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestDataWritableWriter.java > 934ae9f255d0c4ccaa422054fcc9e725873810d4 > > ql/src/test/org/apache/hadoop/hive/ql/io/parquet/VectorizedColumnReaderTestBase.java > 833cfdbaa2ac6a3653a8f3232344f5fb70c80686 > > ql/src/test/org/apache/hadoop/hive/ql/io/parquet/convert/TestETypeConverter.java > PRE-CREATION > > ql/src/test/org/apache/hadoop/hive/ql/io/parquet/serde/TestParquetTimestampUtils.java > ec6def5b9ac5f12e6a7cb24c4f4998a6ca6b4a8e > > ql/src/test/org/apache/hadoop/hive/ql/io/parquet/timestamp/TestNanoTimeUtils.java > PRE-CREATION > ql/src/test/queries/clientpositive/parquet_int96_timestamp.q PRE-CREATION > ql/src/test/results/clientpositive/parquet_int96_timestamp.q.out > PRE-CREATION > > Diff: https://reviews.apache.org/r/56334/diff/ > > > Testing > ------- > > qtest and unit tests added. > > > Thanks, > > Barna Zsombor Klara > >