-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35950/
-----------------------------------------------------------
(Updated June 28, 2015, 12:24 a.m.)


Review request for hive, Ryan Blue, cheng xu, and Dong Chen.


Bugs: HIVE-11131
    https://issues.apache.org/jira/browse/HIVE-11131


Repository: hive-git


Description
-------

Implemented data type writers that are created before the first Hive row is written to Parquet. Each writer holds the object inspector and schema information for a specific data type, and calls the specific addXXXX() method that Parquet uses for that data type.


Diffs (updated)
-----

  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriter.java c195c3ec3ddae19bf255fc2c9633f8bf4390f428

Diff: https://reviews.apache.org/r/35950/diff/


Testing
-------

Tests from TestDataWritableWriter run OK.

I also ran other tests with micro-benchmarks, and I got better results from this new implementation:

Using repeated rows across the file, the speed increased by:

  bigint    boolean   double    float     int       string
  33.42%    53.66%    35.62%    35.70%    36.02%    5.93%

Using random rows across the file, the speed increased by:

  bigint    boolean   double    float     int       string
  18.38%    35.52%    44.73%    13.80%    10.68%    10.00%


Thanks,

Sergio Pena
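

Illustration
-------

To make the description above concrete, here is a minimal sketch of the general idea: pick one writer per column up front, then let each writer call its type-specific add method for every row. This is not the code in DataWritableWriter.java. The class TypedWriterSketch, the interfaces Consumer and ColumnWriter, and the helper methods are all hypothetical names, and the schema is reduced to plain type-name strings instead of object inspectors. Only the method names addInteger and addBoolean are borrowed from Parquet's RecordConsumer.

```java
import java.util.ArrayList;
import java.util.List;

final class TypedWriterSketch {
    // Hypothetical stand-in for Parquet's RecordConsumer (only two add methods shown).
    interface Consumer {
        void addInteger(int value);
        void addBoolean(boolean value);
    }

    // Hypothetical per-column writer: knows which add method to call for its type.
    interface ColumnWriter {
        void write(Object value, Consumer consumer);
    }

    // The type dispatch happens here, once per column, not once per value.
    static ColumnWriter createWriter(String typeName) {
        switch (typeName) {
            case "int":
                return (value, consumer) -> consumer.addInteger((Integer) value);
            case "boolean":
                return (value, consumer) -> consumer.addBoolean((Boolean) value);
            default:
                throw new IllegalArgumentException("Unsupported type: " + typeName);
        }
    }

    // Build all writers before the first row is written.
    static List<ColumnWriter> createWriters(String[] schema) {
        List<ColumnWriter> writers = new ArrayList<>();
        for (String typeName : schema) {
            writers.add(createWriter(typeName));
        }
        return writers;
    }

    // Writing a row only iterates over the prebuilt writers.
    static void writeRow(Object[] row, List<ColumnWriter> writers, Consumer consumer) {
        for (int i = 0; i < row.length; i++) {
            writers.get(i).write(row[i], consumer);
        }
    }

    // Test double that records which add method was called with which value.
    static final class RecordingConsumer implements Consumer {
        final List<String> calls = new ArrayList<>();

        @Override
        public void addInteger(int value) {
            calls.add("addInteger(" + value + ")");
        }

        @Override
        public void addBoolean(boolean value) {
            calls.add("addBoolean(" + value + ")");
        }
    }

    public static void main(String[] args) {
        RecordingConsumer consumer = new RecordingConsumer();
        List<ColumnWriter> writers = createWriters(new String[] {"int", "boolean"});
        writeRow(new Object[] {7, true}, writers, consumer);
        System.out.println(consumer.calls);
    }
}
```

In this sketch the switch on the type runs only while the writers are built, so the per-row path is a plain loop over prebuilt writers. Whether that accounts for the speedups reported above is an assumption, not something the benchmark numbers establish.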