Re: Review Request 48716: HIVE-13873 Column pruning for nested fields

cheng xu Wed, 06 Jul 2016 18:03:07 -0700


> On July 6, 2016, 10:48 p.m., Aihua Xu wrote:
> > serde/src/java/org/apache/hadoop/hive/serde2/ColumnProjectionUtils.java, 
> > line 122
> > <https://reviews.apache.org/r/48716/diff/1/?file=1419370#file1419370line122>
> >
> >     Just trying to understand the logic (I'm not too familiar with 
> > Parquet). Does the underlying Parquet library already support 
> > "hive.io.file.readgroup.paths", or is this handled entirely within Hive? 
> > In general, how is struct data stored in Parquet, and how is it pruned 
> > using the group path?


Parquet itself doesn't support this configuration. We reconstruct the requested 
schema on the Hive side by pruning the unneeded columns, as other projections do.
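
To make the idea concrete, here is a minimal, self-contained sketch of pruning 
a nested schema by group paths. It is not the code in the patch: the class 
NestedPruneSketch, its Field type, and the helpers leaf/group/prune are 
hypothetical names, and the dotted-path form ("s.b.c") is only an assumption 
about how the requested group paths could be expressed. It uses no Parquet 
classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; not the classes used by the patch.
public class NestedPruneSketch {

    /** A schema node: a leaf when it has no children, otherwise a group (struct). */
    public static final class Field {
        final String name;
        final List<Field> children;

        Field(String name, List<Field> children) {
            this.name = name;
            this.children = children;
        }

        @Override
        public String toString() {
            if (children.isEmpty()) {
                return name;
            }
            StringBuilder sb = new StringBuilder(name).append('{');
            for (int i = 0; i < children.size(); i++) {
                if (i > 0) {
                    sb.append(',');
                }
                sb.append(children.get(i));
            }
            return sb.append('}').toString();
        }
    }

    public static Field leaf(String name) {
        return new Field(name, Collections.<Field>emptyList());
    }

    public static Field group(String name, Field... children) {
        return new Field(name, Arrays.asList(children));
    }

    /**
     * Keeps only the parts of 'field' reachable through one of the dotted
     * paths (assumed format, e.g. "s.b.c"). Returns null if nothing under
     * this field was requested.
     */
    public static Field prune(Field field, List<String> paths) {
        List<String> rest = new ArrayList<>();
        for (String p : paths) {
            if (p.equals(field.name)) {
                return field; // the whole field was requested
            }
            if (p.startsWith(field.name + ".")) {
                rest.add(p.substring(field.name.length() + 1));
            }
        }
        List<Field> kept = new ArrayList<>();
        for (Field child : field.children) {
            Field k = prune(child, rest);
            if (k != null) {
                kept.add(k);
            }
        }
        return kept.isEmpty() ? null : new Field(field.name, kept);
    }

    public static void main(String[] args) {
        // struct s { a, b { c, d } }
        Field s = group("s", leaf("a"), group("b", leaf("c"), leaf("d")));
        System.out.println(prune(s, Arrays.asList("s.b.c")));        // s{b{c}}
        System.out.println(prune(s, Arrays.asList("s.a", "s.b.d"))); // s{a,b{d}}
    }
}
```

The pruned tree plays the role of the requested schema that is handed to the 
reader, so only the selected nested fields need to be materialized.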


- cheng


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/48716/#review140991
-----------------------------------------------------------


On June 15, 2016, 11:34 a.m., cheng xu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/48716/
> -----------------------------------------------------------
> 
> (Updated June 15, 2016, 11:34 a.m.)
> 
> 
> Review request for hive and Xuefu Zhang.
> 
> 
> Bugs: HIVE-13873
>     https://issues.apache.org/jira/browse/HIVE-13873
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> Add group projection support for Parquet. This is an initial patch to share 
> my thoughts on the approach.
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/FetchTask.java dff1815 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java 23abec3 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/TableScanOperator.java 6afe957 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapredLocalTask.java 24bf506 
>   ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java cfedf35 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ProjectionPusher.java db923fa 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/HiveStructConverter.java a89aa4d 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/DataWritableReadSupport.java 3e38cc7 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetRecordReaderWrapper.java 74a1a82 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/ColumnPrunerProcCtx.java 611a6b7 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/ColumnPrunerProcFactory.java a2a7f00 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/TableScanDesc.java 8cf261d 
>   ql/src/test/queries/clientpositive/parquet_struct.q PRE-CREATION 
>   ql/src/test/results/clientpositive/parquet_struct.q.out PRE-CREATION 
>   serde/src/java/org/apache/hadoop/hive/serde2/ColumnProjectionUtils.java 0c7ac30 
> 
> Diff: https://reviews.apache.org/r/48716/diff/
> 
> 
> Testing
> -------
> 
> Newly added qtest passed.
> 
> 
> Thanks,
> 
> cheng xu
> 
>
