[ https://issues.apache.org/jira/browse/HIVE-13873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15528942#comment-15528942 ]
ASF GitHub Bot commented on HIVE-13873:
---------------------------------------

GitHub user winningsix opened a pull request:

    https://github.com/apache/hive/pull/105

    HIVE-13873 Column pruning for nested fields

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/winningsix/hive HIVE-13873

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/hive/pull/105.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #105

----
commit ea462c256f773410c7023dcbfbe365c7cc8200b6
Author: Ferdinand Xu <cheng.a...@intel.com>
Date:   2016-09-28T01:15:51Z

    HIVE-13873 Column pruning for nested fields

----

> Column pruning for nested fields
> --------------------------------
>
>                 Key: HIVE-13873
>                 URL: https://issues.apache.org/jira/browse/HIVE-13873
>             Project: Hive
>          Issue Type: New Feature
>          Components: Logical Optimizer
>            Reporter: Xuefu Zhang
>            Assignee: Ferdinand Xu
>         Attachments: HIVE-13873.wip.patch
>
>
> Some columnar file formats, such as Parquet, also store the fields of a
> struct column by column, using the encoding described in the Google Dremel
> paper. It is very common in big data for data to be stored in structs while
> queries need only a subset of the fields in those structs. However, Hive
> presently still reads the whole struct regardless of whether all of its
> fields are selected. Therefore, pruning unwanted sub-fields of structs
> (nested fields) at file-reading time would be a big performance boost for
> such scenarios.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
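
For illustration only, the nested-field pruning described in the issue can be sketched as trimming a read schema down to the paths a query references. This is not the code in the pull request and does not use any Hive or Parquet API; the dict-based schema, the dotted-path convention, and the name `prune_schema` are all assumptions made for this sketch.

```python
from typing import Any, Dict, Iterable, List

# Hypothetical schema model: a leaf maps to a type name (str),
# a struct maps to a nested dict of its fields.
Schema = Dict[str, Any]


def prune_schema(schema: Schema, required_paths: Iterable[str]) -> Schema:
    """Return a copy of `schema` that keeps only the requested fields.

    A path such as "address.city" selects a single sub-field of a struct;
    a path such as "address" selects the whole struct. Unknown names are
    ignored.
    """
    # Group the remaining path suffixes by their top-level field name.
    wanted: Dict[str, List[str]] = {}
    for path in required_paths:
        head, _, rest = path.partition(".")
        wanted.setdefault(head, []).append(rest)

    pruned: Schema = {}
    for name, field in schema.items():  # keeps the original field order
        if name not in wanted:
            continue  # column is never referenced: skip it entirely
        suffixes = wanted[name]
        if isinstance(field, dict) and all(suffixes):
            # Only sub-fields were requested: recurse into the struct.
            pruned[name] = prune_schema(field, suffixes)
        else:
            # Leaf column, or the whole struct was requested.
            pruned[name] = field
    return pruned


table_schema = {
    "id": "bigint",
    "address": {"street": "string", "city": "string", "zip": "string"},
    "profile": {"name": {"first": "string", "last": "string"}, "age": "int"},
}

# Paths a query such as
#   SELECT id, address.city, profile.name.last FROM t
# would need to read.
read_schema = prune_schema(
    table_schema, ["id", "address.city", "profile.name.last"]
)
```

A columnar reader handed `read_schema` instead of `table_schema` would only need to touch the column chunks for `id`, `address.city`, and `profile.name.last`, which is the saving the issue is after.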