[ https://issues.apache.org/jira/browse/HIVE-19103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16425198#comment-16425198 ]
ASF GitHub Bot commented on HIVE-19103:
---------------------------------------

GitHub user ashish-kumar-sharma opened a pull request:

    https://github.com/apache/hive/pull/330

    HIVE-19103: Reading required column only in nested structure schema in ORC

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/Flipkart/hive requiredColumn

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/hive/pull/330.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #330

----
commit c7addea2d30af50bbee37665964eb60c789ed63b
Author: Aashish Kumar Sharma <aashish.s@...>
Date:   2018-04-04T08:18:03Z

    HIVE-19103: first commit

commit 386cdb6292f0459e10e0d8473dd3b4b77002e334
Author: Aashish Kumar Sharma <aashish.s@...>
Date:   2018-04-04T08:45:13Z

    HIVE-19103: second commit

----

> Reading required column only in nested structure schema in ORC
> --------------------------------------------------------------
>
>                 Key: HIVE-19103
>                 URL: https://issues.apache.org/jira/browse/HIVE-19103
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Ashish Sharma
>            Assignee: Ashish Sharma
>            Priority: Major
>              Labels: pull-request-available
>
> Read only the required columns of a nested structure schema.
> Example -
>
> *Current state* -
> Schema - struct<a:int,b:bigint,c:struct<d:int,e:struct<f:int>,g:string>>
> Query - select c.e.f from t where c.e.f > 10;
> Current state - the entire c struct is read from the file and then
> filtered, because only "hive.io.file.readcolumn.ids" is consulted, so all
> child columns of c are selected for reading.
> Conf -
> hive.io.file.readcolumn.ids = "2"
> hive.io.file.readNestedColumn.paths = "c.e.f"
> Result -
> boolean[] include = [true,false,false,true,true,true,true,true]
>
> *Expected state* -
> Schema - struct<a:int,b:bigint,c:struct<d:int,e:struct<f:int>,g:string>>
> Query - select c.e.f from t where c.e.f > 10;
> Expected state - instead of reading the entire c struct from the file,
> read only the f column by consulting
> "hive.io.file.readNestedColumn.paths".
> Conf -
> hive.io.file.readcolumn.ids = "2"
> hive.io.file.readNestedColumn.paths = "c.e.f"
> Result -
> boolean[] include = [true,false,false,true,false,true,true,false]

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
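
For readers following the two include arrays in the description, here is a minimal sketch of how they can be derived. It assumes ORC's pre-order column numbering (0 = root struct, 1 = a, 2 = b, 3 = c, 4 = d, 5 = e, 6 = f, 7 = g). The class `NestedIncludeSketch`, its `Node` type, and all method names are hypothetical and written only for illustration; this is not the code in the pull request and does not use Hive or ORC classes. It also assumes that when a path ends on a struct, the whole subtree under it is kept.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

public class NestedIncludeSketch {

    /** Hypothetical schema node; not a Hive or ORC class. */
    static final class Node {
        final String name;
        final Map<String, Node> children = new LinkedHashMap<>();
        int id; // pre-order column id, filled in by assignIds

        Node(String name, Node... kids) {
            this.name = name;
            for (Node kid : kids) {
                children.put(kid.name, kid);
            }
        }
    }

    /** Assigns pre-order ids starting at 'next'; returns the next free id. */
    static int assignIds(Node node, int next) {
        node.id = next++;
        for (Node child : node.children.values()) {
            next = assignIds(child, next);
        }
        return next;
    }

    /** Marks a node and every descendant as included. */
    static void includeSubtree(Node node, boolean[] include) {
        include[node.id] = true;
        for (Node child : node.children.values()) {
            includeSubtree(child, include);
        }
    }

    /** Current behaviour: a top-level field drags in all of its children. */
    static boolean[] includeWholeColumn(Node root, int total, String topLevel) {
        Node column = root.children.get(topLevel);
        if (column == null) {
            throw new IllegalArgumentException("Unknown column: " + topLevel);
        }
        boolean[] include = new boolean[total];
        include[root.id] = true;
        includeSubtree(column, include);
        return include;
    }

    /** Expected behaviour: only the nodes along the dotted path are kept. */
    static boolean[] includeNestedPath(Node root, int total, String path) {
        boolean[] include = new boolean[total];
        include[root.id] = true;
        Node current = root;
        for (String part : path.split("\\.")) {
            current = current.children.get(part);
            if (current == null) {
                throw new IllegalArgumentException("Unknown path: " + path);
            }
            include[current.id] = true;
        }
        // Assumption: if the path ends on a struct, keep everything below it.
        includeSubtree(current, include);
        return include;
    }

    /** struct<a:int,b:bigint,c:struct<d:int,e:struct<f:int>,g:string>> */
    static Node exampleSchema() {
        return new Node("root",
            new Node("a"),
            new Node("b"),
            new Node("c",
                new Node("d"),
                new Node("e", new Node("f")),
                new Node("g")));
    }

    public static void main(String[] args) {
        Node root = exampleSchema();
        int total = assignIds(root, 0);
        // [true, false, false, true, true, true, true, true]
        System.out.println(Arrays.toString(includeWholeColumn(root, total, "c")));
        // [true, false, false, true, false, true, true, false]
        System.out.println(Arrays.toString(includeNestedPath(root, total, "c.e.f")));
    }
}
```

The second array keeps c and e only because they are ancestors of f; their siblings d and g are skipped, which is the saving the issue asks for.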