[ https://issues.apache.org/jira/browse/HIVE-19103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Work on HIVE-19103 started by Ashish Sharma.
--------------------------------------------
> Reading required column only in nested structure schema in ORC
> --------------------------------------------------------------
>
>                 Key: HIVE-19103
>                 URL: https://issues.apache.org/jira/browse/HIVE-19103
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Ashish Sharma
>            Assignee: Ashish Sharma
>            Priority: Major
>              Labels: pull-request-available
>
> Read only the required columns in a nested structure schema.
> Example -
> *Current state* -
> Schema - struct<a:int, b:bigint,c:struct<d:int,e:struct<f:int>,g:string>>
> Query - select c.e.f from t where c.e.f > 10;
> Current state - the entire c struct is read from the file and then
> filtered, because only "hive.io.file.readcolumn.ids" is consulted, so all
> child columns of c are selected to be read from the file.
> Conf -
> _hive.io.file.readcolumn.ids = "2"
> hive.io.file.readNestedColumn.paths = "c.e.f"_
> Result -
> boolean[ ] include = [true,false,false,true,true,true,true,true]
> *Expected state* -
> Schema - struct<a:int, b:bigint,c:struct<d:int,e:struct<f:int>,g:string>>
> Query - select c.e.f from t where c.e.f > 10;
> Expected state - instead of reading the entire c struct from the file, read
> only the f column (and its ancestors c and e) by consulting
> "hive.io.file.readNestedColumn.paths".
> Conf -
> _hive.io.file.readcolumn.ids = "2"
> hive.io.file.readNestedColumn.paths = "c.e.f"_
> Result -
> boolean[ ] include = [true,false,false,true,false,true,true,false]
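
The two include arrays in the issue can be reproduced with a small sketch. It is not Hive's or ORC's actual code: the function names (parse_schema, include_all_children, include_nested_paths) are hypothetical, and the parser handles only structs and primitive types. It assumes column ids are assigned by a pre-order walk of the type tree with the root struct as id 0, which matches the eight-element arrays shown in the issue.

```python
from dataclasses import dataclass, field


@dataclass
class TypeNode:
    name: str
    type_id: int = -1
    children: list = field(default_factory=list)


def _parse_type(s, pos, name):
    """Parse one type starting at s[pos]; return (node, position after it)."""
    if s.startswith("struct<", pos):
        pos += len("struct<")
        node = TypeNode(name)
        while True:
            colon = s.index(":", pos)
            child, pos = _parse_type(s, colon + 1, s[pos:colon])
            node.children.append(child)
            if s[pos] == ",":
                pos += 1          # another field follows
            elif s[pos] == ">":
                return node, pos + 1
            else:
                raise ValueError(f"unexpected character at {pos}: {s[pos]!r}")
    # Primitive type: consume up to the next ',' or '>'.
    end = pos
    while end < len(s) and s[end] not in ",>":
        end += 1
    return TypeNode(name), end


def _assign_ids(node, next_id=0):
    """Number nodes in pre-order; return the next unused id."""
    node.type_id = next_id
    next_id += 1
    for child in node.children:
        next_id = _assign_ids(child, next_id)
    return next_id


def parse_schema(schema):
    """Return (root, node_count) for a struct<...> schema string."""
    root, _ = _parse_type(schema.replace(" ", ""), 0, "<root>")
    return root, _assign_ids(root)


def _mark_subtree(node, include):
    include[node.type_id] = True
    for child in node.children:
        _mark_subtree(child, include)


def include_all_children(root, size, column_ids):
    """Current state: each selected top-level column pulls in its whole subtree."""
    include = [False] * size
    include[root.type_id] = True
    for idx in column_ids:
        _mark_subtree(root.children[idx], include)
    return include


def include_nested_paths(root, size, paths):
    """Expected state: mark only the nodes along each dotted path."""
    include = [False] * size
    include[root.type_id] = True
    for path in paths:
        node = root
        for part in path.split("."):
            node = next(c for c in node.children if c.name == part)
            include[node.type_id] = True
        # If the path ends on a struct, everything below it is still needed.
        _mark_subtree(node, include)
    return include


schema = "struct<a:int, b:bigint,c:struct<d:int,e:struct<f:int>,g:string>>"
root, size = parse_schema(schema)
print(include_all_children(root, size, [2]))
print(include_nested_paths(root, size, ["c.e.f"]))
```

With ids assigned as root=0, a=1, b=2, c=3, d=4, e=5, f=6, g=7, the first call marks c and all of its descendants, while the second marks only c, e, and f, so d (id 4) and g (id 7) are skipped.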