Re: SARG predicate is ignored when query ORC table

2016-02-29 Thread Jie Zhang
Hi Prasanth, Thanks for sharing the insight. "Before Hive 0.14 the default stripe size of ORC was 256MB and the HDFS block size is calculated based on Math.min(2*stripe_size, 1.5GB). So typically the block size is 512MB. When the entire file is less than a block it is not beneficial to read the footer to e
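
The block-size arithmetic quoted in this message can be checked with a short sketch. This is only an illustration of the quoted rule, not Hive or ORC code; the function name `orc_block_size` and the byte constants are made up for the example.

```python
MB = 1024 * 1024
GB = 1024 * MB


def orc_block_size(stripe_size_bytes):
    """Block size per the rule quoted in the thread: min(2 * stripe_size, 1.5GB).

    Hypothetical helper for illustration only; not part of any Hive API.
    """
    return min(2 * stripe_size_bytes, int(1.5 * GB))


# With the pre-0.14 default stripe size of 256MB mentioned in the thread,
# the rule gives a 512MB block.
block = orc_block_size(256 * MB)
print(block // MB)
```

Under this rule, each of the "less than 256MB" files described in the original post would fit inside a single 512MB block.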

Re: SARG predicate is ignored when query ORC table

2016-02-29 Thread Mich Talebzadeh
Hi Prasanth, I am using Hive 2 and notice that file elimination happens when the table is bucketed and the SARG happens to be part of that bucket. In that case the optimizer goes to the correct bucket. Otherwise it seems to be a full table scan, meaning that every file of the table is checked. It wou

Re: SARG predicate is ignored when query ORC table

2016-02-28 Thread Prasanth Jayachandran
Hi, Please find answers inline. On Feb 28, 2016, at 2:50 AM, Mich Talebzadeh wrote: Hi Jessica, Interesting. The ORC files are laid out in stripes that are specified by orc.stripe.size (default 64MB). Within each stripe you have row groups of 10K rows that

Re: SARG predicate is ignored when query ORC table

2016-02-28 Thread Mich Talebzadeh
Hi Jessica, Interesting. The ORC files are laid out in stripes that are specified by *orc.stripe.size* (default 64MB). Within each stripe you have row groups of 10K rows that keep statistics for both data and index. Your query should perform a SARG pushdown that limits which rows are required for
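
The idea of using per-row-group statistics to skip rows can be sketched as follows. This is a simplified illustration, assuming each row group stores a minimum and maximum for the filtered column; it is not the ORC reader's actual code, and `RowGroupStats` and `row_groups_to_read` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class RowGroupStats:
    """Assumed per-row-group statistics for one column (illustrative only)."""
    min_value: int
    max_value: int


def row_groups_to_read(stats, lo, hi):
    """Return indexes of row groups whose [min, max] range overlaps [lo, hi].

    Row groups that cannot contain a matching value are skipped.
    """
    return [
        i for i, s in enumerate(stats)
        if s.max_value >= lo and s.min_value <= hi
    ]


# Three row groups of 10K sequential ids each, as an example layout.
stats = [
    RowGroupStats(0, 9999),
    RowGroupStats(10000, 19999),
    RowGroupStats(20000, 29999),
]

# A predicate such as "id BETWEEN 12000 AND 12500" only touches one row group.
selected = row_groups_to_read(stats, 12000, 12500)
print(selected)
```

Pruning of this kind only helps when the column values are clustered; if every row group spans the full value range, no row group can be skipped.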

Re: SARG predicate is ignored when query ORC table

2016-02-27 Thread Jie Zhang
Hi Mich, Thanks for the reply. We don't set any TBLPROPERTIES when creating the table. Here is the TBLPROPERTIES part from SHOW CREATE TABLE: STORED AS ORC TBLPROPERTIES ('transient_lastDdlTime'='1455765074') Jessica On Sat, Feb 27, 2016 at 11:15 AM, Mich Talebzadeh wrote: > Hi, > > Can you do

Re: SARG predicate is ignored when query ORC table

2016-02-27 Thread Mich Talebzadeh
Hi, Can you run SHOW CREATE TABLE on your external table and send the section from STORED AS ORC TBLPROPERTIES ( onwards, please? HTH Dr Mich Talebzadeh LinkedIn https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

SARG predicate is ignored when query ORC table

2016-02-27 Thread Jie Zhang
Hi, We have an external ORC table which includes ~200 relatively small ORC files (less than 256MB). When querying the table with a selective SARG predicate (EXPLAIN shows the predicate qualifies for pushdown), we expect only a few splits to be generated, with pruning based on the predicate condition, and only a few