Hi,

That's expected behaviour, since you are comparing a Timestamp to a string.
Timestamp >= String is being skipped because the SARGs need to be the same
type to offer non-equality comparisons accurately.

https://issues.apache.org/jira/browse/HIVE-10286

I logged the bug after I hit bugs with PPD for that case when using ORC
APIs from outside Hive (i.e. "1" < "9" and "11" < "9"). That was a mistake
anyone could've made while hand-creating SARGs, but I wanted to make it
better for the next person who might miss it, and bail out without PPD
when the arguments don't match PredicateLeaf.Type.

You can try the same with something where Hive does the right thing with a
Filter expression:

hive> create temporary table xx(x int) stored as orc;
hive> insert into xx values (1),(9),(11);
hive> select * from xx where x > '9';

Cheers,
Gopal

On 6/1/15, 7:21 PM, "Alexander Pivovarov" <apivova...@gmail.com> wrote:

>if hive.optimize.index.filter is enabled then it causes the following
>stacktraces
>
>------------------------------------------------------
>create table ts (ts timestamp);
>insert into table ts values('2015-01-01 00:00:00');
>
>set hive.optimize.index.filter=true;
>select * from ts where ts >= '2015-01-01 00:00:00';
>------------------------------------------------------
>
>
>-- HIVE-1.3.0 ----------------------------------------------------
>OK
>15/06/01 19:07:08 [main]: INFO ql.Driver: OK
>15/06/01 19:07:08 [main]: INFO log.PerfLogger: <PERFLOG method=releaseLocks from=org.apache.hadoop.hive.ql.Driver>
>15/06/01 19:07:08 [main]: INFO log.PerfLogger: </PERFLOG method=releaseLocks start=1433210828865 end=1433210828865 duration=0 from=org.apache.hadoop.hive.ql.Driver>
>15/06/01 19:07:08 [main]: INFO log.PerfLogger: </PERFLOG method=Driver.run start=1433210828758 end=1433210828865 duration=107 from=org.apache.hadoop.hive.ql.Driver>
>15/06/01 19:07:08 [main]: INFO log.PerfLogger: <PERFLOG method=OrcGetSplits from=org.apache.hadoop.hive.ql.io.orc.ReaderImpl>
>15/06/01 19:07:08 [main]: INFO orc.OrcInputFormat: FooterCacheHitRatio: 0/0
>15/06/01 19:07:08 [main]: INFO log.PerfLogger: </PERFLOG method=OrcGetSplits start=1433210828870 end=1433210828876 duration=6 from=org.apache.hadoop.hive.ql.io.orc.ReaderImpl>
>15/06/01 19:07:08 [main]: INFO orc.OrcInputFormat: ORC pushdown predicate: leaf-0 = (LESS_THAN ts 2015-01-01 00:00:00)
>expr = (not leaf-0)
>15/06/01 19:07:08 [main]: INFO orc.OrcRawRecordMerger: min key = null, max key = null
>15/06/01 19:07:08 [main]: INFO orc.ReaderImpl: Reading ORC rows from hdfs://localhost/apps/apivovarov/warehouse/ts/000000_0 with {include: [true, true], offset: 0, length: 9223372036854775807, sarg: leaf-0 = (LESS_THAN ts 2015-01-01 00:00:00)
>expr = (not leaf-0), columns: ['null', 'ts']}
>15/06/01 19:07:08 [main]: WARN orc.RecordReaderImpl: Exception when evaluating predicate. Skipping ORC PPD. Exception: java.lang.IllegalArgumentException: ORC SARGS could not convert from String to TIMESTAMP
>    at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.getBaseObjectForComparison(RecordReaderImpl.java:659)
>    at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.evaluatePredicateRange(RecordReaderImpl.java:373)
>    at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.evaluatePredicateProto(RecordReaderImpl.java:338)
>    at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$SargApplier.pickRowGroups(RecordReaderImpl.java:711)
>    at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.pickRowGroups(RecordReaderImpl.java:752)
>    at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.readStripe(RecordReaderImpl.java:778)
>    at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.advanceStripe(RecordReaderImpl.java:987)
>    at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.advanceToNextRow(RecordReaderImpl.java:1020)
>    at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.<init>(RecordReaderImpl.java:205)
>    at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.rowsOptions(ReaderImpl.java:539)
>    at org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger$ReaderPair.<init>(OrcRawRecordMerger.java:183)
>    at org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger$OriginalReaderPair.<init>(OrcRawRecordMerger.java:226)
>    at org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger.<init>(OrcRawRecordMerger.java:437)
>    at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getReader(OrcInputFormat.java:1219)
>    at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getRecordReader(OrcInputFormat.java:1117)
>    at org.apache.hadoop.hive.ql.exec.FetchOperator$FetchInputFormatSplit.getRecordReader(FetchOperator.java:673)
>    at org.apache.hadoop.hive.ql.exec.FetchOperator.getRecordReader(FetchOperator.java:323)
>    at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:445)
>    at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:414)
>    at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:140)
>    at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1671)
>    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233)
>    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
>    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
>    at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736)
>    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
>    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
>    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>    at java.lang.reflect.Method.invoke(Method.java:606)
>    at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
>    at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
>
>15/06/01 19:07:08 [main]: INFO orc.OrcInputFormat: ORC pushdown predicate: leaf-0 = (LESS_THAN ts 2015-01-01 00:00:00)
>expr = (not leaf-0)
>15/06/01 19:07:08 [main]: INFO orc.OrcRawRecordMerger: min key = null, max key = null
>15/06/01 19:07:08 [main]: INFO orc.ReaderImpl: Reading ORC rows from hdfs://localhost/apps/apivovarov/warehouse/ts/000000_0_copy_1 with {include: [true, true], offset: 0, length: 9223372036854775807, sarg: leaf-0 = (LESS_THAN ts 2015-01-01 00:00:00)
>expr = (not leaf-0), columns: ['null', 'ts']}
>15/06/01 19:07:08 [main]: WARN orc.RecordReaderImpl: Exception when evaluating predicate. Skipping ORC PPD. Exception: java.lang.IllegalArgumentException: ORC SARGS could not convert from String to TIMESTAMP
>    at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.getBaseObjectForComparison(RecordReaderImpl.java:659)
>    at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.evaluatePredicateRange(RecordReaderImpl.java:373)
>    at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.evaluatePredicateProto(RecordReaderImpl.java:338)
>    at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$SargApplier.pickRowGroups(RecordReaderImpl.java:711)
>    at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.pickRowGroups(RecordReaderImpl.java:752)
>    at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.readStripe(RecordReaderImpl.java:778)
>    at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.advanceStripe(RecordReaderImpl.java:987)
>    at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.advanceToNextRow(RecordReaderImpl.java:1020)
>    at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.<init>(RecordReaderImpl.java:205)
>    at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.rowsOptions(ReaderImpl.java:539)
>    at org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger$ReaderPair.<init>(OrcRawRecordMerger.java:183)
>    at org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger$OriginalReaderPair.<init>(OrcRawRecordMerger.java:226)
>    at org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger.<init>(OrcRawRecordMerger.java:437)
>    at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getReader(OrcInputFormat.java:1219)
>    at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getRecordReader(OrcInputFormat.java:1117)
>    at org.apache.hadoop.hive.ql.exec.FetchOperator$FetchInputFormatSplit.getRecordReader(FetchOperator.java:673)
>    at org.apache.hadoop.hive.ql.exec.FetchOperator.getRecordReader(FetchOperator.java:323)
>    at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:445)
>    at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:414)
>    at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:140)
>    at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1671)
>    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233)
>    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
>    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
>    at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736)
>    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
>    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
>    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>    at java.lang.reflect.Method.invoke(Method.java:606)
>    at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
>    at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
>
>2015-01-01 00:00:00
>15/06/01 19:07:08 [main]: INFO exec.TableScanOperator: 0 finished. closing...
>15/06/01 19:07:08 [main]: INFO exec.FilterOperator: 4 finished. closing...
>15/06/01 19:07:08 [main]: INFO exec.SelectOperator: 2 finished. closing...
>15/06/01 19:07:08 [main]: INFO exec.ListSinkOperator: 5 finished. closing...
>15/06/01 19:07:08 [main]: INFO exec.ListSinkOperator: 5 Close done
>15/06/01 19:07:08 [main]: INFO exec.SelectOperator: 2 Close done
>15/06/01 19:07:08 [main]: INFO exec.FilterOperator: 4 Close done
>15/06/01 19:07:08 [main]: INFO exec.TableScanOperator: 0 Close done
>Time taken: 0.108 seconds, Fetched: 1 row(s)
>
>
>-- HIVE-0.14 --------------------------------------------------------------------------------
>hive.optimize.index.filter is enabled by default in some hive distros (e.g. HDP 2.2.4 (hive-0.14))
>
>hive-0.14 exits with the following error:
>
>15/06/01 19:16:18 [main]: INFO orc.ReaderImpl: Reading ORC rows from hdfs://localhost/apps/apivovarov/warehouse/ts/000000_0 with {include: [true, true], offset: 0, length: 9223372036854775807, sarg: leaf-0 = (LESS_THAN ts 2015-01-01 00:00:00)
>expr = (not leaf-0), columns: ['null', 'ts']}
>Failed with exception java.io.IOException:java.lang.ClassCastException: java.sql.Timestamp cannot be cast to java.lang.String
>15/06/01 19:16:18 [main]: ERROR CliDriver: Failed with exception java.io.IOException:java.lang.ClassCastException: java.sql.Timestamp cannot be cast to java.lang.String
>java.io.IOException: java.lang.ClassCastException: java.sql.Timestamp cannot be cast to java.lang.String
>    at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:663)
>    at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:561)
>    at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:138)
>    at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1623)
>    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:267)
>    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:199)
>    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:410)
>    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:345)
>    at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:443)
>    at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:459)
>    at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:739)
>    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:677)
>    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:616)
>    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>    at java.lang.reflect.Method.invoke(Method.java:606)
>    at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
>    at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
>Caused by: java.lang.ClassCastException: java.sql.Timestamp cannot be cast to java.lang.String
>    at java.lang.String.compareTo(String.java:108)
>    at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.compareToRange(RecordReaderImpl.java:2341)
>    at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.evaluatePredicateRange(RecordReaderImpl.java:2475)
>    at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.evaluatePredicate(RecordReaderImpl.java:2429)
>    at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.pickRowGroups(RecordReaderImpl.java:2625)
>    at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.readStripe(RecordReaderImpl.java:2688)
>    at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.advanceStripe(RecordReaderImpl.java:3125)
>    at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.advanceToNextRow(RecordReaderImpl.java:3167)
>    at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.<init>(RecordReaderImpl.java:294)
>    at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.rowsOptions(ReaderImpl.java:534)
>    at org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger$ReaderPair.<init>(OrcRawRecordMerger.java:183)
>    at org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger$OriginalReaderPair.<init>(OrcRawRecordMerger.java:226)
>    at org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger.<init>(OrcRawRecordMerger.java:437)
>    at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getReader(OrcInputFormat.java:1141)
>    at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getRecordReader(OrcInputFormat.java:1039)
>    at org.apache.hadoop.hive.ql.exec.FetchOperator.getRecordReader(FetchOperator.java:498)
>    at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:588)
>    ... 18 more
>
>15/06/01 19:16:18 [main]: INFO exec.TableScanOperator: 0 finished. closing...
>15/06/01 19:16:18 [main]: INFO exec.FilterOperator: 4 finished. closing...
>15/06/01 19:16:18 [main]: INFO exec.SelectOperator: 2 finished. closing...
>15/06/01 19:16:18 [main]: INFO exec.ListSinkOperator: 5 finished. closing...
>15/06/01 19:16:18 [main]: INFO exec.ListSinkOperator: 5 Close done
>15/06/01 19:16:18 [main]: INFO exec.SelectOperator: 2 Close done
>15/06/01 19:16:18 [main]: INFO exec.FilterOperator: 4 Close done
>15/06/01 19:16:18 [main]: INFO exec.TableScanOperator: 0 Close done
>Time taken: 1.812 seconds
>15/06/01 19:16:18 [main]: INFO CliDriver: Time taken: 1.812 seconds
>15/06/01 19:16:18 [main]: INFO log.PerfLogger: <PERFLOG method=releaseLocks from=org.apache.hadoop.hive.ql.Driver>
>15/06/01 19:16:18 [main]: INFO log.PerfLogger: </PERFLOG method=releaseLocks start=1433211378347 end=1433211378347 duration=0 from=org.apache.hadoop.hive.ql.Driver>
>[apivovarov@c11 apache-hive-0.14.0-bin]$ echo $?
>1
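[Editor's note] The quoted logs show two behaviours: Hive 1.3.0 detects the String-vs-TIMESTAMP mismatch and skips PPD with a warning, while Hive 0.14 passes the Timestamp straight into String.compareTo and crashes. Both the lexicographic pitfall Gopal mentions ("1" < "9" but also "11" < "9") and the 0.14 ClassCastException can be sketched in plain Java; the class name below is made up for illustration and none of this is Hive code:

```java
import java.sql.Timestamp;

// Illustration only (not Hive code) of the two behaviours discussed above.
public class SargTypeMismatchDemo {
    public static void main(String[] args) {
        // Why mismatched SARG types must not be compared as strings:
        // lexicographically "11" sorts before "9", which is wrong for numbers.
        System.out.println("1".compareTo("9") < 0);   // true
        System.out.println("11".compareTo("9") < 0);  // true, numerically wrong

        // What Hive 0.14 effectively did: calling String.compareTo through the
        // raw Comparable bridge casts the argument to String, so passing a
        // Timestamp throws the ClassCastException seen in the quoted trace.
        Comparable literal = "2015-01-01 00:00:00";
        Object columnStat = Timestamp.valueOf("2015-01-01 00:00:00");
        try {
            literal.compareTo(columnStat);
        } catch (ClassCastException e) {
            System.out.println("ClassCastException: " + e.getMessage());
        }
    }
}
```

Hive 1.3.0's check in getBaseObjectForComparison avoids exactly this by refusing (with a WARN and no PPD) to compare values whose PredicateLeaf.Type doesn't match the column type.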