Is hive 0.13 index working fine on partition tables?

Jim Green Mon, 29 Jun 2015 14:36:11 -0700

Hi Team,

On Hive 0.13, I have a minimal reproduction of an issue with an index on a partitioned table:
CREATE TABLE test_partition_index(
id1 bigint,
id2 bigint,
id3 bigint)
PARTITIONED BY (
dt string)
row format delimited fields terminated by ',';


cat sampledata
111,222,333

LOAD DATA LOCAL INPATH 'sampledata' OVERWRITE INTO TABLE
test_partition_index PARTITION (dt='20150101');
LOAD DATA LOCAL INPATH 'sampledata' OVERWRITE INTO TABLE
test_partition_index PARTITION (dt='20150102');

CREATE INDEX test_partition_index_idx ON TABLE test_partition_index (id1)
AS 'COMPACT' WITH DEFERRED REBUILD;
ALTER INDEX test_partition_index_idx ON test_partition_index REBUILD;
set hive.optimize.index.filter=true;
set hive.optimize.index.filter.compact.minsize=1;
select * from test_partition_index where dt in (20150101) and id1=111 ;

The error is:
Number of reduce tasks is set to 0 since there's no reduce operator
java.io.IOException: cannot find dir = xxx:/user/hive/warehouse/test_partition_index/dt=20150102/sampledata in pathToPartitionInfo: [xxx:/user/hive/warehouse/test_partition_index/dt=20150101]
at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getPartitionDescFromPathRecursively(HiveFileFormatUtils.java:344)
at org.apache.hadoop.hive.ql.index.HiveIndexedInputFormat.doGetSplits(HiveIndexedInputFormat.java:81)
at org.apache.hadoop.hive.ql.index.HiveIndexedInputFormat.getSplits(HiveIndexedInputFormat.java:149)
at org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:520)
at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:512)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:394)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1566)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1282)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:562)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1566)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:557)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:548)
at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:420)
at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:135)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1508)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1275)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1093)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:916)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:906)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:268)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:423)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:792)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:686)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
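
Reading the error, the index lookup appears to hand back a file under dt=20150102 even though the query only selects dt=20150101. A rough way to check that, sketched under two assumptions that are not confirmed above: that the index table uses Hive's default name (default__test_partition_index_test_partition_index_idx__), and that it exposes the usual compact-index column _bucketname. Adjust both to whatever SHOW INDEX reports.

```sql
-- List the index and the name of its backing index table.
SHOW FORMATTED INDEX ON test_partition_index;

-- ASSUMPTION: default index table name and compact-index column names.
-- Shows which data files the index maps id1=111 to, per partition.
SELECT id1, `_bucketname`, dt
FROM default__test_partition_index_test_partition_index_idx__
WHERE id1 = 111;

-- Rerun the failing query with index filtering off, to see whether
-- the index-based split generation is what triggers the exception.
set hive.optimize.index.filter=false;
select * from test_partition_index where dt in (20150101) and id1=111 ;
```

If the index table lists files from both partitions for id1=111 and the last query succeeds, that would point at the index file list not being pruned to the selected partition, but this is only a guess from the trace.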

Is this issue fixed in the latest version of Hive?
If so, which JIRA is related?
Thanks.

-- 
Thanks,
www.openkb.info
(Open KnowledgeBase for Hadoop/Database/OS/Network/Tool)
