Skew join failure

David Morel Fri, 30 Nov 2012 02:11:27 -0800

Hi,

I am trying to solve the "last reducer hangs because of GC because of truckloads of data" issue that I have on some queries, by using SET hive.optimize.skewjoin=true; Unfortunately, every time I try this, I encounter an error of the form:

...

2012-11-30 10:42:39,181 Stage-10 map = 100%, reduce = 100%, Cumulative CPU 406984.1 sec
MapReduce Total cumulative CPU time: 4 days 17 hours 3 minutes 4 seconds 100 msec

Ended Job = job_201211281801_0463

java.io.FileNotFoundException: File hdfs://nameservice1/tmp/hive-dmorel/hive_2012-11-30_09-23-00_375_8178040921995939301/-mr-10014/hive_skew_join_bigkeys_0 does not exist.
        at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:365)
        at org.apache.hadoop.hive.ql.plan.ConditionalResolverSkewJoin.getTasks(ConditionalResolverSkewJoin.java:96)
        at org.apache.hadoop.hive.ql.exec.ConditionalTask.execute(ConditionalTask.java:81)
        at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:133)
        at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
        at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1332)
        at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1123)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:931)
...

Googling didn't give me any indication of how to debug/solve this, so I'd be glad if I could get any indication of where to start looking.


I'm using CMF4.0 currently, so Hive 0.8.1.

Thanks a lot!

David Morel
