Doing:

> set hive.auto.convert.join.noconditionaltask=false;

makes it work (though it does way more map reduce jobs than it should).  When I 
get some time I will test against the latest trunk.

Thanks,
Nate


On Sep 3, 2013, at 6:09 PM, Yin Huai <huaiyin....@gmail.com> wrote:

> Based on the log, it may be also related to 
> https://issues.apache.org/jira/browse/HIVE-4927. To make it work (in a not 
> very optimized way), can you try "set 
> hive.auto.convert.join.noconditionaltask=false;" ? If you still get the 
> error, give "set hive.auto.convert.join=false;" a try (it will turn off map 
> join auto convert, so you will use reduce-side join). 
> 
> Thanks,
> 
> Yin
> 
> 
> On Tue, Sep 3, 2013 at 6:03 PM, Ashutosh Chauhan <hashut...@apache.org> wrote:
> Not sure about EMR. Your best bet is to ask on EMR forums.
> 
> Thanks,
> Ashutosh
> 
> 
> On Tue, Sep 3, 2013 at 2:18 PM, Nathanial Thelen <n...@natethelen.com> wrote:
> Is there a way to run a patch on EMR?
> 
> Thanks,
> Nate
> 
> On Sep 3, 2013, at 2:14 PM, Ashutosh Chauhan <hashut...@apache.org> wrote:
> 
>> Fix in very related area has been checked in trunk today : 
>> https://issues.apache.org/jira/browse/HIVE-5129 Likely that will fix your 
>> issue. 
>> Can you try latest trunk?
>> 
>> Ashutosh
>> 
>> 
>> On Tue, Sep 3, 2013 at 2:03 PM, Nathanial Thelen <n...@natethelen.com> wrote:
>> I am running Hive in EMR and since upgrading to 0.11 from 0.8.1.8 I have 
>> been getting NullPointerExceptions (NPE) for certain queries in our staging 
>> environment.  Only difference between stage and production is the amount of 
>> traffic we get so the data set is much smaller.  We are not using any custom 
>> code.
>> 
>> I have greatly simplified the query down to the bare minimum that will cause 
>> the error:
>> 
>> SELECT
>>     count(DISTINCT ag.adGroupGuid) as groups,
>>     count(DISTINCT av.adViewGuid) as ads,
>>     count(DISTINCT ac.adViewGuid) as uniqueClicks
>> FROM
>>     adgroup ag
>>     INNER JOIN adview av ON av.adGroupGuid = ag.adGroupGuid
>>     LEFT OUTER JOIN adclick ac ON ac.adViewGuid = av.adViewGuid
>> 
>> This will return the following before any Map Reduce jobs start:
>> 
>> FAILED: NullPointerException null
>> 
>> Looking in the hive log at /mnt/var/log/apps/hive_0110.log and scanning, I 
>> see this error:
>> 
>> 2013-09-03 18:09:19,796 INFO  org.apache.hadoop.hive.ql.exec.Utilities 
>> (Utilities.java:getInputSummary(1889)) - Cache Content Summary for 
>> s3://{ourS3Bucket}/hive/data/stage/adgroup/year=2013/month=08/day=29 length: 
>> 94324 file count: 20 directory count: 1
>> 2013-09-03 18:09:19,796 INFO  org.apache.hadoop.hive.ql.exec.Utilities 
>> (Utilities.java:getInputSummary(1889)) - Cache Content Summary for 
>> s3://{ourS3Bucket}/hive/data/stage/adview/year=2013/month=08/day=30 length: 
>> 142609 file count: 21 directory count: 1
>> 2013-09-03 18:09:19,796 INFO  org.apache.hadoop.hive.ql.exec.Utilities 
>> (Utilities.java:getInputSummary(1889)) - Cache Content Summary for 
>> s3://{ourS3Bucket}/hive/data/stage/adgroup/year=2013/month=08/day=30 length: 
>> 65519 file count: 21 directory count: 1
>> 2013-09-03 18:09:19,796 INFO  org.apache.hadoop.hive.ql.exec.Utilities 
>> (Utilities.java:getInputSummary(1889)) - Cache Content Summary for 
>> s3://{ourS3Bucket}/hive/data/stage/adview/year=2013/month=08/day=29 length: 
>> 205096 file count: 20 directory count: 1
>> 2013-09-03 18:09:19,800 INFO  
>> org.apache.hadoop.hive.ql.optimizer.physical.MetadataOnlyOptimizer 
>> (MetadataOnlyOptimizer.java:dispatch(267)) - Looking for table scans where 
>> optimization is applicable
>> 2013-09-03 18:09:19,801 INFO  
>> org.apache.hadoop.hive.ql.optimizer.physical.MetadataOnlyOptimizer 
>> (MetadataOnlyOptimizer.java:dispatch(301)) - Found 0 metadata only table 
>> scans
>> 2013-09-03 18:09:19,801 INFO  
>> org.apache.hadoop.hive.ql.optimizer.physical.MetadataOnlyOptimizer 
>> (MetadataOnlyOptimizer.java:dispatch(267)) - Looking for table scans where 
>> optimization is applicable
>> 2013-09-03 18:09:19,801 INFO  
>> org.apache.hadoop.hive.ql.optimizer.physical.MetadataOnlyOptimizer 
>> (MetadataOnlyOptimizer.java:dispatch(301)) - Found 1 metadata only table 
>> scans
>> 2013-09-03 18:09:19,801 ERROR org.apache.hadoop.hive.ql.Driver 
>> (SessionState.java:printError(386)) - FAILED: NullPointerException null
>> java.lang.NullPointerException
>>      at 
>> org.apache.hadoop.hive.ql.optimizer.physical.MetadataOnlyOptimizer$MetadataOnlyTaskDispatcher.dispatch(MetadataOnlyOptimizer.java:308)
>>      at 
>> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:87)
>>      at 
>> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:124)
>>      at 
>> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:101)
>>      at 
>> org.apache.hadoop.hive.ql.optimizer.physical.MetadataOnlyOptimizer.resolve(MetadataOnlyOptimizer.java:175)
>>      at 
>> org.apache.hadoop.hive.ql.optimizer.physical.PhysicalOptimizer.optimize(PhysicalOptimizer.java:79)
>>      at 
>> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genMapRedTasks(SemanticAnalyzer.java:8426)
>>      at 
>> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:8789)
>>      at 
>> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:278)
>>      at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:433)
>>      at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:337)
>>      at org.apache.hadoop.hive.ql.Driver.run(Driver.java:902)
>>      at 
>> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:310)
>>      at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:231)
>>      at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:466)
>>      at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:819)
>>      at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:674)
>>      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>      at 
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>      at 
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>      at java.lang.reflect.Method.invoke(Method.java:606)
>>      at org.apache.hadoop.util.RunJar.main(RunJar.java:187)
>> 
>> The same error also happens if I do an INNER JOIN to adclick, FYI.
>> 
>> I have checked that there are not any null values for any of the columns 
>> referenced in the query. 
>> 
>> Making almost any changes to the query results it in successfully running.  
>> Here are some I have tried:
>> 
>> Removed JOIN to adgroup:
>> 
>> SELECT
>>     count(DISTINCT av.adViewGuid) as ads,
>>     count(DISTINCT ac.adViewGuid) as uniqueClicks
>> FROM
>>     adview av
>>     LEFT OUTER JOIN adclick ac ON ac.adViewGuid = av.adViewGuid;
>> 
>> WORKS:
>> 561  6
>> 
>> Removed JOIN to adclick:
>> 
>> SELECT
>>     count(DISTINCT ag.adGroupGuid) as groups,
>>     count(DISTINCT av.adViewGuid) as ads
>> FROM
>>     adgroup ag
>>     INNER JOIN adview av ON av.adGroupGuid = ag.adGroupGuid;
>> 
>> WORKS:
>> 543  561
>> 
>> Removing DISTINCT from any of the 3 counts
>> 
>> SELECT
>>     count(DISTINCT ag.adGroupGuid) as groups,
>>     count(DISTINCT av.adViewGuid) as ads,
>>     count(ac.adViewGuid) as uniqueClicks
>> FROM
>>     adgroup ag
>>     INNER JOIN adview av ON av.adGroupGuid = ag.adGroupGuid
>>     LEFT OUTER JOIN adclick ac ON ac.adViewGuid = av.adViewGuid;
>> 
>> WORKS:
>> 543  561     7
>> 
>> 
>> SELECT
>>     count(ag.adGroupGuid) as groups,
>>     count(DISTINCT av.adViewGuid) as ads,
>>     count(DISTINCT ac.adViewGuid) as uniqueClicks
>> FROM
>>     adgroup ag
>>     INNER JOIN adview av ON av.adGroupGuid = ag.adGroupGuid
>>     LEFT OUTER JOIN adclick ac ON ac.adViewGuid = av.adViewGuid;
>> 
>> WORKS:
>> 562  561     6
>> 
>> 
>> SELECT
>>     count(DISTINCT ag.adGroupGuid) as groups,
>>     count(av.adViewGuid) as ads,
>>     count(DISTINCT ac.adViewGuid) as uniqueClicks
>> FROM
>>     adgroup ag
>>     INNER JOIN adview av ON av.adGroupGuid = ag.adGroupGuid
>>     LEFT OUTER JOIN adclick ac ON ac.adViewGuid = av.adViewGuid;
>> 
>> WORKS:
>> 543  562     6 
>> 
>> I am not exactly sure what to do next.  Thoughts?
>> 
>> Nate
>> 
> 
> 
> 

Reply via email to