subject:"Re\: DISTRIBUTE BY works incorrectly in Hive 0.11 in some cases"

Re: DISTRIBUTE BY works incorrectly in Hive 0.11 in some cases

2013-08-26 Thread Pala M Muthaia

Thanks for following up, Yin. We realized later that this was due to the reduce deduplication optimization, and found that turning off the flag avoids the issue.

-pala

On Mon, Aug 26, 2013 at 4:40 AM, Yin Huai wrote:
> forgot to add in my last reply To generate correct results, you can
> set hive.op

Re: DISTRIBUTE BY works incorrectly in Hive 0.11 in some cases

2013-08-26 Thread Yin Huai

Forgot to add this in my last reply: to generate correct results, you can set hive.optimize.reducededuplication to false to turn off ReduceSinkDeDuplication.

On Sun, Aug 25, 2013 at 9:35 PM, Yin Huai wrote:
> Created a jira https://issues.apache.org/jira/browse/HIVE-5149
>
> On Sun, Aug 25, 2013
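A minimal sketch of the workaround described in this message, as it might be applied in a Hive session. The property name comes from the thread. The table and column names (sample_table, col1, col2) are hypothetical, since the original sample query is not shown here; the query only illustrates the general shape under discussion (a GROUP BY on two columns followed by a DISTRIBUTE BY on the first), and EXPLAIN is used so the plan can be compared with and without the flag.

```sql
-- Workaround from this thread: turn off ReduceSinkDeDuplication for the session.
SET hive.optimize.reducededuplication=false;

-- Hypothetical query shape; table and column names are assumptions, not from the thread.
-- EXPLAIN prints the plan, so the partitioning columns of each stage can be inspected.
EXPLAIN
SELECT col1, col2, COUNT(*) AS cnt
FROM sample_table
GROUP BY col1, col2
DISTRIBUTE BY col1
SORT BY col1, col2;
```

The setting applies only to the current session, so it can be scoped to the affected queries rather than changed cluster-wide.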

Re: DISTRIBUTE BY works incorrectly in Hive 0.11 in some cases

2013-08-25 Thread Yin Huai

Created a JIRA: https://issues.apache.org/jira/browse/HIVE-5149

On Sun, Aug 25, 2013 at 9:11 PM, Yin Huai wrote:
> Seems ReduceSinkDeDuplication picked the wrong partitioning columns.
>
> On Fri, Aug 23, 2013 at 9:15 PM, Shahansad KP wrote:
>> I think the problem lies within the GROUP BY o

Re: DISTRIBUTE BY works incorrectly in Hive 0.11 in some cases

2013-08-25 Thread Yin Huai

It seems ReduceSinkDeDuplication picked the wrong partitioning columns.

On Fri, Aug 23, 2013 at 9:15 PM, Shahansad KP wrote:
> I think the problem lies within the GROUP BY operation. For this
> optimization to work, the GROUP BY's partitioning should be on column 1
> only.
>
> It won't affect th

Re: DISTRIBUTE BY works incorrectly in Hive 0.11 in some cases

2013-08-23 Thread Shahansad KP

I think the problem lies within the GROUP BY operation. For this optimization to work, the GROUP BY's partitioning should be on column 1 only. That won't affect the correctness of the GROUP BY; it can make the GROUP BY itself slower, but in this case it will speed up the overall query.

On Fri, Aug 23, 2013 at 5:55

Re: DISTRIBUTE BY works incorrectly in Hive 0.11 in some cases

2013-08-23 Thread Pala M Muthaia

I have attached the Hive 0.10 and 0.11 query plans for the sample query below, for illustration.

On Fri, Aug 23, 2013 at 5:35 PM, Pala M Muthaia wrote:
> Hi,
>
> We are using DISTRIBUTE BY with custom reducer scripts in our query
> workload.
>
> After upgrade to Hive 0.11, queries with GROUP BY/DI

6 matches
