subject:"Duplicate rows when using group by in subquery"

Re: Duplicate rows when using group by in subquery

2013-09-19 Thread Yin Huai

e, Sep 17, 2013 at 2:24 AM, Mikael Öhman wrote:
> Thank you for the information. Just to be clear, it is not that I have manually restricted the job to run using only a single MapReduce job, but it incorrectly assumes one job is enough?
>
> I will get back with results fr

Re: Duplicate rows when using group by in subquery

2013-09-19 Thread Mikael Öhman

Just built from source this morning, so it seems strange that the bug would still persist :(

From: Yin Huai
To: user@hive.apache.org; Mikael Öhman
Sent: Tuesday, 17 September 2013 15:30
Subject: Re: Duplicate rows when using group by in sub

Re: Duplicate rows when using group by in subquery

2013-09-17 Thread Yin Huai

Monday, 16 September 2013 19:52
> *Subject:* Re: Duplicate rows when using group by in subquery
>
> Hello Mikael,
>
> Seems your case is related to the bug reported in https://issues.apache.org/jira/browse/HIVE-5149. Basically, when Hive uses a single MapReduce job to evalu

Re: Duplicate rows when using group by in subquery

2013-09-16 Thread Mikael Öhman

lable until Thursday.

Sincerely,
Mikael

From: Yin Huai
To: user@hive.apache.org; Mikael Öhman
Sent: Monday, 16 September 2013 19:52
Subject: Re: Duplicate rows when using group by in subquery

Hello Mikael,

Seems your case is related to the bug rep

Re: Duplicate rows when using group by in subquery

2013-09-16 Thread Yin Huai

Hello Mikael,

It seems your case is related to the bug reported in https://issues.apache.org/jira/browse/HIVE-5149. Basically, when Hive uses a single MapReduce job to evaluate your query, "c.Symbol" and "c.catid" are used to partition the data, and thus rows with the same value of "c.Symbol" are not
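The partitioning point in the message above can be illustrated with a toy simulation. This is only a sketch, not Hive's actual code: the partitioner `toy_partition`, the helper `grouped_symbols`, the two-reducer setup, and the sample rows are all hypothetical. It shows that when rows are partitioned on two columns (symbol and catid) but the grouping is on symbol alone, rows sharing a symbol can land on different reducers, and each reducer then emits that symbol once.

```python
from collections import defaultdict


def toy_partition(key, num_reducers):
    """Hypothetical deterministic partitioner (NOT Hive's hash function)."""
    text = "|".join(str(part) for part in key)
    return sum(ord(ch) for ch in text) % num_reducers


def grouped_symbols(rows, partition_cols, num_reducers=2):
    """Simulate a per-reducer GROUP BY symbol after partitioning on partition_cols."""
    reducers = defaultdict(list)
    for row in rows:
        key = tuple(row[col] for col in partition_cols)
        reducers[toy_partition(key, num_reducers)].append(row)

    emitted = []
    for reducer_rows in reducers.values():
        # Each reducer only deduplicates the symbols it sees locally.
        emitted.extend({row["symbol"] for row in reducer_rows})
    return sorted(emitted)


# Made-up sample data, chosen so the toy partitioner splits symbol "AAA".
rows = [
    {"symbol": "AAA", "catid": 1},
    {"symbol": "AAA", "catid": 2},
    {"symbol": "BBB", "catid": 1},
]

# Partitioning on (symbol, catid): "AAA" reaches both reducers and appears twice.
print(grouped_symbols(rows, ["symbol", "catid"]))  # ['AAA', 'AAA', 'BBB']

# Partitioning on symbol only: every symbol is handled by exactly one reducer.
print(grouped_symbols(rows, ["symbol"]))  # ['AAA', 'BBB']
```

In this toy model the duplicates disappear as soon as the partitioning key matches the grouping key, which is the kind of mismatch the bug report describes.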

Duplicate rows when using group by in subquery

2013-09-16 Thread Mikael Öhman

Hello.

This is basically the same question I posted on Stack Overflow: http://stackoverflow.com/questions/18812390/hive-subquery-and-group-by/18818115?noredirect=1#18818115

I know the query is a bit noisy. But this query also demonstrates the error:

select a.symbol from (select symbol, ordertype


6 matches
