e, Sep 17, 2013 at 2:24 AM, Mikael Öhman wrote:
>
> Thank you for the information. Just to be clear, it is not that I have
> manually restricted the job to run using only a single mapreduce job, but
> it incorrectly assumes one job is enough?
>
> I will get back with results fr
I just built from source this morning, so it seems strange that the bug would
still persist :(.
From: Yin Huai
To: user@hive.apache.org; Mikael Öhman
Sent: Tuesday, 17 September 2013 15:30
Subject: Re: Duplicate rows when using group by in sub
Monday, 16 September 2013 19:52
> *Subject:* Re: Duplicate rows when using group by in subquery
>
> Hello Mikael,
>
> Seems your case is related to the bug reported in
> https://issues.apache.org/jira/browse/HIVE-5149. Basically, when hive
> uses a single MapReduce job to evalu
lable until Thursday.
/ Sincerely Mikael
From: Yin Huai
To: user@hive.apache.org; Mikael Öhman
Sent: Monday, 16 September 2013 19:52
Subject: Re: Duplicate rows when using group by in subquery
Hello Mikael,
Seems your case is related to the bug rep
Hello Mikael,
Seems your case is related to the bug reported in
https://issues.apache.org/jira/browse/HIVE-5149. Basically, when Hive uses
a single MapReduce job to evaluate your query, "c.Symbol" and "c.catid" are
used to partition the data, and thus, rows with the same value of "c.Symbol"
are not
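
A toy sketch of the effect described above, in Python rather than Hive. It assumes the cut-off sentence means that rows sharing a symbol can end up on different reducers when the partition key also includes catid. The hash function, the row layout, and the helper names are all made up for illustration and are not Hive's actual partitioner.

```python
from collections import defaultdict


def toy_hash(key):
    # Deterministic stand-in for a partitioner hash (hypothetical, not Hive's).
    return sum(ord(ch) for part in key for ch in str(part))


def run_group_by_symbol(rows, partition_cols, num_reducers=2):
    # "Shuffle": route each row to a reducer based on the partition columns.
    reducers = defaultdict(list)
    for row in rows:
        key = tuple(row[col] for col in partition_cols)
        reducers[toy_hash(key) % num_reducers].append(row)

    # "Reduce": each reducer emits every distinct symbol it has seen,
    # without knowing what the other reducers emit.
    output = []
    for reducer_id in sorted(reducers):
        seen = []
        for row in reducers[reducer_id]:
            if row["symbol"] not in seen:
                seen.append(row["symbol"])
        output.extend(seen)
    return output


rows = [
    {"symbol": "A", "catid": 1},
    {"symbol": "A", "catid": 2},
]

# Partitioning on (symbol, catid) splits the two "A" rows across reducers,
# so the group by on symbol alone emits "A" twice.
print(run_group_by_symbol(rows, ["symbol", "catid"]))

# Partitioning on symbol only keeps them together, so "A" appears once.
print(run_group_by_symbol(rows, ["symbol"]))
```

With these two sample rows, the first call returns a duplicated symbol and the second does not, which matches the kind of duplicate rows reported in this thread.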
Hello.
This is basically the same question I posted on Stack Overflow:
http://stackoverflow.com/questions/18812390/hive-subquery-and-group-by/18818115?noredirect=1#18818115
I know the query is a bit noisy. But this query also demonstrates the error:
select a.symbol from (select symbol, ordertype