2017-08-24 9:42 GMT+08:00 panfei :
by decreasing mapreduce.reduce.shuffle.parallelcopies from 20 to 5, it
seems that everything goes well, no OOM ~~
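
If anyone else hits the same shuffle OOM, the change above can be tried per
session instead of cluster-wide. A minimal sketch from a Hive session,
assuming Hive-on-MR as in this thread (5 is simply the value that worked
here, not a recommended default):

    -- lower the number of parallel shuffle fetchers for this session only;
    -- fewer concurrent fetchers means less map output held in reducer
    -- memory at the same time
    SET mapreduce.reduce.shuffle.parallelcopies=5;
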
2017-08-23 17:19 GMT+08:00 panfei :
> The full error stack (which is described here:
> https://issues.apache.org/jira/browse/MAPREDUCE-6108) is:
>
> this error can n
> ] org.apache.hadoop.mapred.Task: Runnning cleanup for the task
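
One common reading of this kind of shuffle OOM is that every parallel
fetcher can hold a map output in reducer memory while copying, so many
fetchers reserving space at once can over-commit the in-memory shuffle
buffer. A quick way to see the budget a job runs with, from a Hive session
(Hadoop 2.x property names; the heap figure in the comment is a hypothetical
example, not a value from this thread):

    -- SET with no value prints the current setting in the Hive CLI / Beeline
    SET mapreduce.reduce.shuffle.input.buffer.percent;   -- share of reducer heap buffering map outputs (default 0.70)
    SET mapreduce.reduce.shuffle.memory.limit.percent;   -- cap on one in-memory fetch, as a share of that buffer (default 0.25)
    -- with a hypothetical 4 GB reducer heap and the defaults above, the
    -- shuffle buffer is roughly 2.8 GB and one in-memory fetch may take
    -- roughly 0.7 GB, so 20 fetchers reserving at once can demand several
    -- times the buffer, while 5 keep the worst case close to it
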
2017-08-23 13:10 GMT+08:00 panfei :
Hi Gopal, thanks for all the information and suggestions.

The Hive version is 2.0.1 and we use Hive-on-MR as the execution engine.

I think I should create an intermediate table which includes all the
dimensions (including the several kinds of ids), and then use spark-sql to
calculate the distinct values.
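
A minimal sketch of that plan, with hypothetical table and column names
(fact_events, dt, user_id, device_id and order_id are placeholders, not
names from this thread): materialize just the needed dimensions once in
Hive, then let spark-sql compute the distinct counts from the narrow table.

    -- Hive: materialize only the columns the distinct counts need
    CREATE TABLE tmp_dims STORED AS ORC AS
    SELECT dt, user_id, device_id, order_id
    FROM fact_events;

    -- spark-sql: compute all distinct counts from the intermediate table
    SELECT dt,
           COUNT(DISTINCT user_id)   AS distinct_users,
           COUNT(DISTINCT device_id) AS distinct_devices,
           COUNT(DISTINCT order_id)  AS distinct_orders
    FROM tmp_dims
    GROUP BY dt;
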
-- Forwarded message --
From: panfei
Date: 2017-08-23 12:26 GMT+08:00
Subject: Fwd: How to optimize multiple count( distinct col) in Hive SQL
To: hive-...@hadoop.apache.org
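
On the question in the subject line itself, one widely used Hive-side
rewrite (shown with the same hypothetical names as above) is to deduplicate
in a subquery with GROUP BY first, so a distinct count is not funneled
through a single COUNT(DISTINCT) reducer:

    -- deduplicate with GROUP BY (spread across many reducers), then count
    -- the deduplicated rows; repeat per column that needs a distinct count
    SELECT COUNT(*) AS distinct_users
    FROM (
      SELECT user_id
      FROM fact_events
      GROUP BY user_id
    ) dedup;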