Hi,
Based on my testing, the memory cost is very different for:
1. sql("select * from ...").groupBy(...).agg(...)
2. sql("select ... from ... group by ...")
For a table partition larger than 500 GB, #2 runs fine, while #1 fails with
OutOfMemoryError. I am using the same Spark configuration for both.
Could somebody tell me why the memory cost differs so much?
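For reference, here is a minimal sketch of the two query shapes I am comparing (written against the Spark 1.5 sqlContext; the table and column names below are made up, not the real ones):

    import org.apache.spark.sql.functions.sum

    // Approach 1: select everything, then aggregate with the DataFrame API
    val df1 = sqlContext.sql("SELECT * FROM db.events")   // placeholder table
      .groupBy("user_id")                                  // placeholder columns
      .agg(sum("amount").as("total"))

    // Approach 2: express the aggregation directly in the SQL text
    val df2 = sqlContext.sql(
      "SELECT user_id, SUM(amount) AS total FROM db.events GROUP BY user_id")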
Hi all,
Please give some suggestions. Thanks.
With the following SQL, Spark SQL and Hive give different results. The query
sums a decimal(38,18) column:
select sum(column) from table;
where column is defined as decimal(38,18).
Spark version: 1.5.3
Hive version: 2.0.0
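In case it helps, a minimal way to check how Spark SQL types the aggregate (placeholder table/column names, not the real ones):

    // inspect the schema and plan Spark SQL produces for the sum
    val result = sqlContext.sql("SELECT SUM(amount) AS s FROM db.t")
    result.printSchema()   // does the result stay decimal, and with what precision/scale?
    result.explain(true)   // shows any casts inserted around the aggregation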
The output is:
Spark SQL: 6828127
Hive: 6980574.1269
It looks like Spark SQL lost precision and handled the data as int, with
values >= 0.5 rounded up to 1.
Is this a bug or some configuration issue? Please give some suggestions. Thanks.
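A small sketch of what I mean, using made-up values (Spark 1.5-era API; the real column is decimal(38,18)):

    import org.apache.spark.sql.functions.sum
    import sqlContext.implicits._

    // three made-up values; Spark 1.5 infers decimal(38,18) for Scala BigDecimal
    val df = Seq(Tuple1(BigDecimal("0.4")), Tuple1(BigDecimal("0.6")),
                 Tuple1(BigDecimal("0.7"))).toDF("v")
    df.agg(sum("v")).show()
    // correct sum: 1.7
    // if each value were first rounded to an int (0 + 1 + 1), the sum would be 2,
    // which is the kind of gap we see between Spark SQL and Hive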
type from decimal to decimal with precision.
Thanks
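For reference, the kind of explicit cast I am experimenting with (placeholder names; only an idea I am trying, not a confirmed fix):

    // force an explicit precision/scale on the column before aggregating
    val summed = sqlContext.sql(
      "SELECT SUM(CAST(amount AS DECIMAL(38,18))) AS s FROM db.t")
    summed.printSchema()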
On 2016-04-20 20:47, Ted Yu wrote:
Do you mind trying out a build from the master branch?
1.5.3 is a bit old.
Hi all,
With a large SQL command, the job failed with the following error. Please give
your suggestions on how to resolve it. Thanks.
SQL file size: 676 KB
Log:
16/04/25 10:55:00 WARN TaskSetManager: Lost task 84.0 in stage 0.0 (TID 6,
BJHC-HADOOP-HERA-17493.jd.local): java.util.concurrent.ExecutionException