Hi,
Based on my testing, the memory cost is very different for:
1. sql("select * from ...").groupBy(...).agg(...)
2. sql("select ... from ... group by ...")
For a table partition larger than 500 GB, #2 runs fine, while #1 fails with
OutOfMemoryError. I am using the same Spark configuration for both.
Could somebody tell me why the memory cost differs so much?
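For reference, here is a minimal sketch of the two query shapes I am comparing (written against the Spark 1.5 sqlContext; the table and column names below are made up, not the real ones):

    import org.apache.spark.sql.functions.sum

    // Approach 1: select everything, then aggregate with the DataFrame API
    val df1 = sqlContext.sql("SELECT * FROM db.events")   // placeholder table
      .groupBy("user_id")                                  // placeholder columns
      .agg(sum("amount").as("total"))

    // Approach 2: express the aggregation directly in the SQL text
    val df2 = sqlContext.sql(
      "SELECT user_id, SUM(amount) AS total FROM db.events GROUP BY user_id")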
Hi all,
Please give some suggestions. Thanks.
With the following SQL, Spark SQL and Hive give different results. The query
sums a decimal(38,18) column:
select sum(column) from table;
where column is defined as decimal(38,18).
Spark version: 1.5.3
Hive version: 2.0.0
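In case it helps, a minimal way to check how Spark SQL types the aggregate (placeholder table/column names, not the real ones):

    // inspect the schema and plan Spark SQL produces for the sum
    val result = sqlContext.sql("SELECT SUM(amount) AS s FROM db.t")
    result.printSchema()   // does the result stay decimal, and with what precision/scale?
    result.explain(true)   // shows any casts inserted around the aggregation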
The output is:
Spark SQL: 6828127
Hive: 6980574.1269
It looks like Spark SQL lost precision and handled the data as int, with
values >= 0.5 rounded up to 1.
Is this a bug or some configuration issue? Please give some suggestions. Thanks.
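A small sketch of what I mean, using made-up values (Spark 1.5-era API; the real column is decimal(38,18)):

    import org.apache.spark.sql.functions.sum
    import sqlContext.implicits._

    // three made-up values; Spark 1.5 infers decimal(38,18) for Scala BigDecimal
    val df = Seq(Tuple1(BigDecimal("0.4")), Tuple1(BigDecimal("0.6")),
                 Tuple1(BigDecimal("0.7"))).toDF("v")
    df.agg(sum("v")).show()
    // correct sum: 1.7
    // if each value were first rounded to an int (0 + 1 + 1), the sum would be 2,
    // which is the kind of gap we see between Spark SQL and Hive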
type from decimal to decimal with precision.
Thanks
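For reference, the kind of explicit cast I am experimenting with (placeholder names; only an idea I am trying, not a confirmed fix):

    // force an explicit precision/scale on the column before aggregating
    val summed = sqlContext.sql(
      "SELECT SUM(CAST(amount AS DECIMAL(38,18))) AS s FROM db.t")
    summed.printSchema()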
On 2016-04-20 20:47, Ted Yu wrote:
Do you mind trying out a build from the master branch?
1.5.3 is a bit old.
Hi all,
With a large SQL command, the job failed with the following error. Please give
your suggestions on how to resolve it. Thanks.
SQL file size: 676 KB
Log:
16/04/25 10:55:00 WARN TaskSetManager: Lost task 84.0 in stage 0.0 (TID 6,
BJHC-HADOOP-HERA-17493.jd.local): java.util.concurrent.ExecutionException