They should be identical. Can you paste the detailed explain output for both queries?
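
In case it helps, here is a minimal PySpark sketch of one way to print the extended plans for both variants. The names print_both_plans, my_table, key and value are placeholders, and the sum aggregation is an assumption, since the message does not say which aggregate is used. sql_context is whatever SQLContext or HiveContext you already have.

```python
def print_both_plans(sql_context, table="my_table", key="key", value="value"):
    """Print extended plans for the DataFrame-API and pure-SQL aggregations.

    sql_context: an existing SQLContext or HiveContext.
    table, key, value: hypothetical names, not taken from the original message.
    """
    # Variant 1: select everything in SQL, then aggregate with the DataFrame API.
    # The "sum" aggregate is assumed for illustration only.
    api_df = (
        sql_context.sql("SELECT * FROM {t}".format(t=table))
        .groupBy(key)
        .agg({value: "sum"})
    )

    # Variant 2: express the whole aggregation in SQL.
    sql_df = sql_context.sql(
        "SELECT {k}, SUM({v}) FROM {t} GROUP BY {k}".format(k=key, v=value, t=table)
    )

    # explain(True) prints the logical plans as well as the physical plan.
    api_df.explain(True)
    sql_df.explain(True)
    return api_df, sql_df
```

Calling print_both_plans(sqlContext) and pasting what it prints should make it possible to compare the two plans side by side.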
On Thursday, March 10, 2016, FangFang Chen <[email protected]>
wrote:
> hi,
> Based on my testing, the memory cost is very different for
> 1. sql("select * from ...").groupby.agg
> 2. sql("select ... from ... group by ...").
>
> For a table partition larger than 500 GB, #2 runs fine, while #1 fails
> with an OutOfMemory error. I am using the same Spark configuration for both.
> Could somebody tell me why this happens?