Hi,
Based on my testing, memory usage differs greatly between these two approaches:
1. sql("select * from ...").groupBy(...).agg(...)
2. sql("select ... from ... group by ...")


For a table partition larger than 500 GB, #2 runs fine, while #1 fails with
an OutOfMemory error. I am using the same Spark configuration in both cases.
Could somebody explain why this happens?

