Hi,

I have written a Spark SQL job on Spark 2.0 using Scala. It just pulls the data 
from a Hive table, adds a few extra columns, removes duplicates, and then 
writes it back to Hive.
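
To give a rough idea, the job is shaped like this (the table and column names 
below are just placeholders, not the real ones):

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

val spark = SparkSession.builder()
  .appName("hive-dedup-job")   // placeholder app name
  .enableHiveSupport()
  .getOrCreate()

// pull the source Hive table
val src = spark.table("db.source_table")

// add the extra columns and drop duplicate rows
val cleaned = src
  .withColumn("load_dt", current_date())
  .withColumn("src_system", lit("example"))
  .dropDuplicates()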

In the Spark UI, the write is taking almost 40 minutes for about 400 GB of 
data. Is there anything I can do to improve performance?

spark.sql.shuffle.partitions is 2000 in my case, with 16 GB of executor memory 
and dynamic allocation enabled.
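
The shuffle partition count is set at runtime on the session from the sketch 
above; executor memory and dynamic allocation are passed on the spark-submit 
command line:

// 2000 shuffle partitions for the dedup and the final write
spark.conf.set("spark.sql.shuffle.partitions", "2000")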

I am doing an insert overwrite into a partitioned table:
df.write.mode("overwrite").insertInto(table)

Any suggestions, please?

Sent from my iPhone