Hi,

I am using persist before inserting DataFrame data back into Hive. This step
is adding 8 minutes to my total execution time. Is there a way to reduce
the total time without running into out-of-memory issues? Here is my code.

import org.apache.spark.sql.{Dataset, Row}
import org.apache.spark.storage.StorageLevel

val datapoint_df: Dataset[Row] = sparkSession.sql(transposeHiveQry)
datapoint_df.persist(StorageLevel.MEMORY_AND_DISK)  // cache the result before writing it back
datapoint_df.createOrReplaceTempView("ds_tmp")
sparkSession.sql("insert overwrite table HiveTable select * from ds_tmp")

Thanks,
Asmath
