Hi, I am using persist before inserting DataFrame data back into Hive. This step adds 8 minutes to my total execution time. Is there a way to reduce the total time without running into out-of-memory issues? Here is my code.
import org.apache.spark.sql.{Dataset, Row}
import org.apache.spark.storage.StorageLevel

// Run the transpose query, cache the result, and write it back to the Hive table
val datapoint_df: Dataset[Row] = sparkSession.sql(transposeHiveQry)
datapoint_df.persist(StorageLevel.MEMORY_AND_DISK)
datapoint_df.createOrReplaceTempView("ds_tmp")
sparkSession.sql("insert overwrite table HiveTable select * from ds_tmp")

Thanks,
Asmath