Hello, I was looking at the Spark Streaming UI and noticed a big difference between "Processing time" and "Job duration".
[image: Inline image 1]

The Processing time/Output Op duration is shown as 50s, but the sum of all job durations is ~25s. What is causing this difference? Based on the logs, I know the batch actually took 50s.

[image: Inline image 2]

The job that is taking most of the time is:

    joinRDD.toDS()
      .write.format("com.databricks.spark.csv")
      .mode(SaveMode.Append)
      .options(Map("mode" -> "DROPMALFORMED", "delimiter" -> "\t", "header" -> "false"))
      .partitionBy("entityId", "regionId", "eventDate")
      .save(outputPath)

Removing SaveMode.Append really speeds things up, and the mismatch between job duration and processing time disappears as well. I'm not able to explain what is causing this, though.

Srikanth