Is sparkSession.sql now an action in Spark 3 and later?

2023-02-08 Thread Sayeh Roshan
Hi, I remember that previously spark.sql() wasn’t a final action and you needed to run something like show() for the query to actually be performed. Today I noticed that when I run just spark.sql() without show() or anything else, lots of executors are fired up, and reading their logs sh

Reading the last line of each file in a set of text files

2021-08-02 Thread Sayeh Roshan
Hi users, Does anyone here have experience writing Spark code that reads just the last line of each text file in a directory, S3 bucket, etc.? I am looking for a solution that doesn’t require reading the whole file. I basically wonder whether you can create a DataFrame/RDD using file seek. Not s
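
The file-seek idea in the question can be sketched in plain Python as below. This is only an illustration under stated assumptions, not a solution from the thread: it assumes UTF-8 text files on a filesystem that supports seeking, and the names read_last_line and chunk_size are hypothetical. It reads fixed-size chunks backwards from the end of the file until it finds the newline that precedes the last line, so only the tail of the file is read.

```python
import os


def read_last_line(path, chunk_size=4096):
    """Return the last line of a text file, reading only from its tail.

    Hypothetical helper: assumes a seekable file and UTF-8 content.
    """
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        pos = f.tell()  # start at the end of the file
        buf = b""
        while pos > 0:
            # Step backwards by one chunk (or less, near the file start).
            step = min(chunk_size, pos)
            pos -= step
            f.seek(pos)
            buf = f.read(step) + buf
            # Ignore trailing line terminators, then look for the
            # newline that precedes the last line.
            stripped = buf.rstrip(b"\r\n")
            idx = stripped.rfind(b"\n")
            if idx != -1:
                return stripped[idx + 1:].decode("utf-8")
        # Reached the start of the file: it holds at most one line.
        return buf.rstrip(b"\r\n").decode("utf-8")
```

If the file paths are readable from every executor (an assumption, for example a shared filesystem), one way to run this in parallel would be sc.parallelize(paths).map(read_last_line). For an S3 bucket the local seek would have to be replaced by some form of ranged read, which this sketch does not cover.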