Sorry if this is the wrong place for this. I am trying to debug an issue with this library: https://github.com/springml/spark-sftp
When I attempt to create a DataFrame:

    spark.read.
      format("com.springml.spark.sftp").
      option("host", "...").
      option("username", "...").
      option("password", "...").
      option("fileType", "csv").
      option("inferSchema", "true").
      option("tempLocation", "/srv/spark/tmp").
      option("hdfsTempLocation", "/srv/spark/tmp").
      load("...")

What I am seeing is that the download occurs on the Spark driver, not on a Spark worker. This leads to a failure when Spark tries to create the DataFrame on the worker.

I'm confused by this behavior. My understanding was that load() is executed lazily on the Spark workers. Why would some parts execute on the driver?

Thanks for your help.

--
Mark Bidewell
http://www.linkedin.com/in/markbidewell