Hi, I have the following questions:
1. When I write a Spark script, how do I know which parts run on the driver and which run on the workers? For example, if I write code to read a plain text file, will it run only on the driver, only on the workers, or on both?

2. If I want each worker to load a file for, say, a join, and the file is huge (several GB) so that I don't want to broadcast it, what is the best way to do that? Put another way: how do I load a data structure for fast lookup (not an RDD) on each worker node, inside the executor?

Regards,
Saurabh