Statements are executed only when you try to cause some effect on the
server (produce data, collect data on the driver). At execution time Spark
does all the dependency resolution, truncates paths that don't go anywhere,
and optimizes the execution pipelines. So you really don't have to worry
about t
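
To make the lazy-execution point concrete, here is a minimal sketch in plain
Python. It is a toy model, not Spark's API: LazyDataset and traced_double are
hypothetical names made up for illustration. The idea it tries to show is that
a transformation only records work and returns at once, while an action runs
the recorded work and blocks until the result is ready.

```python
# Toy model of lazy evaluation. LazyDataset is hypothetical, not a Spark class.
class LazyDataset:
    def __init__(self, source, steps=()):
        self._source = source
        self._steps = steps  # recorded transformations, not yet executed

    def map(self, fn):
        # "Transformation": records the step and returns immediately.
        return LazyDataset(self._source, self._steps + (fn,))

    def take(self, n):
        # "Action": runs the recorded steps now, only for the first n items.
        out = []
        for item in self._source:
            if len(out) == n:
                break
            for fn in self._steps:
                item = fn(item)
            out.append(item)
        return out


calls = []

def traced_double(v):
    calls.append(v)  # side effect that reveals when the work really happens
    return v * 2

ds = LazyDataset(range(1000)).map(traced_double)  # returns at once
calls_before_action = len(calls)                  # nothing has run yet
unused = ds.map(lambda v: v + 1)                  # never acted on, never runs
first_three = ds.take(3)                          # blocks until results exist
```

In real Spark the same split applies, as far as I know: calls such as map or
reduceByKey return right away, and it is the action (take, collect, count)
that makes the driver wait for the job to finish.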
Hello friends:
I have a theory question about call blocking in a Spark driver.
Consider this (admittedly contrived =:)) snippet to illustrate this question...
# reduceByKey needs a merge function; the sum here is just a placeholder.
x = rdd01.reduceByKey(lambda a, b: a + b) # or maybe some other shuffle-requiring transformation.
b = sc.broadcast(x.take(20)) # Or any statement that r