Use Shared Variable in PySpark Executors

Soheil Pourbafrani Sat, 22 Sep 2018 08:34:36 -0700

Hi, I want to do some processing with PySpark and save the results in a
variable of type tuple that should be shared among the executors for
further processing.
Specifically, it's a text mining task and I want to use the Vector Space
Model. So I want to compute the vector of all words (which should be
accessible to all executors) and save it in a tuple. Is this possible in
Spark, or should I use external storage such as a database or files?
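
For illustration, Spark's broadcast variables are one mechanism for sharing a
read-only value such as a tuple with all executors. The following is only a
minimal sketch, assuming the vocabulary is small enough to collect on the
driver and that each document is a whitespace-separated string. The names
build_vocabulary, vectorize, and run_with_spark are hypothetical helpers
invented for this example, not part of PySpark.

```python
from collections import Counter


def build_vocabulary(docs):
    """Return a tuple of all distinct words, sorted for a stable index order."""
    return tuple(sorted({word for doc in docs for word in doc.split()}))


def vectorize(doc, vocabulary):
    """Term-frequency vector of `doc`, ordered like `vocabulary`."""
    counts = Counter(doc.split())
    return [counts[word] for word in vocabulary]


def run_with_spark(docs):
    """Compute the vocabulary with Spark, broadcast it, and vectorize each doc."""
    # Imported here so the plain-Python helpers above work without Spark.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("vsm-sketch").getOrCreate()
    sc = spark.sparkContext
    rdd = sc.parallelize(docs)

    # Step 1: gather the distinct words on the driver
    # (assumption: the vocabulary fits in driver memory).
    words = rdd.flatMap(lambda d: d.split()).distinct().collect()
    vocabulary = tuple(sorted(words))

    # Step 2: ship one read-only copy of the tuple to each executor.
    vocab_bc = sc.broadcast(vocabulary)

    # Step 3: tasks read the shared tuple through .value.
    vectors = rdd.map(lambda d: vectorize(d, vocab_bc.value)).collect()

    spark.stop()
    return vocabulary, vectors
```

Calling run_with_spark(["a b a", "b c"]) on a machine with PySpark installed
would run the three steps end to end. A broadcast value is read-only on the
executors, so this pattern fits a vocabulary that is computed once and then
only looked up. If the vocabulary were too large for the driver, external
storage or a join against a distributed dataset might be a better fit.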
