Hi, I want to use PySpark with YARN, but the documentation doesn't give me a full
understanding of what's going on, and I simply don't understand the code. So:

1) How is Python shipped to the cluster? Should the machines in the cluster
already have Python installed?
2) What happens when I write some Python code in a "map" function - is it
shipped to the cluster and just executed there? How does Spark figure out all
the dependencies my code needs and ship them there? If I use Math in my code
inside "map", does that mean I would be shipping the Math class, or would
some Python math already on the cluster be used? (A rough sketch of what I
mean is below.)
3) I have compiled C++ code. Can I ship this executable with "addPyFile" and
just run it via "exec" from Python? Would that work? (See the second sketch
below for what I'm imagining.)



-- 
Sincerely yours,
Egor Pakhomov
Scala Developer, Yandex
