I have a Spark job running on a 10-node cluster, and the Python process on every node is pegged at 100% CPU.
I was wondering which parts of a PySpark script run in the Python worker processes and which get handed off to the JVM processes. Is there any documentation on this?

Thanks,
Justin