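Hi Kostas,

The PYSPARK_PYTHON environment variable used by the executors is not taken from the worker's environment; it is propagated from the driver. That is why setting it in os.environ in your driver program worked.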
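
As a rough sketch of what that looks like in the driver (your RPC server), assuming the variable is set before the SparkContext is created: the path, the helper names and the app name below are placeholders I made up for illustration, not anything Spark defines.

```python
import os

# Hypothetical path; point this at the interpreter inside your virtualenv.
VENV_PYTHON = "/path/to/virtualenv/bin/python"


def set_executor_python(python_path):
    """Set PYSPARK_PYTHON in the driver's environment and return the value set."""
    os.environ["PYSPARK_PYTHON"] = python_path
    return os.environ["PYSPARK_PYTHON"]


def make_context(master_url, app_name="rpc-driver"):
    """Create a SparkContext after PYSPARK_PYTHON has been set in the driver."""
    # Imported lazily so this file can be loaded without pyspark installed.
    from pyspark import SparkConf, SparkContext

    # Assumption: the driver reads PYSPARK_PYTHON when the context is created,
    # so it must be set before that point.
    set_executor_python(VENV_PYTHON)
    conf = SparkConf().setMaster(master_url).setAppName(app_name)
    return SparkContext(conf=conf)
```

You would call make_context() with your master URL once at startup of the RPC server. Exporting PYSPARK_PYTHON in the shell that launches the driver should have the same effect.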

On Tue, Mar 8, 2016 at 1:04 AM, Kostas Chalikias <[email protected]> wrote:

> All - would appreciate some insight regarding how to set PYSPARK_PYTHON
> correctly.
>
> I have created a virtual environment in the same place for all 3 of my
> cluster hosts, 2 of them running slaves and one running a master. I also
> run an RPC server on the master host to allow users from the office (the
> cluster is hosted elsewhere) to send the work.
>
> For the master and slaves, I created $SPARK_HOME/conf/spark-env.sh and set
> PYSPARK_PYTHON to the executable of my virtualenv. I made spark-env.sh
> executable by all as suggested by the docs even though it appears that it
> is sourced to be safe.
>
> I then started the cluster using start-master.sh and start-slave.sh
> accordingly and inspected the environment variables of each process under
> /proc/pid to confirm PYSPARK_PYTHON was set correctly, which it was. I then
> sent the first bunch of work only to get exceptions logged in the driver
> program (the RPC server) from the slaves being unable to import my modules
> upon unpickling data.
>
> After several hours of reading docs and pulling out hair, I tried setting
> PYSPARK_PYTHON into the environment in the code of the RPC server / driver
> program as follows, based on this mailing list query:
>
> https://mail-archives.apache.org/mod_mbox/spark-user/201403.mbox/%3CCAG-p0g2L=z9H1H4ZY1XdLOGnGyPEKqi8+=tpieqvdwtvwwa...@mail.gmail.com%3E
>
> os.environ['PYSPARK_PYTHON'] = '/path/to/virtualenv/bin/python'
>
> and to my surprise that worked. I don't understand why that makes sense as
> I can't find any mention of the environment of the driver program
> overriding the environment in the workers, also that environment variable
> was previously completely unset in the driver program anyway.
>
> Is there an explanation for this to help me understand how to do things
> properly? We run Spark 1.6.0 on Ubuntu 14.04.
>
> Thanks
>
> Kostas

--
Best Regards

Jeff Zhang
