Re: Running pyspark job from virtual environment

2021-01-17 Thread Mich Talebzadeh
Well, when you or an application log in to a Linux host (whether a physical tin box or a virtual node), a script called .bashrc in the home directory is executed. If it is a scheduled job, it will execute the same script as well. In my Google Dataproc cluster of three (one master and two workers), in

Re: Running pyspark job from virtual environment

2021-01-17 Thread rajat kumar
Hi Mich, Thanks for the response. I am running it through the CLI (on the cluster). Since this will be a scheduled job, I do not want to activate the environment manually. It should automatically pick up the path of the virtual environment to run the job. For that I saw 3 properties, which I mentioned. I think se

Re: Running pyspark job from virtual environment

2021-01-17 Thread Mich Talebzadeh
Hi Rajat, Are you running this through an IDE like PyCharm or on the CLI? If you already have a Python virtual environment, then just activate it. The only env variable you need to set is PYTHONPATH, which you can export in your startup shell script (.bashrc etc.). Once you are in virtual environme

Re: Running pyspark job from virtual environment

2021-01-17 Thread rajat kumar
Hello, Can anyone confirm here please? Regards Rajat On Sat, Jan 16, 2021 at 11:46 PM rajat kumar wrote: > Hey Users, > > I want to run spark job from virtual environment using Python. > > Please note I am creating virtual env (using python3 -m venv env) > > I see that there are 3 variables fo

Running pyspark job from virtual environment

2021-01-16 Thread rajat kumar
Hey Users, I want to run a Spark job from a virtual environment using Python. Please note I am creating the virtual env (using python3 -m venv env). I see that there are 3 variables for Python which we have to set: PYTHONPATH, PYSPARK_DRIVER_PYTHON, PYSPARK_PYTHON. I have 2 doubts: 1. If i want to use Virt
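
For a scheduled job that should not depend on manual activation, one possible approach is to point PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON at the interpreter inside the virtual environment before launching. The following is a minimal sketch, not a confirmed answer from this thread: the helper name `venv_spark_env`, the venv directory `env`, and the script name `job.py` are hypothetical, and it assumes a POSIX venv layout (`bin/python`) that exists at the same path on every node that runs Python workers.

```python
import os
from pathlib import Path


def venv_spark_env(venv_dir, base_env=None):
    """Return a copy of the environment with PySpark pointed at the venv's Python.

    venv_dir is the directory created by `python3 -m venv <venv_dir>`.
    base_env defaults to the current process environment.
    """
    env = dict(os.environ if base_env is None else base_env)
    # Assumption: POSIX layout, where the venv interpreter lives in bin/python.
    python = str(Path(venv_dir) / "bin" / "python")
    env["PYSPARK_PYTHON"] = python          # interpreter used by the executors
    env["PYSPARK_DRIVER_PYTHON"] = python   # interpreter used by the driver
    return env


if __name__ == "__main__":
    import subprocess

    # Hypothetical launch: "env" and "job.py" are placeholder names.
    subprocess.run(["spark-submit", "job.py"], env=venv_spark_env("env"), check=True)
```

Because the variables are set on the launched process only, a scheduler can call this without sourcing the venv's `activate` script or relying on a shell startup file.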