I have figured out the problem.
- With the refactoring above, PYTHONPATH is set only in interpreter.sh and
no longer set in PySparkInterpreter
- I use sudo for impersonation
- By default, sudo does not preserve the environment, so I use "sudo -E"
- But it *still* explicitly drops PYTHONPATH, as described in
http://kmiku7.github.io/2019/09/20/Keep-environment-variables-with-sudo-command/

So one should either reconfigure sudo or run "sudo
PYTHONPATH=${PYTHONPATH} ....".
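
For illustration, here is a minimal Python sketch of the second option: it
builds a sudo command line that passes PYTHONPATH explicitly as a VAR=value
argument instead of relying on "sudo -E". The function name
build_sudo_command, the user name "zeppelin_user" and the example command are
hypothetical, not taken from Zeppelin. Whether sudo accepts the assignment
still depends on the local sudoers policy.

```python
import os
import shlex


def build_sudo_command(user, command, env=None, preserve=("PYTHONPATH",)):
    """Return a sudo argv that passes selected variables as VAR=value.

    Hypothetical helper for illustration only. Variables listed in
    `preserve` that are missing from `env` are skipped.
    """
    env = os.environ if env is None else env
    # sudo accepts VAR=value assignments placed before the command to run.
    assignments = [f"{name}={env[name]}" for name in preserve if name in env]
    return ["sudo", "-E", "-u", user, *assignments, *command]


if __name__ == "__main__":
    # Example values only; the path and user are assumptions.
    argv = build_sudo_command(
        "zeppelin_user",
        ["bin/interpreter.sh"],
        env={"PYTHONPATH": "/usr/lib/spark/python/lib/pyspark.zip"},
    )
    print(shlex.join(argv))
```

Building the argument list explicitly, rather than a shell string, avoids
quoting problems when PYTHONPATH contains unusual characters.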


On Thu, Mar 11, 2021 at 7:27 PM Vladimir Prus <vladimir.p...@gmail.com>
wrote:

> Hi,
>
> we've upgraded from 0.8 to 0.9 and I observe that with the same
> interpreter settings,
> PySpark no longer works with:
>
>     java.io.IOException: Fail to launch python process.
>
>     Traceback (most recent call last):
>
>       File "/tmp/1615477929423-0/zeppelin_python.py", line 20, in <module>
>
>         from py4j.java_gateway import java_import, JavaGateway,
> GatewayClient
>
>     ModuleNotFoundError: No module named 'py4j'
>
> Comparing logs, I see that for 0.8:
>
>  INFO [2021-03-11 15:40:13,565] ({pool-3-thread-5}
> PySparkInterpreter.java[createGatewayServerAndStartScript]:265) - pythonExec:
> /mnt/conda/envs/zeppelin-pyspark-python3/bin/python
>
>  INFO [2021-03-11 15:40:13,585] ({pool-3-thread-5}
> PySparkInterpreter.java[setupPySparkEnv]:236) - PYTHONPATH:
> /usr/lib/spark/python/lib/pyspark.zip:/usr/lib/spark/python/lib/py4j-0.10.7-src.zip:/mnt/zeppelin-0.8.3-SNAPSHOT/../interpreter/lib/python
>
> Whereas 0.9 logs say:
>
>  INFO [2021-03-11 15:52:09,428]
> ({FIFOScheduler-interpreter_293940413-Worker-1}
> PythonInterpreter.java[setupPythonEnv]:212) - PYTHONPATH:
> /tmp/1615477929423-0
>
>  INFO [2021-03-11 15:52:09,428]
> ({FIFOScheduler-interpreter_293940413-Worker-1}
> PythonInterpreter.java[createGatewayServerAndStartScript]:147) - Launching
> Python Process Command: /mnt/conda/envs/zeppelin-pyspark-python3/bin/python
> /tmp/1615477929423-0/zeppelin_python.py 10.4.2.199 37753
>
>
> In other words, it looks like 0.9 does not add pyspark zips to PYTHONPATH.
> Looking at the history, I see a major refactoring in this area:
>
>
> https://github.com/apache/zeppelin/commit/0a97446a70f6294a3efb071bb9a70601f885840b
>
> But I can't quite understand whether this change in behaviour is intentional,
> and what additional options I might need to set. Does anybody have any
> suggestions?
>
> --
> Vladimir Prus
> http://vladimirprus.com
>


-- 
Vladimir Prus
http://vladimirprus.com
