Hi Riccardo,

These are the environment variables at runtime:
PYTHONUNBUFFERED=1;PYTHONPATH=C:\Users\admin\PycharmProjects\packages\;C:\Users\admin\PycharmProjects\pythonProject2\DS\;C:\Users\admin\PycharmProjects\pythonProject2\DS\conf\;C:\Users\admin\PycharmProjects\pythonProject2\DS\lib\;C:\Users\admin\PycharmProjects\pythonProject2\DS\src

This is the run configuration set up for analyze_house_prices_GCP:

[image: image.png]

So, as on Linux, I created a Windows environment variable, and I can see it in the PyCharm terminal:

(venv) C:\Users\admin\PycharmProjects\pythonProject2\DS\src>echo %PYTHONPATH%
PYTHONPATH=C:\Users\admin\PycharmProjects\packages\;C:\Users\admin\PycharmProjects\pythonProject2\DS\;C:\Users\admin\PycharmProjects\pythonProject2\DS\conf\;C:\Users\admin\PycharmProjects\pythonProject2\DS\lib\;C:\Users\admin\PycharmProjects\pythonProject2\DS\src

The terminal picks up sparkstuff.py:

(venv) C:\Users\admin\PycharmProjects\pythonProject2\DS\src>where sparkstuff.py
C:\Users\admin\PycharmProjects\packages\sparkutils\sparkstuff.py

But under spark-submit, the import within the code does not find it:

(venv) C:\Users\admin\PycharmProjects\pythonProject2\DS\src>spark-submit --jars ..\spark-bigquery-with-dependencies_2.12-0.18.0.jar analyze_house_prices_GCP.py
Traceback (most recent call last):
  File "C:/Users/admin/PycharmProjects/pythonProject2/DS/src/analyze_house_prices_GCP.py", line 8, in <module>
    import sparkstuff as s
ModuleNotFoundError: No module named 'sparkutils'

Thanks

Disclaimer: Use it at your own risk. Any and all responsibility for any loss, damage or destruction of data or any other property which may arise from relying on this email's technical content is explicitly disclaimed. The author will in no case be liable for any monetary damages arising from such loss, damage or destruction.

On Fri, 8 Jan 2021 at 16:38, Riccardo Ferrari <ferra...@gmail.com> wrote:

> I think spark checks the python path env variable. Need to provide that.
> Of course that works in local mode only
>
> On Fri, Jan 8, 2021, 5:28 PM Sean Owen <sro...@gmail.com> wrote:
>
>> I don't see anywhere that you provide 'sparkstuff'? how would the Spark
>> app have this code otherwise?
>>
>> On Fri, Jan 8, 2021 at 10:20 AM Mich Talebzadeh <
>> mich.talebza...@gmail.com> wrote:
>>
>>> Thanks Riccardo.
>>>
>>> I am well aware of the submission form
>>>
>>> However, my question relates to doing submission within PyCharm itself.
>>>
>>> This is what I do at Pycharm *terminal* to invoke the module python
>>>
>>> spark-submit --jars
>>> ..\lib\spark-bigquery-with-dependencies_2.12-0.18.0.jar \
>>> --packages com.github.samelamin:spark-bigquery_2.11:0.2.6
>>> analyze_house_prices_GCP.py
>>>
>>> However, at terminal run it does not pickup import dependencies in the
>>> code!
>>>
>>> Traceback (most recent call last):
>>>   File
>>> "C:/Users/admin/PycharmProjects/pythonProject2/DS/src/analyze_house_prices_GCP.py",
>>> line 8, in <module>
>>>     import sparkstuff as s
>>> ModuleNotFoundError: No module named 'sparkstuff'
>>>
>>> The python code is attached, pretty simple
>>>
>>> Thanks
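
For anyone wanting to check what the Python process started by spark-submit actually sees, here is a minimal sketch. It is not part of the original script, and the helper names locate_module and pythonpath_entries are made up for illustration. It splits PYTHONPATH into its entries and asks importlib where a given top-level module would be loaded from, without importing it. Placed at the top of the driver script, before the failing import, something like this should show whether the packages directory reached the search path in that process.

```python
import importlib.util
import os
import sys


def pythonpath_entries(env=None):
    """Split PYTHONPATH into entries, dropping empty ones and stray whitespace."""
    env = os.environ if env is None else env
    raw = env.get("PYTHONPATH", "")
    return [part.strip() for part in raw.split(os.pathsep) if part.strip()]


def locate_module(name):
    """Return the file a top-level module would be loaded from, or None if not found."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        # find_spec can raise for broken parents or modules without a spec
        return None
    return spec.origin if spec is not None else None


if __name__ == "__main__":
    print("PYTHONPATH entries seen by this process:")
    for entry in pythonpath_entries():
        print("  ", entry)
    print("sys.path:")
    for entry in sys.path:
        print("  ", entry)
    # Module names taken from the thread; adjust to whatever the script imports.
    for module_name in ("sparkstuff", "sparkutils"):
        print(module_name, "->", locate_module(module_name))
```

Comparing the output under spark-submit with the output of a plain python run from the same terminal may help narrow down where the two environments differ.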