Hi, also note that building entire environments into the container images may increase their size massively.
Regards,
Gourav Sengupta

On Sat, Dec 4, 2021 at 7:52 AM Bode, Meikel, NMA-CFD
<meikel.b...@bertelsmann.de> wrote:

> Hi Mich,
>
> Sure, that's possible. But distributing the complete env would be more
> practical.
>
> A workaround at the moment is that we build different environments and
> store them in a PV; we then mount that into the pods and point the
> SparkApplication resource at the desired env.
>
> But these options do exist, and I want to understand what the issue is.
> Any hints on that?
>
> Best,
> Meikel
>
> *From:* Mich Talebzadeh <mich.talebza...@gmail.com>
> *Sent:* Friday, 3 December 2021 13:27
> *To:* Bode, Meikel, NMA-CFD <meikel.b...@bertelsmann.de>
> *Cc:* dev <d...@spark.apache.org>; user@spark.apache.org
> *Subject:* Re: Conda Python Env in K8S
>
>> Build the Python packages into the Docker image itself first with pip
>> install:
>>
>>     RUN pip install pandas . . --no-cache-dir
>>
>> HTH
>>
>> On Fri, 3 Dec 2021 at 11:58, Bode, Meikel, NMA-CFD
>> <meikel.b...@bertelsmann.de> wrote:
>>
>>> Hello,
>>>
>>> I am trying to run Spark jobs using the Spark Kubernetes Operator.
>>> But when I bundle a conda Python environment using the following
>>> resource description, the Python interpreter is only unpacked on the
>>> driver, not on the executors.
>>>
>>>     apiVersion: "sparkoperator.k8s.io/v1beta2"
>>>     kind: SparkApplication
>>>     metadata:
>>>       name: …
>>>     spec:
>>>       type: Python
>>>       pythonVersion: "3"
>>>       mode: cluster
>>>       mainApplicationFile: local:///path/script.py
>>>       ..
>>>       sparkConf:
>>>         "spark.archives": "local:///path/conda-env.tar.gz#environment"
>>>         "spark.pyspark.python": "./environment/bin/python"
>>>         "spark.pyspark.driver.python": "./environment/bin/python"
>>>
>>> The driver unpacks the archive and the Python script gets executed.
>>> On the executors there is no log message indicating that the archive
>>> gets unpacked. The executors then fail because they can't find the
>>> Python executable at the given location "./environment/bin/python".
>>>
>>> Any hint?
>>>
>>> Best,
>>> Meikel
>>
>> --
>>
>> View my LinkedIn profile:
>> https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/
>>
>> *Disclaimer:* Use it at your own risk. Any and all responsibility for
>> any loss, damage or destruction of data or any other property which may
>> arise from relying on this email's technical content is explicitly
>> disclaimed. The author will in no case be liable for any monetary
>> damages arising from such loss, damage or destruction.
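[Editor's note] An aside that may help when debugging this: per the Spark "Python Package Management" documentation, `spark.archives` expects a conda-pack style tar.gz with `bin/python` at the archive root, and Spark unpacks it into a directory in the work dir named by the `#` fragment, which is why the config points at `./environment/bin/python`. A minimal stdlib sketch of that layout follows; it uses stub files only, with no conda or Spark involved, so all paths are illustrative.

```python
# Illustrates the layout spark.archives expects: a tar.gz whose *root*
# contains bin/python (what conda-pack produces), unpacked under the
# name given after '#'. Stub files stand in for a real packed env.
import os
import tarfile
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for conda-pack output: an env dir with bin/python at its root.
    env_dir = os.path.join(tmp, "conda-env")
    os.makedirs(os.path.join(env_dir, "bin"))
    with open(os.path.join(env_dir, "bin", "python"), "w") as f:
        f.write("#!/bin/sh\n")  # stub interpreter

    # Pack the *contents* of the env dir, not the dir itself, so that
    # bin/ sits at the archive root (as `conda pack -o ...` does).
    archive = os.path.join(tmp, "conda-env.tar.gz")
    with tarfile.open(archive, "w:gz") as tar:
        for entry in os.listdir(env_dir):
            tar.add(os.path.join(env_dir, entry), arcname=entry)

    # Simulate what Spark does with "conda-env.tar.gz#environment":
    # unpack into a work-dir subdirectory named after the fragment.
    workdir = os.path.join(tmp, "workdir")
    target = os.path.join(workdir, "environment")
    with tarfile.open(archive) as tar:
        tar.extractall(target)

    # This is the relative path the sparkConf entries point at.
    interpreter = os.path.join(target, "bin", "python")
    ok = os.path.isfile(interpreter)

print(ok)  # -> True
```

If the archive instead contains a top-level `conda-env/` directory, the interpreter lands at `./environment/conda-env/bin/python` and the configured path misses it, which produces exactly a "python executable not found" failure on whichever side unpacked it.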