Many thanks Hu, worked like a charm.
A few quick questions: so in my reqs.txt I should put all the Beam requirements PLUS my own?
And in the setup.py, shall I just declare

"apache-beam[gcp]==2.54.0",  # Must match the version in `Dockerfile`.

Thanks and kind regards,
Marco

On Wed, Jun 12, 2024 at 1:48 PM XQ Hu <[email protected]> wrote:

> Any reason to use this?
>
> RUN pip install avro-python3 pyarrow==0.15.1 apache-beam[gcp]==2.30.0
> pandas-datareader==0.9.0
>
> It is typically recommended to use the latest Beam and build the docker
> image using the requirements released for each Beam, for example,
> https://github.com/apache/beam/blob/release-2.56.0/sdks/python/container/py311/base_image_requirements.txt
>
> On Wed, Jun 12, 2024 at 1:31 AM Sofia’s World <[email protected]> wrote:
>
>> Sure, apologies, it crossed my mind it would have been useful to refer
>> to it
>>
>> so this is the docker file
>>
>> https://github.com/mmistroni/GCP_Experiments/edit/master/dataflow/shareloader/Dockerfile_tester
>>
>> I was using a setup.py as well, but then i commented out the usage in the
>> dockerfile after checking some flex templates which said it is not needed
>>
>> https://github.com/mmistroni/GCP_Experiments/blob/master/dataflow/shareloader/setup_dftester.py
>>
>> thanks in advance
>> Marco
>>
>> On Tue, Jun 11, 2024 at 10:54 PM XQ Hu <[email protected]> wrote:
>>
>>> Can you share your Dockerfile?
>>>
>>> On Tue, Jun 11, 2024 at 4:43 PM Sofia’s World <[email protected]>
>>> wrote:
>>>
>>>> thanks all, it seemed to work but now i am getting a different
>>>> problem, having issues in building pyarrow...
>>>>
>>>> Step #0 - "build-shareloader-template": Step #4 - "dftester-image":
>>>> <string>:36: DeprecationWarning: pkg_resources is deprecated as an API.
>>>> See https://setuptools.pypa.io/en/latest/pkg_resources.html
>>>> Step #0 - "build-shareloader-template": Step #4 - "dftester-image":
>>>> WARNING setuptools_scm.pyproject_reading toml section missing
>>>> 'pyproject.toml does not contain a tool.setuptools_scm section'
>>>> Step #0 - "build-shareloader-template": Step #4 - "dftester-image":
>>>> Traceback (most recent call last):
>>>> Step #0 - "build-shareloader-template": Step #4 - "dftester-image":
>>>> File
>>>> "/tmp/pip-build-env-meihcxsp/overlay/lib/python3.11/site-packages/setuptools_scm/_integration/pyproject_reading.py",
>>>> line 36, in read_pyproject
>>>> Step #0 - "build-shareloader-template": Step #4 - "dftester-image":
>>>> section = defn.get("tool", {})[tool_name]
>>>> Step #0 - "build-shareloader-template": Step #4 - "dftester-image":
>>>> ~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^
>>>> Step #0 - "build-shareloader-template": Step #4 - "dftester-image":
>>>> KeyError: 'setuptools_scm'
>>>> Step #0 - "build-shareloader-template": Step #4 - "dftester-image":
>>>> running bdist_wheel
>>>>
>>>> It is somehow getting messed up with a toml ?
>>>>
>>>> Could anyone advise?
>>>>
>>>> thanks
>>>>
>>>> Marco
>>>>
>>>> On Tue, Jun 11, 2024 at 1:00 AM XQ Hu via user <[email protected]>
>>>> wrote:
>>>>
>>>>>
>>>>> https://github.com/GoogleCloudPlatform/python-docs-samples/tree/main/dataflow/flex-templates/pipeline_with_dependencies
>>>>> is a great example.
>>>>>
>>>>> On Mon, Jun 10, 2024 at 4:28 PM Valentyn Tymofieiev via user <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> In this case the Python version will be defined by the Python version
>>>>>> installed in the docker image of your flex template. So, you'd have to
>>>>>> build your flex template from a base image with Python 3.11.
>>>>>>
>>>>>> On Mon, Jun 10, 2024 at 12:50 PM Sofia’s World <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Hello
>>>>>>> no i am running my pipeline on GCP directly via a flex template,
>>>>>>> configured using a Docker file
>>>>>>> Any chances to do something in the Dockerfile to force the version
>>>>>>> at runtime?
>>>>>>> Thanks
>>>>>>>
>>>>>>> On Mon, Jun 10, 2024 at 7:24 PM Anand Inguva via user <
>>>>>>> [email protected]> wrote:
>>>>>>>
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> Are you running your pipeline from the python 3.11 environment? If
>>>>>>>> you are running from a python 3.11 environment and don't use a custom
>>>>>>>> docker container image, DataflowRunner (assuming Apache Beam on GCP
>>>>>>>> means
>>>>>>>> Apache Beam on DataflowRunner) will use Python 3.11.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Anand
>>>>>>>>
>>>>>>>
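
On the "Must match the version in Dockerfile" point raised at the top of the thread: a minimal sketch, assuming the Beam version is pinned with `==` in both setup.py and the Dockerfile, of how the two pins could be compared before building the template. The names `BEAM_PIN`, `beam_pins` and `pins_agree` are hypothetical helpers made up for illustration; they are not part of Apache Beam or of the flex template tooling.

```python
import re

# Hypothetical pattern (not a Beam API): matches "apache-beam==X" or
# "apache-beam[extras]==X" and captures the version X.
BEAM_PIN = re.compile(r"apache-beam(?:\[[^\]]*\])?==([0-9]+(?:\.[0-9A-Za-z]+)*)")


def beam_pins(text: str) -> set[str]:
    """Return every apache-beam version pinned with '==' in the given text."""
    return set(BEAM_PIN.findall(text))


def pins_agree(*texts: str) -> bool:
    """True when the texts, taken together, pin exactly one apache-beam version."""
    versions: set[str] = set()
    for text in texts:
        versions |= beam_pins(text)
    return len(versions) == 1


if __name__ == "__main__":
    # Lines taken from this thread: the setup.py pin and the Dockerfile RUN line.
    setup_line = '"apache-beam[gcp]==2.54.0",  # Must match the version in Dockerfile'
    docker_line = "RUN pip install avro-python3 pyarrow==0.15.1 apache-beam[gcp]==2.30.0"
    print(beam_pins(setup_line), beam_pins(docker_line))
    print("pins agree:", pins_agree(setup_line, docker_line))
```

In practice the two strings would be read from the actual files (for example with `pathlib.Path(...).read_text()`). The check only recognises exact `==` pins, so ranges such as `>=` are ignored.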
