Your pipeline launcher refers to a package named 'modules', but this package is not available in the runtime environment.
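
One way to confirm this is to check, inside the image the workers actually use, whether the names referenced by the pickled main session can be imported at all. The snippet below is only a rough sketch under that assumption: `missing_modules` is a hypothetical helper name (not part of Beam or the sample), and the names passed to it are taken from the traceback and the directory layout quoted below.

```python
import importlib.util


def missing_modules(names):
    """Return the subset of `names` that cannot be found in this environment.

    Hypothetical helper for illustration only; it is not a Beam API.
    """
    missing = []
    for name in names:
        try:
            # find_spec returns None when a top-level module is not importable.
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            # Raised for dotted names with a missing parent package, or for
            # modules whose __spec__ is unset.
            spec = None
        if spec is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # 'modules' comes from the ModuleNotFoundError; 'mypackage' from the layout.
    print(missing_modules(["modules", "mypackage"]))
```

If 'modules' shows up in the printed list when this runs inside the container, the package was never installed into (or copied onto the import path of) that image, which matches the error in the traceback.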
On Sat, Jun 15, 2024 at 11:17 AM Sofia’s World <[email protected]> wrote:

> Sorry, I cheered up too early.
> I can successfully build the image; however, at runtime the code always
> fails with this exception and I cannot figure out why.
>
> I mimicked the sample directory structure:
>
> ---- mypackage
>      --- __init__.py
>          dftester.py
>          obb_utils.py
>
> dataflow_tester_main.py
>
> This is the content of my dataflow_tester_main.py:
>
> from mypackage import dftester
> import logging
> if __name__ == '__main__':
>     logging.getLogger().setLevel(logging.INFO)
>     dftester.run()
>
> and this is my Dockerfile:
>
> https://github.com/mmistroni/GCP_Experiments/blob/master/dataflow/shareloader/Dockerfile_tester
>
> and at the bottom of this email is my exception.
> I am puzzled about where the error is coming from, as I have almost copied
> this sample:
> https://github.com/GoogleCloudPlatform/python-docs-samples/blob/main/dataflow/flex-templates/pipeline_with_dependencies/main.py
>
> Thanks and regards,
> Marco
>
> Traceback (most recent call last):
>   File "/usr/local/lib/python3.11/site-packages/apache_beam/runners/worker/sdk_worker_main.py", line 115, in create_harness
>     _load_main_session(semi_persistent_directory)
>   File "/usr/local/lib/python3.11/site-packages/apache_beam/runners/worker/sdk_worker_main.py", line 354, in _load_main_session
>     pickler.load_session(session_file)
>   File "/usr/local/lib/python3.11/site-packages/apache_beam/internal/pickler.py", line 65, in load_session
>     return desired_pickle_lib.load_session(file_path)
>            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>   File "/usr/local/lib/python3.11/site-packages/apache_beam/internal/dill_pickler.py", line 446, in load_session
>     return dill.load_session(file_path)
>            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>   File "/usr/local/lib/python3.11/site-packages/dill/_dill.py", line 368, in load_session
>     module = unpickler.load()
>              ^^^^^^^^^^^^^^^^
>   File "/usr/local/lib/python3.11/site-packages/dill/_dill.py", line 472, in load
>     obj = StockUnpickler.load(self)
>           ^^^^^^^^^^^^^^^^^^^^^^^^^
>   File "/usr/local/lib/python3.11/site-packages/dill/_dill.py", line 462, in find_class
>     return StockUnpickler.find_class(self, module, name)
>            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> ModuleNotFoundError: No module named 'modules'
>
> On Fri, Jun 14, 2024 at 5:52 AM Sofia’s World <[email protected]> wrote:
>
>> Many thanks Hu, worked like a charm.
>>
>> A few quick questions:
>> so in my reqs.txt I should put all Beam requirements PLUS my own?
>>
>> And in the setup.py, shall I just declare
>>
>> "apache-beam[gcp]==2.54.0",  # Must match the version in `Dockerfile`.
>>
>> Thanks and kind regards,
>> Marco
>>
>> On Wed, Jun 12, 2024 at 1:48 PM XQ Hu <[email protected]> wrote:
>>
>>> Any reason to use this?
>>>
>>> RUN pip install avro-python3 pyarrow==0.15.1 apache-beam[gcp]==2.30.0 pandas-datareader==0.9.0
>>>
>>> It is typically recommended to use the latest Beam and build the Docker
>>> image using the requirements released for each Beam version, for example,
>>> https://github.com/apache/beam/blob/release-2.56.0/sdks/python/container/py311/base_image_requirements.txt
>>>
>>> On Wed, Jun 12, 2024 at 1:31 AM Sofia’s World <[email protected]> wrote:
>>>
>>>> Sure, apologies, it crossed my mind that it would have been useful to
>>>> refer to it.
>>>>
>>>> So this is the Dockerfile:
>>>>
>>>> https://github.com/mmistroni/GCP_Experiments/edit/master/dataflow/shareloader/Dockerfile_tester
>>>>
>>>> I was using a setup.py as well, but then I commented out the usage in
>>>> the Dockerfile after checking some flex templates which said it is not
>>>> needed:
>>>>
>>>> https://github.com/mmistroni/GCP_Experiments/blob/master/dataflow/shareloader/setup_dftester.py
>>>>
>>>> Thanks in advance,
>>>> Marco
>>>>
>>>> On Tue, Jun 11, 2024 at 10:54 PM XQ Hu <[email protected]> wrote:
>>>>
>>>>> Can you share your Dockerfile?
>>>>>
>>>>> On Tue, Jun 11, 2024 at 4:43 PM Sofia’s World <[email protected]> wrote:
>>>>>
>>>>>> Thanks all, it seemed to work, but now I am getting a different
>>>>>> problem: I am having issues building pyarrow...
>>>>>>
>>>>>> Step #0 - "build-shareloader-template": Step #4 - "dftester-image": <string>:36: DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html
>>>>>> Step #0 - "build-shareloader-template": Step #4 - "dftester-image": WARNING setuptools_scm.pyproject_reading toml section missing 'pyproject.toml does not contain a tool.setuptools_scm section'
>>>>>> Step #0 - "build-shareloader-template": Step #4 - "dftester-image": Traceback (most recent call last):
>>>>>> Step #0 - "build-shareloader-template": Step #4 - "dftester-image":   File "/tmp/pip-build-env-meihcxsp/overlay/lib/python3.11/site-packages/setuptools_scm/_integration/pyproject_reading.py", line 36, in read_pyproject
>>>>>> Step #0 - "build-shareloader-template": Step #4 - "dftester-image":     section = defn.get("tool", {})[tool_name]
>>>>>> Step #0 - "build-shareloader-template": Step #4 - "dftester-image":               ~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^
>>>>>> Step #0 - "build-shareloader-template": Step #4 - "dftester-image": KeyError: 'setuptools_scm'
>>>>>> Step #0 - "build-shareloader-template": Step #4 - "dftester-image": running bdist_wheel
>>>>>>
>>>>>> Is it somehow getting messed up by a toml file?
>>>>>>
>>>>>> Could anyone advise?
>>>>>>
>>>>>> Thanks,
>>>>>> Marco
>>>>>>
>>>>>> On Tue, Jun 11, 2024 at 1:00 AM XQ Hu via user <[email protected]> wrote:
>>>>>>
>>>>>>> https://github.com/GoogleCloudPlatform/python-docs-samples/tree/main/dataflow/flex-templates/pipeline_with_dependencies
>>>>>>> is a great example.
>>>>>>>
>>>>>>> On Mon, Jun 10, 2024 at 4:28 PM Valentyn Tymofieiev via user <[email protected]> wrote:
>>>>>>>
>>>>>>>> In this case the Python version will be defined by the Python
>>>>>>>> version installed in the Docker image of your flex template. So, you'd
>>>>>>>> have to build your flex template from a base image with Python 3.11.
>>>>>>>>
>>>>>>>> On Mon, Jun 10, 2024 at 12:50 PM Sofia’s World <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> Hello,
>>>>>>>>> no, I am running my pipeline on GCP directly via a flex template,
>>>>>>>>> configured using a Dockerfile.
>>>>>>>>> Is there any chance to do something in the Dockerfile to force the
>>>>>>>>> version at runtime?
>>>>>>>>> Thanks
>>>>>>>>>
>>>>>>>>> On Mon, Jun 10, 2024 at 7:24 PM Anand Inguva via user <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Hello,
>>>>>>>>>>
>>>>>>>>>> Are you running your pipeline from a Python 3.11 environment?
>>>>>>>>>> If you are running from a Python 3.11 environment and don't use a
>>>>>>>>>> custom Docker container image, DataflowRunner (assuming Apache Beam
>>>>>>>>>> on GCP means Apache Beam on DataflowRunner) will use Python 3.11.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Anand
