Your pipeline launcher refers to a package named 'modules', but this
package is not available in the runtime environment.
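
One rough way to confirm this is to run a small check with the Python
interpreter of the image your workers use. The sketch below is only an
illustration: `missing_modules` is a hypothetical helper name, not part of
Beam, and the names checked are taken from the traceback and the directory
layout quoted below.

```python
import importlib.util


def missing_modules(names):
    """Return the names whose top-level package cannot be found on sys.path."""
    missing = []
    for name in names:
        # Only look up the top-level package so that nothing gets imported.
        top_level = name.split(".")[0]
        try:
            spec = importlib.util.find_spec(top_level)
        except (ImportError, ValueError):
            spec = None
        if spec is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # 'modules' comes from the traceback; 'mypackage' from the quoted layout.
    for name in missing_modules(["modules", "mypackage"]):
        print(f"not importable in this environment: {name}")
```

If 'modules' is reported inside the worker image but not where the pipeline
is launched, the package still has to be installed in the runtime image.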

On Sat, Jun 15, 2024 at 11:17 AM Sofia’s World <[email protected]> wrote:

> Sorry, I celebrated too early.
> I can successfully build the image; however, at runtime the code always
> fails with this exception and I cannot figure out why.
>
> I mimicked the sample directory structure:
>
>
> ---- mypackage
>    --- __init__.py
>    --- dftester.py
>    --- obb_utils.py
>
> dataflow_tester_main.py
>
> this is the content of my dataflow_tester_main.py
>
> from mypackage import dftester
> import logging
> if __name__ == '__main__':
>   logging.getLogger().setLevel(logging.INFO)
>   dftester.run()
>
>
> and this is my dockerfile
>
>
> https://github.com/mmistroni/GCP_Experiments/blob/master/dataflow/shareloader/Dockerfile_tester
>
> and my exception is at the bottom of this email.
> I am puzzled about where the error is coming from, as I have almost copied
> this sample:
> https://github.com/GoogleCloudPlatform/python-docs-samples/blob/main/dataflow/flex-templates/pipeline_with_dependencies/main.py
>
> thanks and regards
>  Marco
>
> Traceback (most recent call last):
>   File "/usr/local/lib/python3.11/site-packages/apache_beam/runners/worker/sdk_worker_main.py", line 115, in create_harness
>     _load_main_session(semi_persistent_directory)
>   File "/usr/local/lib/python3.11/site-packages/apache_beam/runners/worker/sdk_worker_main.py", line 354, in _load_main_session
>     pickler.load_session(session_file)
>   File "/usr/local/lib/python3.11/site-packages/apache_beam/internal/pickler.py", line 65, in load_session
>     return desired_pickle_lib.load_session(file_path)
>            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>   File "/usr/local/lib/python3.11/site-packages/apache_beam/internal/dill_pickler.py", line 446, in load_session
>     return dill.load_session(file_path)
>            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>   File "/usr/local/lib/python3.11/site-packages/dill/_dill.py", line 368, in load_session
>     module = unpickler.load()
>              ^^^^^^^^^^^^^^^^
>   File "/usr/local/lib/python3.11/site-packages/dill/_dill.py", line 472, in load
>     obj = StockUnpickler.load(self)
>           ^^^^^^^^^^^^^^^^^^^^^^^^^
>   File "/usr/local/lib/python3.11/site-packages/dill/_dill.py", line 462, in find_class
>     return StockUnpickler.find_class(self, module, name)
>            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> ModuleNotFoundError: No module named 'modules'
>
> On Fri, Jun 14, 2024 at 5:52 AM Sofia’s World <[email protected]> wrote:
>
>> Many thanks Hu, that worked like a charm.
>>
>> A few quick questions:
>> so in my reqs.txt, should I put all the Beam requirements PLUS my own?
>>
>> And in the setup.py, shall I just declare
>>
>> "apache-beam[gcp]==2.54.0",  # Must match the version in `Dockerfile`.
>>
>> thanks and kind regards
>> Marco
>>
>> On Wed, Jun 12, 2024 at 1:48 PM XQ Hu <[email protected]> wrote:
>>
>>> Any reason to use this?
>>>
>>> RUN pip install avro-python3 pyarrow==0.15.1 apache-beam[gcp]==2.30.0
>>>  pandas-datareader==0.9.0
>>>
>>> It is typically recommended to use the latest Beam and build the docker
>>> image using the requirements released for each Beam, for example,
>>> https://github.com/apache/beam/blob/release-2.56.0/sdks/python/container/py311/base_image_requirements.txt
>>>
>>> On Wed, Jun 12, 2024 at 1:31 AM Sofia’s World <[email protected]>
>>> wrote:
>>>
>>>> Sure, apologies, it had crossed my mind that it would have been useful
>>>> to refer to it.
>>>>
>>>> So this is the Dockerfile:
>>>>
>>>>
>>>> https://github.com/mmistroni/GCP_Experiments/edit/master/dataflow/shareloader/Dockerfile_tester
>>>>
>>>> I was using a setup.py as well, but then I commented out its usage in
>>>> the Dockerfile after checking some flex templates which said it is not
>>>> needed:
>>>>
>>>>
>>>> https://github.com/mmistroni/GCP_Experiments/blob/master/dataflow/shareloader/setup_dftester.py
>>>>
>>>> thanks in advance
>>>>  Marco
>>>>
>>>> On Tue, Jun 11, 2024 at 10:54 PM XQ Hu <[email protected]> wrote:
>>>>
>>>>> Can you share your Dockerfile?
>>>>>
>>>>> On Tue, Jun 11, 2024 at 4:43 PM Sofia’s World <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Thanks all, it seemed to work, but now I am getting a different
>>>>>> problem: I am having issues building pyarrow...
>>>>>>
>>>>>> Step #0 - "build-shareloader-template": Step #4 - "dftester-image": <string>:36: DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html
>>>>>> Step #0 - "build-shareloader-template": Step #4 - "dftester-image": WARNING setuptools_scm.pyproject_reading toml section missing 'pyproject.toml does not contain a tool.setuptools_scm section'
>>>>>> Step #0 - "build-shareloader-template": Step #4 - "dftester-image": Traceback (most recent call last):
>>>>>> Step #0 - "build-shareloader-template": Step #4 - "dftester-image":   File "/tmp/pip-build-env-meihcxsp/overlay/lib/python3.11/site-packages/setuptools_scm/_integration/pyproject_reading.py", line 36, in read_pyproject
>>>>>> Step #0 - "build-shareloader-template": Step #4 - "dftester-image":     section = defn.get("tool", {})[tool_name]
>>>>>> Step #0 - "build-shareloader-template": Step #4 - "dftester-image":               ~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^
>>>>>> Step #0 - "build-shareloader-template": Step #4 - "dftester-image": KeyError: 'setuptools_scm'
>>>>>> Step #0 - "build-shareloader-template": Step #4 - "dftester-image": running bdist_wheel
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> Is it somehow getting confused by a toml file?
>>>>>>
>>>>>>
>>>>>> Could anyone advise?
>>>>>>
>>>>>> thanks
>>>>>>
>>>>>>  Marco
>>>>>>
>>>>>> On Tue, Jun 11, 2024 at 1:00 AM XQ Hu via user <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>>
>>>>>>> https://github.com/GoogleCloudPlatform/python-docs-samples/tree/main/dataflow/flex-templates/pipeline_with_dependencies
>>>>>>> is a great example.
>>>>>>>
>>>>>>> On Mon, Jun 10, 2024 at 4:28 PM Valentyn Tymofieiev via user <
>>>>>>> [email protected]> wrote:
>>>>>>>
>>>>>>>> In this case the Python version will be defined by the Python
>>>>>>>> version installed in the docker image of your flex template. So, you'd
>>>>>>>> have to build your flex template from a base image with Python 3.11.
>>>>>>>>
>>>>>>>> On Mon, Jun 10, 2024 at 12:50 PM Sofia’s World <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hello,
>>>>>>>>> no, I am running my pipeline on GCP directly via a flex template,
>>>>>>>>> configured using a Dockerfile.
>>>>>>>>> Is there any way to do something in the Dockerfile to force the
>>>>>>>>> version at runtime?
>>>>>>>>> Thanks
>>>>>>>>>
>>>>>>>>> On Mon, Jun 10, 2024 at 7:24 PM Anand Inguva via user <
>>>>>>>>> [email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Hello,
>>>>>>>>>>
>>>>>>>>>> Are you running your pipeline from the python 3.11 environment?
>>>>>>>>>> If you are running from a python 3.11 environment and don't use a
>>>>>>>>>> custom docker container image, DataflowRunner (assuming Apache Beam
>>>>>>>>>> on GCP means Apache Beam on DataflowRunner) will use Python 3.11.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Anand
>>>>>>>>>>
>>>>>>>>>
