Hi everyone,

First time writing the email list, so please tell me if I'm doing this all
wrong.

I'm building a streaming pipeline to be run on the DataflowRunner that
reads from PubSub and writes to BQ using the Python 3 SDK.

I can get the pipeline started fine with the DirectRunner, but as soon as I
try to deploy to DataFlow it throws the following error:

File "/usr/local/lib/python3.7/site-packages/apache_beam/coders/coders.py",
line 221, in to_type_hint
    raise NotImplementedError('BEAM-2717')

I've tried narrowing down what exactly could be causing the issue and it
appears to be caused by the second step in my pipeline, which transforms
the bytes read from PubSub to my own internal Proto format:

def parse_message_blobs(x: bytes) -> ActionWrapper:
action_wrapper = ActionWrapper()
action_wrapper.ParseFromString(x)

return action_wrapper

which is applied as a Map step.

I've added typehints to all downstream steps as follows:
def partition_by_environment(
x: ActionWrapper, num_partitions: int, environments: List[str]
) -> int:

I'd really appreciate it if anyone could let me know what I'm doing wrong,
or what exactly is the issue this error is referring to. I read the JIRA
ticket, but did not understand how it is related to my issue here.

Thanks!
Kind regards,
Lien

Reply via email to