Hello group,
I am pretty new to Dataflow and Beam.
I have deployed a Dataflow streaming job using Beam with Python.
The final step of my pipeline is publishing a message to Pub/Sub.
In certain cases the message can become too big for Pub/Sub (larger than
the allowed 10 MB). When that happens, the publish is retried indefinitely,
which eventually causes the job to stop processing new data.

My question is: is there a way to handle failures in beam.io.WriteToPubSub,
or do I need to implement something like that myself?

Ideally, I would like to write the oversized messages to a file in Cloud
Storage instead of publishing them.
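In case it helps clarify what I'm after, here is a rough, untested sketch of what I have in mind: split the PCollection by payload size before WriteToPubSub, publish only the small messages, and write the oversized ones to GCS from a DoFn. The topic, bucket, and the 9 MB threshold are placeholders I made up, not anything from the actual job:

import uuid

import apache_beam as beam

# Threshold with some headroom below the 10 MB Pub/Sub limit;
# the exact margin is a guess and would need tuning for attribute/encoding overhead.
MAX_PUBSUB_BYTES = 9 * 1024 * 1024

OVERSIZED_TAG = 'oversized'


class SplitBySize(beam.DoFn):
    """Emits small payloads on the main output and oversized ones on a tagged output."""

    def process(self, element):
        if len(element) > MAX_PUBSUB_BYTES:
            yield beam.pvalue.TaggedOutput(OVERSIZED_TAG, element)
        else:
            yield element


class WriteOversizedToGcs(beam.DoFn):
    """Writes each oversized payload as its own GCS object (assuming these are rare)."""

    def __init__(self, bucket_name):
        self._bucket_name = bucket_name

    def setup(self):
        from google.cloud import storage
        self._bucket = storage.Client().bucket(self._bucket_name)

    def process(self, element):
        blob = self._bucket.blob('oversized/%s' % uuid.uuid4().hex)
        blob.upload_from_string(element)


# Pipeline sketch (messages is a PCollection of bytes):
# split = (messages
#          | 'SplitBySize' >> beam.ParDo(SplitBySize()).with_outputs(
#              OVERSIZED_TAG, main='publishable'))
# split.publishable | 'Publish' >> beam.io.WriteToPubSub(
#     topic='projects/my-project/topics/my-topic')
# split[OVERSIZED_TAG] | 'ToGcs' >> beam.ParDo(WriteOversizedToGcs('my-bucket'))

Is something along these lines the recommended approach, or is there built-in error handling I'm missing?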

Any ideas will be appreciated.

Thanks in advance for your help!
