Hi Evan,

I actually hit a similar problem a few months ago and finally managed to solve it. My situation is a bit different, as I am using the Python SDK with Dataflow Runner v2 (plus Streaming Engine and Prime) and quite a lot of stateful processing. In my case the update failed with a very similar message, but it was related to a stateful step.

What I discovered is that updating the pipeline in place fails even when submitting exactly the same code. It seems the pipeline graph must be parsed in the same order as the original graph. In my case I kept the steps in an unordered set before adding them to the pipeline. The resulting pipeline graph was the same, but the parsing order apparently matters, and updating the running job fails if the order is different.

In my case I simply sorted the steps by name before adding them to the pipeline, and updating the job on the fly started working. So it seems that pipeline state on Dataflow has somehow depended on the order in which steps are added to the pipeline since some recent version (as far as I recall, this was working correctly around 2.50?). Does anyone know whether this is intended? If so, I would appreciate an explanation.
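
A minimal sketch of that workaround, assuming the steps are held as (name, transform) pairs in a set. The helper names `build_order` and `apply_steps`, the `apply` callback, and the step names are hypothetical and not part of the Beam API; the transforms are placeholder strings so the sketch runs without Beam installed:

```python
def build_order(steps):
    """Return (name, transform) pairs sorted by step name.

    Iterating a set directly gives no guaranteed order between runs,
    so sorting makes the construction order deterministic.
    """
    return sorted(steps, key=lambda step: step[0])


def apply_steps(source, steps, apply):
    """Attach every step to `source` in name order.

    `apply(source, name, transform)` is a stand-in for whatever attaches
    a step to the pipeline, e.g. `source | name >> transform` in the
    Beam Python SDK.
    """
    return [apply(source, name, transform)
            for name, transform in build_order(steps)]


if __name__ == "__main__":
    # Placeholder transforms; in a real job these would be PTransforms.
    steps = {("write_audit", "T3"), ("enrich", "T1"), ("dedupe", "T2")}
    print([name for name, _ in build_order(steps)])
```

The only point is that every submission walks the steps in the same order, so the graph is built identically for the original job and for the update.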

Best regards

Wiśniowski Piotr

On 15.12.2023 00:14, Evan Galpin wrote:
The pipeline in question is using Dataflow v1 Runner (Runner v2: Disabled) in case that's an important detail.

On Tue, Dec 12, 2023 at 4:22 PM Evan Galpin <egal...@apache.org> wrote:

    I've checked the source code and deployment command for cases of
    setting experiments. I don't see "enable_custom_pubsub_source"
    being used at all, no.  I also confirmed that it is not active on
    the existing/running job.

    On Tue, Dec 12, 2023 at 4:11 PM Reuven Lax via user
    <u...@beam.apache.org> wrote:

        Are you setting the enable_custom_pubsub_source experiment by
        any chance?

        On Tue, Dec 12, 2023 at 3:24 PM Evan Galpin
        <egal...@apache.org> wrote:

            Hi all,

            When attempting to upgrade a running Dataflow pipeline
            from SDK 2.51.0 to 2.52.0, an incompatibility warning is
            surfaced that prevents pipeline upgrade:

                The Coder or type for step .../PubsubUnboundedSource
                has changed


            Was there an intentional coder change introduced for
            PubsubMessage in 2.52.0?  I didn't note anything in the
            release notes https://beam.apache.org/blog/beam-2.52.0/
            nor recent changes in
            PubsubMessageWithAttributesCoder[1].  Specifically the
            step uses `PubsubMessageWithAttributesCoder` via
            `PubsubIO.readMessagesWithAttributes()`

            Thanks!


            [1]
            
https://github.com/apache/beam/blob/90e79ae373ab38cf4e48e9854c28aaffb0938458/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/pubsub/PubsubMessageWithAttributesCoder.java#L36
