I imagine this is no longer helpful to you, Evan, but I ran into this issue
this week and tracked down the underlying problem. Basically, a snappy-java
version upgrade [1] appears to have changed how the SnappyCoder [2] is
serialized. Since that coder is used by the PubSub read step, pipeline
updates started failing.

I don't think there's much to be done about this at this point since the
change dates back to 2.52.0, but I added a note to CHANGES.md [3] to call
this out and figured I'd also respond here in case others hit this in the
future. I think the only real workaround is probably draining/restarting
any impacted pipelines.
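For reference, the drain-and-resubmit workaround can be done from the gcloud CLI; the job ID and region below are placeholders, so substitute your own values:

```shell
# Hypothetical job ID and region; substitute your own values.
# Drain the running job (lets in-flight work finish, then stops the job):
gcloud dataflow jobs drain 2023-12-12_15_00_00-1234567890123456789 \
    --region=us-central1
# Once the drain completes, resubmit the pipeline on the new SDK version
# as a fresh job rather than using --update.
```

Note that draining loses the ability to carry over in-flight state the way an in-place update would, which is why it is a workaround rather than a fix.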

[1] https://github.com/apache/beam/pull/28655
[2]
https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/SnappyCoder.java
[3] https://github.com/apache/beam/pull/32753

Thanks,
Danny

On Wed, Jan 3, 2024 at 3:14 PM Wiśniowski Piotr <
contact.wisniowskipi...@gmail.com> wrote:

> Hi Evan,
>
> Actually, I hit a similar problem a few months ago and finally managed to
> solve it. My situation is a bit different, since I am using the Python SDK
> and the Dataflow Runner v2 (plus Streaming Engine and Prime) with quite a
> lot of stateful processing. In my case the update failed with a very
> similar message, but related to one of the stateful steps.
>
> The thing I discovered is that updating the pipeline in place fails even
> when submitting exactly the same code. It seems the pipeline graph must be
> constructed in the same order as in the original graph. In my case I was
> adding the steps to the pipeline from an unordered set; the resulting
> graph was the same, but the construction order differed between
> submissions, and that was enough to make the update of the running job
> fail.
>
> For my case, I simply sorted the steps by name before adding them to the
> pipeline, and updating the job in flight started working. So it seems
> that, since some recent version (as I recall, this still worked correctly
> around 2.50?), the pipeline state on Dataflow somehow depends on the order
> in which steps are added. Does anyone know whether this is intended? If
> so, I would like to see some explanation.
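The fix described above can be sketched without Beam itself (the step names here are hypothetical): build the graph by iterating the steps in sorted order, so that construction order is identical on every submission regardless of how the steps are stored.

```python
# Stand-in sketch of the fix (no Beam dependency; step names hypothetical):
# apply steps in a deterministic (sorted) order so the graph is constructed
# identically on every submission, which is what made --update succeed.

def build_graph(step_names):
    """Apply steps in a deterministic (sorted) order."""
    return ["ApplyStep:" + name for name in sorted(step_names)]

# Two submissions holding the same steps in differently ordered sets
# still produce the same construction order:
first = build_graph({"Enrich", "Deduplicate", "WriteSink"})
second = build_graph({"WriteSink", "Enrich", "Deduplicate"})
print(first == second)  # prints True
```

In a real Beam pipeline the same idea applies: replace `for step in step_set:` with `for step in sorted(step_set):` (or any other stable ordering) when attaching transforms.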
>
> Best regards
>
> Wiśniowski Piotr
> On 15.12.2023 00:14, Evan Galpin wrote:
>
> The pipeline in question is using Dataflow v1 Runner (Runner v2: Disabled)
> in case that's an important detail.
>
> On Tue, Dec 12, 2023 at 4:22 PM Evan Galpin <egal...@apache.org> wrote:
>
>> I've checked the source code and deployment command for cases of setting
>> experiments. I don't see "enable_custom_pubsub_source" being used at all,
>> no.  I also confirmed that it is not active on the existing/running job.
>>
>> On Tue, Dec 12, 2023 at 4:11 PM Reuven Lax via user <u...@beam.apache.org>
>> wrote:
>>
>>> Are you setting the enable_custom_pubsub_source experiment by any
>>> chance?
>>>
>>> On Tue, Dec 12, 2023 at 3:24 PM Evan Galpin <egal...@apache.org> wrote:
>>>
>>>> Hi all,
>>>>
>>>> When attempting to upgrade a running Dataflow pipeline from SDK 2.51.0
>>>> to 2.52.0, an incompatibility warning is surfaced that prevents pipeline
>>>> upgrade:
>>>>
>>>>
>>>>> The Coder or type for step .../PubsubUnboundedSource has changed
>>>>
>>>>
>>>> Was there an intentional coder change introduced for PubsubMessage in
>>>> 2.52.0?  I didn't note anything in the release notes
>>>> (https://beam.apache.org/blog/beam-2.52.0/) nor any recent changes in
>>>> PubsubMessageWithAttributesCoder [1]. Specifically, the step uses
>>>> `PubsubMessageWithAttributesCoder` via
>>>> `PubsubIO.readMessagesWithAttributes()`.
>>>>
>>>> Thanks!
>>>>
>>>>
>>>> [1]
>>>> https://github.com/apache/beam/blob/90e79ae373ab38cf4e48e9854c28aaffb0938458/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/pubsub/PubsubMessageWithAttributesCoder.java#L36
>>>>
>>>
