Re: [Proposal] Change to Default PubsubMessage Coder

Evan Galpin Mon, 19 Dec 2022 15:37:19 -0800

Bump 🙃

Any other risks or drawbacks associated with altering the default coder for
PubsubMessage to be the most inclusive coder with respect to possible
fields?


Thanks,
Evan


On Mon, Dec 12, 2022 at 10:06 AM Evan Galpin <[email protected]> wrote:

> Hi folks,
>
> I'd like to solicit feedback on the notion of using
> PubsubMessageWithAttributesAndMessageIdAndOrderingKeyCoder[1] as the
> default coder for Pubsub messages instead of the current default of
> PubsubMessageWithAttributesCoder.
>
> Not long ago, support for reading and writing Pubsub messages in Beam
> including an OrderingKey was added[2].  Part of this change involved adding
> a new Coder for PubsubMessage in order to capture and propagate the
> orderingKey[1].  This change illuminated that in cases where the coder type
> for PubsubMessage is inferred, it is possible to accidentally and silently
> nullify fields like MessageId and OrderingKey in a way that is not at all
> obvious to users[3].
>
> So far two potential drawbacks of this proposal have been identified:
> 1. Update compatibility for pipelines using PubsubIO might require users
> to explicitly specify the current default coder (
> PubsubMessageWithAttributesCoder)
> 2. Messages would require a larger number of bytes to store as compared to
> the current default (which could again be overcome by users specifying the
> current default coder)
>
> What other potential drawbacks might there be? I look forward to hearing
> others' input!
>
> Thanks,
> Evan
>
> [1]
> https://github.com/apache/beam/pull/22216/files#diff-28243ab1f9eef144e45a9f6cb2e07fa1cf53c021ceaf733d92351254f38712fd
> [2] https://github.com/apache/beam/pull/22216
> [3] https://github.com/apache/beam/issues/23525
>

Re: [Proposal] Change to Default PubsubMessage Coder

Reply via email to