Naive question, but why is Beam upgrading to cloudpickle?

I saw this doc:
https://docs.google.com/document/d/1G5Q0ckX5sKQRQD1yEkLCPQL7N6B-AL9Cb1p0zlOOfQU/edit?tab=t.0

Is the main reason that cloudpickle is more actively maintained?


On Mon, Apr 28, 2025 at 6:51 PM Claudius van der Merwe <claud...@vdmza.com>
wrote:

> Hi Beam Devs,
>
> I am making progress on making cloudpickle the default pickling library
> and removing the strict dependency on dill as outlined in
> https://s.apache.org/beam-cloudpickle-next-steps.
>
> The current plan is to:
>
> 1. Make cloudpickle the default library in Beam 2.65.0 release (see
> https://github.com/apache/beam/pull/34695). Users will be able to specify
> pickle_library='dill' without any additional requirements. There will still
> be a hard dependency on dill (blocked by #2) but it is a step in the right
> direction.
>
> 2. Remove the strict dependency on dill in Beam 2.66.0 release. Dill is
> directly used for coder's encoding types in FastPrimitivesCoderImpl [1][2].
> I prefer to submit a fix for this after the branch cut so we have more time
> to identify any issues.
>
> Cloudpickle has some fundamentally different pickling behavior from dill
> that is likely to break:
>
>    - Unittests that rely on globals
>       - This can be fixed by using apache_beam.utils.shared [3]
>    - Closures and dynamic classes that reference unpicklable globals
>       - This can be fixed by defining functions in the top level, and using
>         functools.partial to bind parameters if necessary
>
>
> [1]
> https://github.com/apache/beam/blob/b9fa49a9827dd28349e382f479ebd1a8bbe27d07/sdks/python/apache_beam/coders/coder_impl.py#L529
>
> [2]
> https://github.com/apache/beam/blob/b9fa49a9827dd28349e382f479ebd1a8bbe27d07/sdks/python/apache_beam/coders/coder_impl.py#L595
>
> [3]
> https://github.com/apache/beam/blob/b9fa49a9827dd28349e382f479ebd1a8bbe27d07/sdks/python/apache_beam/internal/cloudpickle_pickler_test.py#L54
>
>
> I'd appreciate any feedback or concerns.
>
>
> Best,
>
> Claude
>
>
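
For anyone else reading along, here is a minimal sketch of the second fix
quoted above (a top-level function plus functools.partial instead of a
closure). The names scale and triple are hypothetical, and the standard
library pickle is used only as a stand-in for cloudpickle, so this
illustrates the pattern rather than Beam's actual pickling path.

```python
import functools
import pickle


def scale(value, factor):
    """Top-level function: it captures no enclosing scope or globals."""
    return value * factor


# Bind the parameter explicitly rather than capturing it in a closure.
triple = functools.partial(scale, factor=3)

# Round-trip through pickle, roughly what happens when a callable is
# shipped to a worker. (Assumption: stdlib pickle stands in for cloudpickle.)
restored = pickle.loads(pickle.dumps(triple))
result = [restored(x) for x in [1, 2, 3]]
```

The partial object carries only the function reference and the bound
argument, so nothing unpicklable from the surrounding module gets pulled in
by accident.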
