Hey all,
I wrote up a quick update on the status of replacing dill:
https://docs.google.com/document/d/1XypNkB0ujc-U2hy9PuJNYj6asY3tZyGoKvfFDbRux_Y/edit?usp=sharing
There is one remaining blocker (dill is used to deterministically encode
special types by default) that I discuss further in
https:
Ah yes, and no more saving the main session :)
> FWIW - I noticed that the DataFlow Options documentation[1] for setting
> the pickling library and the Beam documentation
Thanks for bringing it up. The doc is outdated; the issue was fixed in
https://github.com/apache/beam/issues/21615 .
On Wed, Ap
Wow this is fantastic! I tested it out and it worked great for my runner. I
am also excited for this change now and will eagerly set `cloudpickle` as
the default pickler for our code.
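
For anyone wanting to try the same thing, here is a minimal sketch of
selecting the pickler explicitly through pipeline options. It assumes the
`--pickle_library` flag exposed via `SetupOptions` in the Beam Python SDK;
the helper name `cloudpickle_flags` and the DirectRunner choice are only
illustrative, and the import is guarded so the snippet still runs where Beam
is not installed.

```python
try:
    # Assumption: the Beam Python SDK is installed in this environment.
    from apache_beam.options.pipeline_options import PipelineOptions, SetupOptions
except ImportError:
    PipelineOptions = None
    SetupOptions = None


def cloudpickle_flags(extra=()):
    """Hypothetical helper: pipeline flags that select cloudpickle explicitly."""
    return ["--pickle_library=cloudpickle", *extra]


flags = cloudpickle_flags(["--runner=DirectRunner"])

if PipelineOptions is not None:
    options = PipelineOptions(flags)
    # Read the setting back to confirm which pickler the pipeline would use.
    print(options.view_as(SetupOptions).pickle_library)
```

Passing the flag explicitly keeps the choice visible in the job configuration
regardless of what the SDK default is in a given release.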
FWIW - I noticed that the DataFlow Options documentation[1] for setting the
pickling library and the Beam document
On Tue, Apr 29, 2025 at 7:51 PM Joey Tran wrote:
>
> Does cloudpickle make --save_main_session unnecessary? As in, will more
> transforms defined in __main__ "just work"?
Yes. Or at least it "just works" much more often. (There may still be
corner cases, but I haven't run into them...)
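
As a rough illustration of why pickling by value helps here (this is not
Beam's internal code path and does not reproduce dill's behavior), the sketch
below compares the standard `pickle` module, which stores functions by
reference, with cloudpickle, which serializes them by value. The lambda
stands in for something defined in `__main__`; the helper names are
hypothetical, and cloudpickle is assumed to be installed (the import is
guarded).

```python
import pickle

try:
    import cloudpickle  # third-party; assumed to be installed
except ImportError:
    cloudpickle = None

# Stand-in for a small function defined in the main script.
add_one = lambda x: x + 1


def stdlib_can_pickle(fn):
    """Return True if the standard pickle module can serialize fn."""
    try:
        pickle.dumps(fn)
        return True
    except (pickle.PicklingError, AttributeError):
        # pickle stores functions by name, and a lambda has no importable name.
        return False


def roundtrip(fn, pickler):
    """Serialize fn with the given pickler module, then load it back."""
    return pickle.loads(pickler.dumps(fn))


print("stdlib pickle can serialize the lambda:", stdlib_can_pickle(add_one))

if cloudpickle is not None:
    restored = roundtrip(add_one, cloudpickle)
    print("cloudpickle round trip result:", restored(41))
```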
I, for o
Does cloudpickle make --save_main_session unnecessary? As in, will more
transforms defined in __main__ "just work"?
If so, I can see why that's worthwhile. I've had a _ton_ of issues with
this, especially with new users of Beam at my company. Explaining main
session and why random things throw unp
There are several reasons:
- wide adoption in the data processing community; see the initial discussion: [1]
- the expectation that cloudpickle has a larger number of maintainers and
contributors.
- new releases of dill had breaking changes[2], which made adoption of a
new version challenging.
- cloudpi
Thanks Claude!
Great to see a lot of progress on this effort. The dependency on an old
version of dill has been a persistent pain point for many users.
Please call out this change in the release notes, so that customers can
provide feedback and find instructions on how to unblock themselves. It c
Naive question, but why is Beam upgrading to cloudpickle?
I saw this doc:
https://docs.google.com/document/d/1G5Q0ckX5sKQRQD1yEkLCPQL7N6B-AL9Cb1p0zlOOfQU/edit?tab=t.0
Is the main reason because cloudpickle is more actively maintained?
On Mon, Apr 28, 2025 at 6:51 PM Claudius van der Merwe
wrot