[ https://issues.apache.org/jira/browse/BEAM-14514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17542503#comment-17542503 ]
Valentyn Tymofieiev commented on BEAM-14514:
--------------------------------------------

Hey [~Ryan.Thompson] can you please take a look?

> Beam python SDK ignores pickle_library option in pipeline.run()
> ---------------------------------------------------------------
>
>                 Key: BEAM-14514
>                 URL: https://issues.apache.org/jira/browse/BEAM-14514
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-py-core
>    Affects Versions: 2.38.0
>            Reporter: dctelus
>            Assignee: Ryan Thompson
>            Priority: P2
>
> Context:
> In the Python SDK, you can specify the pipeline argument --pickle_library,
> which dictates which library is used to pickle variables so they can be sent
> from the executing machine to the workers (when save_main_session is True).
> Issue:
> The pickle_library option is ignored in the pipeline.run() function, which
> reverts to using dill (the default).
> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/pipeline.py#L570
> Reproduce:
> Add --pickle_library cloudpickle to the pipeline options and notice that dill
> is still used for the session dump, even though cloudpickle was requested.
>
> I found this out because the dill parser throws an exception for my use case,
> but cloudpickle doesn't.
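
For illustration, the behaviour described in the issue can be modelled with a small standard-library sketch. This is not Beam's code: the function names, the argparse-based option parsing, and the list of choices are all hypothetical stand-ins. The only details taken from the report are the option names and the fact that dill is the default. The sketch contrasts a session-dump step that ignores the parsed option with one that honours it.

```python
import argparse

# The issue states that dill is the default pickling library.
DEFAULT_PICKLE_LIBRARY = "dill"


def parse_options(argv):
    """Hypothetical stand-in for pipeline option parsing."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--pickle_library",
        default=DEFAULT_PICKLE_LIBRARY,
        choices=["dill", "cloudpickle"],  # assumed choices, for this sketch only
    )
    parser.add_argument("--save_main_session", action="store_true")
    return parser.parse_args(argv)


def session_dump_library_ignoring_option(options):
    """Models the reported behaviour: the parsed option is never consulted."""
    return DEFAULT_PICKLE_LIBRARY


def session_dump_library_honouring_option(options):
    """Models the expected behaviour: the requested library is used."""
    return options.pickle_library


options = parse_options(
    ["--pickle_library", "cloudpickle", "--save_main_session"]
)
reported = session_dump_library_ignoring_option(options)
expected = session_dump_library_honouring_option(options)
print("reported behaviour uses:", reported)
print("expected behaviour uses:", expected)
```

In the real SDK the equivalent check would be whether the session dump triggered by save_main_session goes through the library named in --pickle_library; the sketch only shows the shape of the mismatch.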