[ 
https://issues.apache.org/jira/browse/BEAM-14514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17542503#comment-17542503
 ] 

Valentyn Tymofieiev commented on BEAM-14514:
--------------------------------------------

Hey [~Ryan.Thompson], can you please take a look?

> Beam python SDK ignores pickle_library option in pipeline.run()
> ---------------------------------------------------------------
>
>                 Key: BEAM-14514
>                 URL: https://issues.apache.org/jira/browse/BEAM-14514
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-py-core
>    Affects Versions: 2.38.0
>            Reporter: dctelus
>            Assignee: Ryan Thompson
>            Priority: P2
>
> Context:
> In the Python SDK, the pipeline option --pickle_library selects the
> library used to pickle variables so they can be sent from the executing
> machine to the workers (when save_main_session is True).
> Issue:
> The pickle_library option is ignored in the pipeline.run() function, which
> falls back to dill (the default).
> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/pipeline.py#L570
> Reproduce:
> Add --pickle_library cloudpickle to the pipeline options and observe that
> dill is used for the session dump, even though cloudpickle was specified.
>
> I found this because dill throws an exception for my use case, but
> cloudpickle doesn't.
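The behaviour described in the report can be modelled with a small standard-library sketch. This is not Beam's code: the function names and DEFAULT_PICKLE_LIBRARY are hypothetical stand-ins, and argparse is used here only to mimic how the two flags would be passed on the command line. It contrasts a session dump that never consults the option with one that honours it.

```python
import argparse

# Hypothetical stand-ins: every name below is invented for illustration and
# is not part of Beam's actual implementation.
DEFAULT_PICKLE_LIBRARY = "dill"


def parse_options(argv):
    """Parse the two flags mentioned in the report."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--pickle_library", default=DEFAULT_PICKLE_LIBRARY)
    parser.add_argument("--save_main_session", action="store_true")
    return parser.parse_args(argv)


def library_used_ignoring_option(options):
    """Models the reported behaviour: the option is never consulted."""
    return DEFAULT_PICKLE_LIBRARY


def library_used_honoring_option(options):
    """Models the expected behaviour: the configured library is used."""
    return options.pickle_library


options = parse_options(
    ["--pickle_library", "cloudpickle", "--save_main_session"])
reported = library_used_ignoring_option(options)
expected = library_used_honoring_option(options)
print(reported, expected)
```

In this sketch the first function always yields the default regardless of the flag, which is the mismatch the reporter observed; a fix would presumably look like the second function, reading the option before choosing the pickler.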



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
