Thanks Holden and Hyukjin. I agree, let's start doing the work first and
see if the changes are low risk enough, then we can evaluate how best to
proceed. I made https://issues.apache.org/jira/browse/SPARK-23159 and will
get started on the update and we can continue to discuss in the PR.
On F
Yea, that sounds good to me.
2018-01-19 18:29 GMT+09:00 Holden Karau :
> So it is pretty core, but it's one of the better indirectly tested
> components. I think probably the most reasonable path is to see what the
> diff ends up looking like and make a call at that point for if we want it
> to go
So it is pretty core, but it's one of the better indirectly tested
components. I think probably the most reasonable path is to see what the
diff ends up looking like and make a call at that point for if we want it
to go to master or master & branch-2.3?
On Fri, Jan 19, 2018 at 12:30 AM, Hyukjin Kwo
> So given that it fixes some real world bugs, any particular reason why?
Would you be comfortable with doing it in 2.3.1?
Ah, I don't feel strongly about this, but RC2 will be running and the
cloudpickle update is quite a core change to PySpark. Just thought we might
want to have enough time with it.
One worry
On Jan 19, 2018 7:28 PM, "Hyukjin Kwon" wrote:
> Is it an option to match the latest version of cloudpickle and still set
protocol level 2?
IMHO, I think this can be an option but I am not fully sure yet if we
should/could go ahead for it within Spark 2.X. I need some
investigations including th
> Is it an option to match the latest version of cloudpickle and still set
protocol level 2?
IMHO, I think this can be an option but I am not fully sure yet if we
should/could go ahead for it within Spark 2.X. I need some
investigations including things about Pyrolite.
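For reference, a minimal sketch of what pinning the protocol means, using only the standard-library pickle module. This is not Spark's or cloudpickle's code; the payload and the protocol_of helper are hypothetical and only illustrate the difference between protocol 2 and HIGHEST_PROTOCOL.

```python
import pickle

# Hypothetical payload, used only for illustration.
payload = {"job": "example", "partitions": [0, 1, 2]}

# Protocol 2 is the highest protocol that Python 2 can read.
pinned = pickle.dumps(payload, protocol=2)

# HIGHEST_PROTOCOL depends on the interpreter that is running.
newest = pickle.dumps(payload, protocol=pickle.HIGHEST_PROTOCOL)


def protocol_of(data):
    """Return the protocol number recorded in a pickle (protocol 2 or above)."""
    # Pickles at protocol 2+ begin with the PROTO opcode (0x80),
    # followed by one byte holding the protocol number.
    if data[:1] != b"\x80":
        raise ValueError("no PROTO opcode; pickle uses protocol 0 or 1")
    return bytearray(data)[1]


print(protocol_of(pinned))
print(protocol_of(newest) == pickle.HIGHEST_PROTOCOL)
```

Both byte strings load back to the same object; the difference is only in which interpreters can read them.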
Let's go ahead with matchin
So if there are different versions of Python on the cluster machines I think
that's already unsupported so I'm not worried about that.
I'd suggest going to the highest released version since there appear to be
some useful fixes between 0.4.2 & 0.5.2
Also let's try to keep track in our commit messag
Thanks for all the details and background Hyukjin! Regarding the pickle
protocol change, if I understand correctly, it is currently at level 2 in
Spark which is good for backwards compatibility for all of Python 2.
Choosing HIGHEST_PROTOCOL, which is the default for cloudpickle 0.5.0 and
above, wil
Hi Bryan,
Yup, I support matching the version. I have pushed a few times before to
match Spark's copy with https://github.com/cloudpipe/cloudpickle
and have also sent a few fixes to cloudpickle itself. I believe our copy is
closest to 0.4.1.
I have been trying to follow up the changes in c
Hi All,
I've seen a couple issues lately related to cloudpickle, notably
https://issues.apache.org/jira/browse/SPARK-22674, and would like to get
some feedback on updating the version in PySpark which should fix these
issues and allow us to remove some workarounds. Spark is currently using a
fork