Re: Pyspark Error when broadcast numpy array

2014-11-12 Thread bliuab

Re: Pyspark Error when broadcast numpy array

2014-11-11 Thread bliuab

Re: Pyspark Error when broadcast numpy array

2014-11-11 Thread Davies Liu

Re: Pyspark Error when broadcast numpy array

2014-11-11 Thread bliuab
Dear Liu: Thank you very much for your help. I will apply that patch. By the way, when I succeeded in broadcasting an array of size 30M, the log said that the array takes around 230MB of memory. As a result, I think the numpy array that leads to the error is much smaller than 2G.
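The poster's size estimate checks out with simple arithmetic. Assuming the array holds 8-byte float64 elements (the thread does not state the dtype, so this is an assumption), a 30M-element array occupies roughly 229 MiB, matching the ~230MB reported in the log:

```python
# Back-of-the-envelope check of the sizes mentioned above.
# Assumes 8 bytes per element (numpy float64) -- the original
# post does not state the dtype.
elements = 30_000_000            # 30M elements, as in the message
bytes_per_element = 8            # float64
size_mib = elements * bytes_per_element / 2**20
print(round(size_mib))           # ~229 MiB, consistent with the ~230MB log line
```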

Re: Pyspark Error when broadcast numpy array

2014-11-11 Thread Davies Liu
This PR fixes the problem: https://github.com/apache/spark/pull/2659

cc @josh

Davies

On Tue, Nov 11, 2014 at 7:47 PM, bliuab wrote:
> In spark-1.0.2, I have come across an error when I try to broadcast a quite
> large numpy array (with 35M dimensions). The error information except the
> java.lang.N
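For context on the size limit discussed in this thread: JVM byte arrays are indexed by a signed 32-bit int, so a single serialized payload is capped at 2**31 - 1 bytes (about 2 GiB). The arithmetic below is an illustrative sketch (not taken from the PR itself) showing that a 35M-element float64 array is well under that cap, which is why the poster did not expect a size-related failure:

```python
# JVM byte arrays use a signed 32-bit length, capping any single
# serialized buffer at 2**31 - 1 bytes (~2 GiB).
jvm_array_limit = 2**31 - 1

# How many 8-byte float64 elements fit under that cap:
max_float64_elements = jvm_array_limit // 8
print(max_float64_elements)               # 268_435_455 elements

# The 35M-element array from the thread is far below the limit:
print(35_000_000 * 8 < jvm_array_limit)   # True
```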